The National Centre for Social Research
British Social Attitudes 36 | Technical details 1
Technical details

In 2018, the sample for the British Social Attitudes (BSA) survey was split into four equally-sized portions. Each portion was asked a different version of the questionnaire (versions A, B, C and D). Depending on the number of versions in which it was included, questions were thus asked either of the full sample (3,879 respondents) or of a random quarter, half or three-quarters of the sample.
Sample design

The BSA survey is designed to yield a representative sample of adults aged 18 or over. Since 1993, the sampling frame for the survey has been the Postcode Address File (PAF), a list of addresses (or postal delivery points) compiled by the Post Office.1
For practical reasons, the sample is confined to those living in private households. People living in institutions (though not in private households at such institutions) are excluded, as are households whose addresses were not on the PAF.
The sampling method involved a multi-stage design, with three separate stages of selection.
Selection of sectors
At the first stage, postcode sectors were selected systematically from a list of all postal sectors in Britain. Before selection, any sectors with fewer than 500 addresses were identified and grouped together with an adjacent sector; in Scotland all sectors north of the Caledonian Canal were excluded (because of the prohibitive costs of interviewing there). Sectors were then stratified on the basis of: 36 sub-regions; population density (population in private households divided by the area of the postal sector in hectares), with variable banding used to create three equal-sized strata per sub-region; and ranking by percentage of homes that were owner-occupied.
This resulted in the selection of 395 postcode sectors, with probability proportional to the number of addresses in each sector.
Selection of addresses
Twenty-six addresses were selected in each of the 395 sectors or groups of sectors. The issued sample was therefore 395 x 26 = 10,270 addresses, selected by starting from a random point on the list of addresses for each sector, and choosing each address at a fixed interval. The fixed interval was calculated for each sector in order to generate the correct number of addresses.
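The fixed-interval (systematic) selection described above can be sketched as follows; the sector list and seed below are hypothetical stand-ins for the real PAF address lists, and this is an illustration of the method rather than the survey's actual implementation.

```python
import random

def systematic_sample(addresses, n_required, seed=None):
    """Pick n_required addresses at a fixed interval from a random start point,
    as in the within-sector selection described above (illustrative sketch)."""
    rng = random.Random(seed)
    interval = len(addresses) / n_required      # fixed interval for this sector
    start = rng.uniform(0, interval)            # random starting point on the list
    return [addresses[int(start + i * interval)] for i in range(n_required)]

# a hypothetical sector list of 700 delivery points
sector = [f"address_{i}" for i in range(700)]
sample = systematic_sample(sector, 26, seed=1)
assert len(sample) == 26 and len(set(sample)) == 26   # 26 distinct addresses
```

Because the interval is recalculated per sector as list length divided by 26, every sector yields exactly 26 addresses regardless of its size.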
The Multiple-Occupancy Indicator (MOI) available through the PAF was used when selecting addresses in Scotland. The MOI shows the number of accommodation spaces sharing one address. If the MOI indicated more than one accommodation space at a given address, that address's chance of selection from the list of addresses was increased in proportion to its number of accommodation spaces. The MOI is largely irrelevant in England and Wales, as separate dwelling units (DUs) generally appear as separate entries on the PAF. In Scotland, however, tenements with many flats tend to appear as one entry on the PAF. Even so, the vast majority (99.9%) of MOIs in the sample had a value of one; the remainder had MOIs greater than one. Because the MOI affects the selection probability of the address, an adjustment for this was incorporated into the weighting procedures (described below).
Selection of individuals
Interviewers called at each address selected from the PAF and listed all those eligible for inclusion in the BSA sample – that is, all persons currently aged 18 or over and resident at the selected address. The interviewer then selected one respondent using a computer-generated random selection procedure. Where there were two or more DUs at the selected address, interviewers first had to select one DU using the same random procedure. They then followed the same procedure to select a person for interview within the selected DU.
Weighting

The weights for the BSA survey correct for the unequal selection of addresses, DUs and individuals, and for biases caused by differential non-response. The different stages of the weighting scheme are outlined in detail below.
Selection weights
Selection weights are required because not all the units covered in the survey had the same probability of selection. The weighting reflects the relative selection probabilities of the individual at the three main stages of selection: address, DU and individual. First, because addresses in Scotland were selected using the MOI, weights were needed to compensate for the greater probability of an address with an MOI of more than one being selected, compared with an address with an MOI of one. (This stage was omitted for the English and Welsh data.) Secondly, data were weighted to compensate for the fact that a DU at an address that contained a large number of DUs was less likely to be selected for inclusion in the survey than a DU at an address that contained fewer DUs. (We used this procedure because in most cases where the MOI is greater than one, the two
stages will cancel each other out, resulting in more efficient weights.) Thirdly, data were weighted to compensate for the lower selection probabilities of adults living in large households, compared with those in small households.
At each stage the selection weights were trimmed to avoid a small number of very high or very low weights in the sample; such weights would inflate standard errors, reducing the precision of the survey estimates and causing the weighted sample to be less efficient. A small proportion (typically less than 1%) of the selection weights were trimmed at each stage.
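As a sketch of the household-size component described above, the weight for each respondent is the number of eligible adults in their household (the inverse of a 1/k selection chance), trimmed to avoid extreme values. The cap used here is an assumption for illustration; the survey's actual trim points are not given in the text.

```python
def person_selection_weights(adults_per_household, cap=3.0):
    # One adult was selected per household, so a person in a household with k
    # eligible adults had a 1/k selection chance; the weight is therefore k,
    # trimmed at an assumed cap to avoid very high weights.
    return [min(float(k), cap) for k in adults_per_household]

w = person_selection_weights([1, 2, 2, 4, 6])
assert w == [1.0, 2.0, 2.0, 3.0, 3.0]   # the 4- and 6-adult weights are trimmed
```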
Non-response model
It is known that certain subgroups in the population are more likely to respond to surveys than others. These groups can end up over-represented in the sample, which can bias the survey estimates. Where information is available about non-responding households, the response behaviour of the sample members can be modelled and the results used to generate a non-response weight. This non-response weight is intended to reduce bias in the sample resulting from differential response to the survey.
The data were modelled using logistic regression, with the dependent variable indicating whether or not the selected individual responded to the survey. Ineligible households2 were not included in the non-response modelling. A number of area-level and interviewer observation variables were used to model response. Not all the variables examined were retained for the final model: variables not strongly related to a household’s propensity to respond were dropped from the analysis.
The variables found to be related to response, after controlling for the other predictors in the model, were: region, type of dwelling, whether there were entry barriers to the selected address, the relative condition of the immediate local area and the relative condition of the address. The model shows that response increases if there are no barriers to entry (for instance, if there are no locked gates around the address and no entry phone) and if the general condition of the address is better than other addresses in the area, rather than being about the same or worse. Response is also higher for flats than detached houses, and increases if the relative condition of the immediate surrounding area is mainly good. The full model is given in Table A.1.
Table A.1 The final non-response model
Variable B S.E. Wald Df Sig. Odds
Region 73.999 11 .000
Inner London (baseline)
North East .446 .158 8.002 1 .005 1.562
North West .248 .134 3.455 1 .063 1.282
Yorkshire and The Humber .531 .140 14.495 1 .000 1.701
East Midlands .498 .142 12.283 1 .000 1.645
West Midlands -.021 .141 0.022 1 .881 0.979
East of England .222 .138 2.597 1 .107 1.249
Outer London .025 .138 0.032 1 .857 1.025
South East .080 .133 0.365 1 .546 1.083
South West .206 .142 2.122 1 .145 1.229
Wales -.113 .158 0.516 1 .473 0.893
Scotland .196 .138 2.011 1 .156 1.217
Type of dwelling 19.458 3 .000
Detached House (baseline)
Semi-detached house -.221 .064 11.886 1 .001 .802
Terraced house (including end of terrace) -.106 .068 2.418 1 .120 .899
Flat or maisonette and other .077 .090 0.722 1 .396 1.080
Barriers to address
No barriers (baseline)
One or more -.448 .085 28.050 1 .000 .639
Relative condition of the local area 46.123 2 .000
Mainly good (baseline)
Mainly fair -.311 .048 41.749 1 .000 .733
Mainly bad or very bad -.442 .111 15.849 1 .000 .642
Relative condition of the address 38.636 2 .000
Better (baseline)
About the same -.406 .081 24.929 1 .000 .666
Worse -.693 .114 36.753 1 .000 .500
Table A.1 The final non-response model (continued)
Variable B S.E. Wald Df Sig. Odds
Percentage owner-occupied in quintiles 5.894 4 .207
1 lowest (baseline)
2 -.046 .074 .387 1 .534 .955
3 -.113 .078 2.104 1 .147 .893
4 .010 .082 .015 1 .901 1.010
5 highest -.119 .083 2.023 1 .155 .888
Population density1 -.034 .021 2.520 1 .112 .967
Constant .342 .168 4.137 1 .042 1.407
The response is 1 = individual responding to the survey, 0 = non-response. Variables that are significant at the 0.05 level are included in the model. Percentage owner-occupied (in quintiles) and population density were not significant in 2018 but were kept in the model for consistency. The model R2 is 0.036 (Cox and Snell). B is the estimated coefficient with standard error S.E. The Wald test measures the impact of the categorical variable on the model with the appropriate number of degrees of freedom (df). If the test is significant (sig. < 0.05), then the categorical variable is considered to be 'significantly associated' with the response variable and therefore included in the model.
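To illustrate how Table A.1 translates into a predicted response probability, the sketch below applies the inverse logit to the constant plus one region coefficient, holding every other variable at its baseline category and, for simplicity, ignoring the non-significant owner-occupation and density terms.

```python
import math

def inv_logit(x):
    """Convert a logistic-regression linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Constant (.342) plus the North East coefficient (.446) from Table A.1:
# a detached house with no entry barriers, in a mainly good area, at an
# address in better condition than its neighbours.
p = inv_logit(0.342 + 0.446)
assert 0.68 < p < 0.70   # roughly a 69% predicted chance of response
```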
The non-response weight was calculated as the inverse of the predicted response probabilities saved from the logistic regression model. It was then combined with the selection weights to create the final non-response weight. The top 0.5% of the weights were trimmed before the weight was scaled to the achieved sample size (resulting in the weights being standardised around an average of one).
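A minimal sketch of that calculation, assuming a simple percentile rule for the 0.5% trim (the report does not specify the exact trimming procedure):

```python
def nonresponse_weights(response_probs):
    w = [1.0 / p for p in response_probs]        # inverse predicted probability
    cap = sorted(w)[int(len(w) * 0.995) - 1]     # assumed top-0.5% trim point
    w = [min(x, cap) for x in w]                 # trim the largest weights
    mean = sum(w) / len(w)
    return [x / mean for x in w]                 # rescale to an average of one

w = nonresponse_weights([0.5, 0.4, 0.8, 0.25])
assert abs(sum(w) / len(w) - 1.0) < 1e-9   # standardised around an average of one
```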
Calibration weighting
The final stage of weighting was to adjust the final non-response weight so that the weighted sample matched the population in terms of age, sex and region.
Only adults aged 18 or over are eligible to take part in the survey; the data have therefore been weighted to the British population aged 18+, based on the 2017 Mid-Year Estimates from the Office for National Statistics/General Register Office for Scotland.
The survey data were weighted to the marginal age/sex and region distributions using calibration weighting. As a result, the weighted data should exactly match the population across these three dimensions. This is shown in Table A.2.
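Calibration to marginal totals is commonly done by iterative proportional fitting (raking); the two-dimension toy example below illustrates the idea with hypothetical categories and totals, and is not the survey's actual algorithm or population figures.

```python
def rake(weights, groups, targets, dims, iters=50):
    """Adjust weights so weighted margins match the target margins on each
    dimension, cycling through the dimensions (illustrative raking sketch)."""
    w = list(weights)
    for _ in range(iters):
        for d in range(dims):
            totals = {}
            for wi, g in zip(w, groups):
                totals[g[d]] = totals.get(g[d], 0.0) + wi
            w = [wi * targets[d][g[d]] / totals[g[d]] for wi, g in zip(w, groups)]
    return w

# four respondents classified by (sex, region), with hypothetical targets
groups = [("M", "N"), ("M", "S"), ("F", "N"), ("F", "S")]
weights = [1.0, 1.0, 1.0, 1.0]
targets = [{"M": 50, "F": 50}, {"N": 60, "S": 40}]
w = rake(weights, groups, targets, dims=2)
assert abs(w[0] + w[1] - 50) < 1e-6   # male margin matched
assert abs(w[0] + w[2] - 60) < 1e-6   # north margin matched
```

After convergence the weighted sample reproduces every marginal total exactly, which is why Table A.2's final-weight columns match the population columns.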
1 Population density refers to the number of people per unit of area. This was achieved by calculating the ratio between the number of people in private households in each PSU divided by the area of each PSU in hectares.
Table A.2 Weighted and unweighted sample distribution, by region, age and sex
Population | Unweighted respondents | Weighted by selection weight only | Weighted by un-calibrated non-response weight | Weighted by final weight
Region % % % % %
North East 4.2 4.6 4.7 4.2 4.2
North West 11.3 11.4 11.0 10.8 11.3
Yorks. and Humber 8.5 10.5 10.2 8.7 8.5
East Midlands 7.5 9.3 9.2 7.7 7.5
West Midlands 9.0 7.6 7.5 8.6 9.0
East of England 9.6 10.2 10.3 9.9 9.6
London 13.5 10.0 10.9 13.3 13.5
South East 14.1 13.4 13.7 14.3 14.1
South West 8.8 9.5 9.5 9.1 8.8
Wales 4.9 4.4 4.5 5.0 4.9
Scotland 8.7 9.1 8.5 8.6 8.7
Age & sex % % % % %
M 18–24 5.5 2.6 3.8 4.0 5.5
M 25–34 8.6 5.5 6.2 6.6 8.6
M 35–44 7.9 6.2 6.4 6.5 7.9
M 45–54 8.7 7.5 7.6 7.8 8.7
M 55–59 4.1 4.2 4.4 4.5 4.1
M 60–64 3.5 4.1 4.2 4.0 3.5
M 65+ 10.8 13.5 12.8 12.2 10.8
F 18–24 5.2 3.0 4.0 4.1 5.2
F 25–34 8.5 7.7 7.6 8.1 8.5
F 35–44 8.0 10.0 10.0 10.1 8.0
F 45–54 8.9 8.8 9.8 9.6 8.9
F 55–59 4.2 4.4 4.7 4.7 4.2
F 60–64 3.6 4.4 4.1 4.0 3.6
F 65+ 12.4 18.0 14.3 13.9 12.5
Base 50,792,629 3,879 3,879 3,879 3,879
The calibration weight (WtFactor) is the final non-response weight to be used in the analysis of CAPI (face-to-face) variables in the 2018 survey; this weight has been scaled to the responding sample size. The range of the weights is given in Table A.4.
Self-completion weighting
The BSA survey requires respondents to answer a self-completion questionnaire in addition to the face-to-face interview. The rate of self-completion response differs from survey to survey but has trended downwards in recent years. In 2018, 79% of respondents completed a valid self-completion questionnaire, compared with 82% in both 2016 and 2017.
As in previous years, we investigated differences between the profile of respondents who completed the self-completion questionnaire and those who did not. In 2018, unlike in recent years, this analysis showed statistically significant differences between the profiles of those who completed a valid self-completion questionnaire and those who did not. It was therefore deemed appropriate to generate a self-completion non-response weight that takes account of some of the systematic underlying factors that may be driving these differences. In addition to the census data and interviewer observations used to compute the non-response component of the main weight, other respondent characteristics were also considered for the self-completion weight. More is known about individuals who did not complete the self-completion questionnaire than about those who did not participate in the BSA at all, and this additional information can be leveraged when modelling the characteristics of those who filled in the self-completion questionnaire compared with those who did not.
As when modelling non-response for the main survey weight, logistic regression was used (with the dependent variable indicating whether the respondent completed their self-completion questionnaire). A number of demographic characteristics, responses to politics-related questions (asked in the face-to-face interview), as well as area-level and interviewer observation variables, were used to model response. Again, not all the variables examined were retained for the final model; variables which were found to not be significantly associated with an individual’s propensity to complete the self-completion supplement were excluded.
Among the key findings were that respondents from a BAME background were significantly less likely to complete the self-completion supplement, as were those without any educational qualifications (compared with those with a degree). Likewise, those with 'little' or 'no' interest in politics were less likely to fill in the self-completion questionnaire. The full results from the model are presented in Table A.3.
Table A.3 The final self-completion non-response model
Variable B S.E. Wald Df Sig. Odds
Respondent sex/age 58.337 13 .000
Male 18-24 (baseline)
Male 25-34 -.005 .204 .001 1 .982 .995
Male 35-44 -.235 .204 1.322 1 .250 .791
Male 45-54 .399 .210 3.619 1 .057 1.490
Male 55-59 .366 .255 2.060 1 .151 1.442
Male 60-64 .275 .272 1.016 1 .314 1.316
Male 65+ .645 .216 8.945 1 .003 1.905
Female 18-24 .816 .252 10.499 1 .001 2.261
Female 25-34 .020 .204 .010 1 .921 1.020
Female 35-44 .259 .211 1.510 1 .219 1.296
Female 45-54 .619 .216 8.198 1 .004 1.857
Female 55-59 .384 .256 2.252 1 .133 1.468
Female 60-64 .801 .295 7.379 1 .007 2.227
Female 65+ .760 .212 12.890 1 .000 2.137
Region 54.292 10 .000
North East (baseline)
North West -.187 .242 .592 1 .441 .830
Yorkshire and The Humber -.191 .251 .574 1 .449 .827
East Midlands .404 .278 2.111 1 .146 1.497
West Midlands -.836 .242 11.966 1 .001 .433
East of England .027 .254 .011 1 .916 1.027
London .113 .248 .206 1 .650 1.119
South East -.076 .241 .098 1 .754 .927
South West -.360 .253 2.030 1 .154 .698
Wales -.359 .276 1.685 1 .194 .699
Scotland -.283 .251 1.277 1 .258 .753
Ethnicity
White (baseline)
BAME -.770 .114 45.452 1 .000 .463
Highest qualification 30.452 3 .000
Degree (baseline)
A-level / higher (below degree) -.156 .122 1.643 1 .200 .856
O-level/ GCSE -.518 .122 17.916 1 .000 .596
No qualifications / other -.645 .138 21.883 1 .000 .524
Table A.3 The final self-completion non-response model (continued)
Variable B S.E. Wald Df Sig. Odds
Interest in politics 15.324 4 .004
A great deal (baseline)
Quite a lot .063 .149 .178 1 .673 1.065
Some -.098 .143 .469 1 .494 .907
Not very much -.331 .153 4.692 1 .030 .718
None at all -.422 .175 5.840 1 .016 .656
Internet access
Yes (baseline)
No -.474 .150 9.996 1 .002 .623
Type of dwelling 13.644 3 .003
Detached house (baseline)
Semi-detached house -.049 .123 .155 1 .694 .953
Terraced house (including end of terrace) .001 .126 .000 1 .991 1.001
Flat or maisonette and other
-.410 .142 8.350 1 .004 .664
Relative condition of the address 6.186 2 .045
Better (baseline)
About the same .331 .148 5.000 1 .025 1.393
Worse .479 .210 5.190 1 .023 1.615
Constant 2.743 .371 54.725 1 .000 15.540
The response is 1 = self-completion supplement returned (response), 0 = self-completion supplement not returned (non-response). Variables that were significant at the 0.05 level were included in the model. The model R2 is 0.106 (Cox and Snell). B is the estimated coefficient with standard error S.E. The Wald test measures the impact of the categorical variable on the model with the appropriate number of degrees of freedom (df). If the test is significant (sig. < 0.05), then the categorical variable is considered to be 'significantly associated' with the response variable and therefore included in the model.
The self-completion weight was calculated as the inverse of the predicted response probabilities saved from the logistic regression model. These were then combined with the main BSA weight to create the final self-completion weight. The top 0.5% of the weights were trimmed before the weight was scaled to the achieved sample size (resulting in the weights being standardised around an average of one).
The self-completion weight (WtFactorSC) is the weight to be used in the analysis of self-completion variables on the 2018 survey; this weight has been scaled to the responding self-completion sample
size. The range of the weights is given in Table A.4.
Table A.4 Range of weights
N Minimum Mean Maximum
DU and person selection weight 3,879 .55 1.00 2.20
Un-calibrated non-response and selection weight 3,879 .33 1.00 3.61
Final calibrated non-response weight 3,879 .28 1.00 4.39
Self-completion non-response weight 3,065 .28 1.00 3.97
Effective sample size
The effect of the sample design on the precision of survey estimates is indicated by the effective sample size (neff). The effective sample size measures the size of an (unweighted) simple random sample that would achieve the same precision (standard error) as the design being implemented. If the effective sample size is close to the actual sample size, then we have an efficient design with a good level of precision. The lower the effective sample size is, the lower the level of precision. The efficiency of a sample is given by the ratio of the effective sample size to the actual sample size. Samples that select one person per household tend to have lower efficiency than samples that select all household members. The final calibrated non-response weights have an effective sample size (neff) of 2,936 and efficiency of 76%.
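The weighting component of this calculation can be approximated by Kish's formula, neff = (Σw)² / Σw², sketched below; the example weights are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: neff = (sum of weights)^2 / sum of squared weights."""
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

assert effective_sample_size([1.0] * 100) == 100.0   # equal weights: fully efficient
neff = effective_sample_size([0.5, 1.5, 1.0, 1.0])
assert neff < 4                                      # unequal weights reduce precision
```

On this basis, efficiency is neff divided by the actual sample size: 2,936 / 3,879 ≈ 76%, matching the figures quoted above.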
Weighted bases
All the percentages presented in this report are based on weighted data. Only unweighted bases are presented in the tables. Details of weighted bases for standard demographics are shown in Table A.5.
Table A.5 Weighted bases for standard demographics, 2018
Weighted base Unweighted base
Sex
Male 1904 1691
Female 1975 2188
Age
18-24 417 218
25-34 664 512
35-44 622 629
45-54 682 631
55-59 320 335
60-64 274 330
65+ 896 1217
Ethnicity
White 3296 3433
Black and Minority Ethnic 571 432
Class group (NSSEC)
Managerial & professional occupations 1544 1601
Intermediate occupations 510 534
Employers in small org; own account workers 383 375
Lower supervisory & technical occupations 292 280
Semi-routine & routine occupations 941 930
Highest educational qualification
Degree 1061 1009
Higher education below degree 423 449
A level or equivalent 631 567
O level or equivalent 706 680
CSE or equivalent 288 297
Foreign or other 48 49
No qualification 699 806
Marital status
Married 1989 1769
Living as married 408 324
Separated or divorced after marrying 352 558
Widowed 224 414
Not married 903 812
Questionnaire versions

Each address in each sector (sampling point) was allocated to one of the portions of the sample: A, B, C or D. As mentioned earlier, a different version of the questionnaire was used with each of the four sample portions. If one serial number was version A, the next was version B, the third version C and the fourth version D. There were 2,567 issued addresses for each of the versions of the sample.
Fieldwork
Interviewing was mainly carried out between July and October 2018, with a small number of interviews taking place in November 2018.
Fieldwork was conducted by interviewers drawn from the National Centre for Social Research's regular panel, using face-to-face computer-assisted interviewing.3 Interviewers attended a one-day briefing conference to familiarise them with the selection procedures and questionnaires; experienced interviewers instead completed a self-briefing covering updates to the questionnaire and procedures.
The mean interview length was 64 minutes for version A of the questionnaire, 55 minutes for version B, 62 minutes for version C and 63 minutes for version D.4 Interviewers achieved an overall response rate of between 41.9% and 42.4%. Details are shown in Table A.6.
Table A.6 Response rate on British Social Attitudes, 2018
Number | Lower limit of response (%) | Upper limit of response (%)
Addresses issued 10,270
Out of scope 1,023
Upper limit of eligible cases 9,247 100
Uncertain eligibility 99 1.1
Lower limit of eligible cases 9,148 100
Interview achieved 3,879 41.9 42.4
With self-completion 3,065
Interview not achieved 5,269 57.0 57.6
Refused 3,792 41.0 41.5
Non-contacted 892 9.6 9.8
Other non-response 585 6.3 6.4
Response is calculated as a range: the lower limit assumes that all unknown-eligibility cases (for example, address inaccessible, or unknown whether address is residential) are eligible and therefore includes them among the unproductive outcomes; the upper limit assumes that all such cases are ineligible and therefore excludes them from the response calculation.
'Refused' comprises refusals before selection of an individual at the address, refusals to the office, refusal by the selected person, 'proxy' refusals (on behalf of the selected respondent) and broken appointments after which the selected person could not be re-contacted.
'Non-contacted' comprises households where no one was contacted and those where the selected person could not be contacted.
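The lower and upper limits in Table A.6 follow directly from the counts above:

```python
# Reproducing the response-rate range logic from Table A.6: the lower limit
# treats all unknown-eligibility cases as eligible, the upper limit as ineligible.
issued, out_of_scope, uncertain, achieved = 10_270, 1_023, 99, 3_879

upper_eligible = issued - out_of_scope        # 9,247 eligible at most
lower_eligible = upper_eligible - uncertain   # 9,148 eligible at least

lower_rate = 100 * achieved / upper_eligible  # unknowns counted as eligible
upper_rate = 100 * achieved / lower_eligible  # unknowns excluded

assert round(lower_rate, 1) == 41.9
assert round(upper_rate, 1) == 42.4
```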
As in earlier rounds of the series, the respondent was asked to fill in a self-completion questionnaire which, whenever possible, was collected by the interviewer. Otherwise, the respondent was asked to post it to NatCen Social Research.
A total of 814 respondents (21% of those interviewed) did not return their self-completion questionnaire. Version A of the self-completion questionnaire was returned by 78% of respondents to the face-to-face interview, version B by 80%, version C by 80% and version D by 78%.
Advance letter
Advance letters describing the purpose of the survey and the coverage of the questionnaire were sent to sampled addresses before the interviewer made their first call.5
Analysis variables
A number of standard analyses have been used in the tables that appear in this report. The analysis groups requiring further definition are set out below. For further details see Stafford and Thomson (2006). Where relevant, the name of the analysis variable is shown in square brackets – for example, [HHIncQ].
Region
The dataset is classified by 12 regions, formerly the Government Office Regions.
Standard Occupational Classification
Respondents are classified according to their own occupation, not that of the 'head of household'. Each respondent was asked about their current or last job, so that all respondents except those who had never worked were coded. Job details were also collected for spouses and partners in work.
Since the 2011 survey, we have coded occupation to the Standard Occupational Classification 2010 (SOC 2010) instead of the Standard Occupational Classification 2000 (SOC 2000). The main socio-economic grouping based on SOC 2010 is the National Statistics Socio-Economic Classification (NS-SEC). However, to maintain time-series, some analysis has continued to use the older schemes based on SOC 90 – Registrar General’s Social Class and Socio-Economic Group – though these are now derived from SOC 2000 (which is derived from SOC 2010).
National Statistics Socio-Economic Classification (NS-SEC)
The combination of SOC 2010 and employment status for current or last job generates the following NS-SEC analytic classes:
• Employers in large organisations, higher managerial and professional
• Lower professional and managerial; higher technical and supervisory
• Intermediate occupations
• Small employers and own account workers
• Lower supervisory and technical occupations
• Semi-routine occupations
• Routine occupations
The remaining respondents are grouped as “never had a job” or “not classifiable”. For some analyses, it may be more appropriate to classify respondents according to their current socio-economic status, which takes into account only their present economic position. In this case, in addition to the seven classes listed above, the remaining respondents not currently in paid work fall into one of the following categories: “not classifiable”, “retired”, “looking after the home”, “unemployed” or “others not in paid occupations”.
Registrar General’s Social Class
As with NS-SEC, each respondent’s social class is based on his or her current or last occupation. The combination of SOC 90 with employment status for current or last job generates the following six social classes:
I Professional etc. occupations ('non-manual')
II Managerial and technical occupations ('non-manual')
III (Non-manual) Skilled occupations ('non-manual')
III (Manual) Skilled occupations ('manual')
IV Partly skilled occupations ('manual')
V Unskilled occupations ('manual')
They are usually collapsed into four groups: I & II, III Non-manual, III Manual, and IV & V.
Socio-Economic Group
As with NS-SEC, each respondent’s Socio-Economic Group (SEG) is based on his or her current or last occupation. SEG aims to bring
together people with jobs of similar social and economic status, and is derived from a combination of employment status and occupation. The full SEG classification identifies 18 categories, but these are usually condensed into six groups:
• Professionals, employers and managers
• Intermediate non-manual workers
• Junior non-manual workers
• Skilled manual workers
• Semi-skilled manual workers
• Unskilled manual workers
As with NS-SEC, the remaining respondents are grouped as “never had a job” or “not classifiable”.
Industry
All respondents whose occupation could be coded were allocated a Standard Industrial Classification 2007 (SIC 07). Two-digit class codes are used. As with social class, SIC may be generated on the basis of the respondent’s current occupation only, or on his or her most recently classifiable occupation.
Party identification
Respondents can be classified as identifying with a particular political party on one of three counts: if they consider themselves supporters of that party, feel closer to it than to others, or would be more likely to support it in the event of a general election. The three groups are generally described respectively as 'partisans', 'sympathisers' and 'residual identifiers'. In combination, the three groups are referred to as 'identifiers'. Responses are derived from the following questions:
Generally speaking, do you think of yourself as a supporter of any one political party? [Yes/No]
[If “No”/“Don’t know”]
Do you think of yourself as a little closer to one political party than to the others? [Yes/No]
[If “Yes” at either question or “No”/“Don’t know” at 2nd question]
Which one?/If there were a general election tomorrow, which political party do you think you would be most likely to support?
[Conservative; Labour; Liberal Democrat; Scottish National Party; Plaid Cymru; Green Party; UK Independence Party (UKIP)/Veritas; British National Party (BNP)/National Front; RESPECT/Scottish Socialist Party (SSP)/Socialist Party; Other party; Other answer; None; Refused to say]
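The three-step derivation above can be sketched as follows; the function and answer labels are illustrative, not the dataset's actual variable names.

```python
def classify_identifier(supporter, closer, would_support):
    """Apply the three counts in order: supporter -> partisan; closer to one
    party -> sympathiser; would support one at an election -> residual
    identifier; otherwise the respondent is not an identifier."""
    if supporter == "Yes":
        return "partisan"
    if closer == "Yes":
        return "sympathiser"
    if would_support not in (None, "None", "Refused to say"):
        return "residual identifier"
    return "non-identifier"

assert classify_identifier("Yes", None, None) == "partisan"
assert classify_identifier("No", "Yes", None) == "sympathiser"
assert classify_identifier("No", "No", "Labour") == "residual identifier"
assert classify_identifier("No", "No", "None") == "non-identifier"
```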
Income
Respondents' household income is classified by the standard BSA variable [HHInc]. The bandings used are designed to be representative of those that exist in Britain; they are taken from the Family Resources Survey and adjusted for inflation. Two derived variables give income deciles and quartiles: [HHIncD] and [HHIncQ].
In 2018, BSA included some more detailed questions on respondents' individual and household income. The dataset includes a derived variable based on these data, [eq_inc_quintiles], which gives net equivalised household income after housing costs in quintiles. More detailed income data can be made available to researchers on request.
Attitude scales
Since 1986, the BSA surveys have included two attitude scales which aim to measure where respondents stand on certain underlying value dimensions – left–right and libertarian–authoritarian.6 Since 1987 (except in 1990), a similar scale on ‘welfarism’ has also been included. Some of the items in the welfarism scale were changed in 2000–2001. The current version of the scale is shown below.
A useful way of summarising the information from a number of questions of this sort is to construct an additive index (Spector, 1992; DeVellis, 2003). This approach rests on the assumption that there is an underlying – ‘latent’ – attitudinal dimension which characterises the answers to all the questions within each scale. If so, scores on the index are likely to be a more reliable indication of the underlying attitude than the answers to any one question.
Each of these scales consists of a number of statements to which the respondent is invited to “agree strongly”, “agree”, “neither agree nor disagree”, “disagree” or “disagree strongly”.
The items are:
Left–right scale
Government should redistribute income from the better off to those who are less well off [Redistrb]
Big business benefits owners at the expense of workers [BigBusnN]
Ordinary working people do not get their fair share of the nation’s wealth [Wealth]7
There is one law for the rich and one for the poor [RichLaw]
Management will always try to get the better of employees if it gets the chance [Indust4]
Libertarian–authoritarian scale
Young people today don’t have enough respect for traditional British values. [TradVals]
People who break the law should be given stiffer sentences. [StifSent]
For some crimes, the death penalty is the most appropriate sentence. [DeathApp]
Schools should teach children to obey authority. [Obey]
The law should always be obeyed, even if a particular law is wrong. [WrongLaw]
Censorship of films and magazines is necessary to uphold moral standards. [Censor]
Welfarism scale
The welfare state encourages people to stop helping each other. [WelfHelp]
The government should spend more money on welfare benefits for the poor, even if it leads to higher taxes. [MoreWelf]
Around here, most unemployed people could find a job if they really wanted one. [UnempJob]
Many people who get social security don’t really deserve any help. [SocHelp]
Most people on the dole are fiddling in one way or another. [DoleFidl]
If welfare benefits weren’t so generous, people would learn to stand on their own two feet. [WelfFeet]
Cutting welfare benefits would damage too many people’s lives. [DamLives]
The creation of the welfare state is one of Britain’s proudest achievements. [ProudWlf]
The indices for the three scales are formed by scoring the leftmost, most libertarian or most pro-welfare position as 1, and the rightmost, most authoritarian or most anti-welfare position as 5. The “neither agree nor disagree” option is scored as 3. The scores on all the questions in each scale are added and then divided by the number of items in the scale, giving indices ranging from 1 (leftmost, most libertarian, most pro-welfare) to 5 (rightmost, most authoritarian, most anti-welfare). The scores on the three indices have been placed on the dataset.8
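The scoring rule can be sketched as follows. The item names and the choice of which items need reverse-coding are illustrative assumptions for the example, not taken from the BSA documentation:

```python
def scale_score(responses, reverse_items=frozenset()):
    """Additive index from Likert answers coded 1 ("agree strongly")
    to 5 ("disagree strongly"). Items where agreement marks the
    rightmost / most authoritarian / most anti-welfare end are
    reverse-coded (6 - score) so that 1 always marks the
    left / libertarian / pro-welfare pole."""
    scores = [6 - v if item in reverse_items else v
              for item, v in responses.items()]
    return sum(scores) / len(scores)

# Hypothetical two-item scale: reverse-coding item "B" flips 4 to 2
print(scale_score({"A": 2, "B": 4}))                       # 3.0
print(scale_score({"A": 2, "B": 4}, reverse_items={"B"}))  # 2.0
```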
The scales have been tested for reliability (as measured by Cronbach’s alpha). The Cronbach’s alpha coefficients (unstandardised items) for the scales in 2018 are 0.818 for the left–right scale, 0.785 for the libertarian–authoritarian scale and 0.828 for the welfarism scale. This level of reliability can be considered ‘good’ for the left–right and welfarism scales and ‘respectable’ for the libertarian–authoritarian scale (DeVellis, 2003: 95–96).
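Cronbach’s alpha can be computed directly from the item variances and the variance of the summed scores; a minimal sketch, assuming complete data (no missing values):

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list per scale item, each holding all respondents'
    scores on that item. Alpha rises towards 1 as the items
    co-vary more strongly."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))

# Two perfectly correlated items give alpha = 1.0
print(cronbach_alpha([[1, 2, 3], [1, 2, 3]]))  # 1.0
```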
Other analysis variables
These are taken directly from the questionnaire and to that extent are self-explanatory. The principal ones are:
• Sex
• Age
• Household income
• Economic position
• Religion
• Highest educational qualification obtained
• Marital status
• Benefits received
Sampling errors
No sample precisely reflects the characteristics of the population it represents, because of both sampling and non-sampling errors. If a sample were designed as a simple random sample (in which every adult had an equal and independent chance of inclusion), then we could calculate the sampling error of any percentage, p, using the formula:
s.e.(p) = √( p(100 − p) / n )
where n is the number of respondents on which the percentage is based. Once the sampling error had been calculated, it would be a straightforward exercise to calculate a confidence interval for the true population percentage. For example, a 95% confidence interval would be given by the formula:
p ± 1.96 × s.e.(p)
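In code, the two formulas above amount to:

```python
from math import sqrt

def srs_se(p, n):
    """Simple-random-sample standard error of a percentage p, base n."""
    return sqrt(p * (100 - p) / n)

def srs_ci95(p, n):
    """95% confidence interval for the true population percentage."""
    se = srs_se(p, n)
    return p - 1.96 * se, p + 1.96 * se

# For p = 50% and n = 100: s.e. = 5 points, CI = (40.2, 59.8)
print(srs_ci95(50, 100))
```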
Clearly, for a simple random sample (srs), the sampling error depends only on the values of p and n. However, simple random sampling is almost never used in practice, because of its inefficiency in terms of time and cost.
As noted above, the BSA sample, like that drawn for most large-scale surveys, was clustered according to a stratified multi-stage design into 395 postcode sectors (or combinations of sectors).
With a complex design like this, the sampling error of a percentage giving a particular response is not simply a function of the number of respondents in the sample and the size of the percentage; it also depends on how that percentage response is spread within and between sample points.
The complex design may be assessed relative to simple random sampling by calculating a range of design factors (DEFTs) associated with it, where:
DEFT = √( Variance of estimator with complex design, sample size n / Variance of estimator with srs design, sample size n )
and represents the multiplying factor to be applied to the simple random sampling error to produce its complex equivalent. A design factor of one means that the complex sample has achieved the same precision as a simple random sample of the same size. A design factor greater than one means the complex sample is less precise than its simple random sample equivalent. If the DEFT for a particular characteristic is known, a 95% confidence interval for a percentage may be calculated using the formula:
p ± 1.96 × complex sampling error (p)
= p ± 1.96 × DEFT × √( p(100 − p) / n )
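Multiplying the simple-random-sampling error by the DEFT gives the complex equivalent. As a check, plugging the Conservative identification row of Table A.7 (p = 27.3, DEFT = 1.247) and the full-sample base of 3,879 into this sketch reproduces the tabulated complex standard error of about 0.9:

```python
from math import sqrt

def complex_se(p, n, deft):
    """DEFT-adjusted standard error of a percentage p from a sample of n."""
    return deft * sqrt(p * (100 - p) / n)

def complex_ci95(p, n, deft):
    """95% confidence interval allowing for the complex sample design."""
    se = complex_se(p, n, deft)
    return p - 1.96 * se, p + 1.96 * se

print(round(complex_se(27.3, 3879, 1.247), 1))  # 0.9
```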
Table A.7 gives examples of the confidence intervals and DEFTs calculated for a range of different questions. Most background questions were asked of the whole sample, whereas many attitudinal questions were asked only of a quarter, half or three-quarters of the sample; some were asked on the interview questionnaire and some on the self-completion supplement.
Table A.7 Complex standard errors and confidence intervals of selected variables

Classification variables

| | % (p) | Complex s.e. of p | 95% CI lower | 95% CI upper | DEFT | Base |
| --- | --- | --- | --- | --- | --- | --- |
| Party identification (full sample) | | | | | | |
| Conservative | 27.3% | 0.9% | 25.6% | 29.1% | 1.247 | 1178 |
| Labour | 35.8% | 1.1% | 33.6% | 38.0% | 1.454 | 1329 |
| Liberal Democrat | 5.9% | 0.4% | 5.1% | 6.8% | 1.141 | 243 |
| Scottish National Party | 3.0% | 0.4% | 2.3% | 3.8% | 1.341 | 114 |
| Plaid Cymru | 0.4% | 0.1% | 0.2% | 0.7% | 1.075 | 11 |
| UK Independence Party (UKIP) | 1.4% | 0.2% | 1.1% | 1.9% | 1.106 | 64 |
| Green Party | 2.4% | 0.3% | 1.8% | 3.1% | 1.339 | 90 |
| None | 16.4% | 0.9% | 14.8% | 18.2% | 1.465 | 568 |
| Housing tenure (full sample) | | | | | | |
| Owns | 63.4% | 1.1% | 61.2% | 65.5% | 1.419 | 2528 |
| Rents from local authority | 9.1% | 0.7% | 7.9% | 10.5% | 1.425 | 378 |
| Rents privately/HA | 25.8% | 1.0% | 23.8% | 27.9% | 1.479 | 912 |
| Age of completing continuous full-time education (full sample) | | | | | | |
| 16 or under | 40.3% | 1.0% | 38.4% | 42.2% | 1.228 | 1760 |
| 17 or 18 | 23.1% | 0.8% | 21.5% | 24.8% | 1.228 | 857 |
| 19 or over | 31.7% | 0.9% | 29.9% | 33.5% | 1.242 | 1157 |
| Does anyone have access to the internet from this address? (full sample) | | | | | | |
| Yes | 92.5% | 0.4% | 91.7% | 93.2% | 0.911 | 3441 |
| No | 7.5% | 0.4% | 6.8% | 8.3% | 0.912 | 437 |
| Can I just check, would you describe the place where you live as... (full sample) | | | | | | |
| ...a big city | 17.1% | 1.4% | 14.4% | 20.1% | 2.382 | 543 |
| the suburbs or outskirts of a big city | 22.2% | 1.6% | 19.1% | 25.5% | 2.440 | 842 |
| a small city or town | 41.9% | 1.8% | 38.4% | 45.4% | 2.252 | 1684 |
| a country village | 15.7% | 1.0% | 13.8% | 17.9% | 1.773 | 676 |
| or, a farm or home in the country? | 2.5% | 0.4% | 1.8% | 3.4% | 1.576 | 102 |
Table A.7 Complex standard errors and confidence intervals of selected variables (continued)

Attitudinal variables (face-to-face interview)

| | % (p) | Complex s.e. of p | 95% CI lower | 95% CI upper | DEFT | Base |
| --- | --- | --- | --- | --- | --- | --- |
| Some people say there is very little real poverty in Britain today. Others say there is quite a lot. Which comes closest to your view? (three-quarters of sample) | | | | | | |
| ...that there is very little real poverty in Britain | 31.1% | 1.1% | 28.9% | 33.3% | 1.285 | 896 |
| or, that there is quite a lot? | 64.6% | 1.1% | 62.4% | 66.8% | 1.256 | 1,865 |
| How much interest do you generally have in what is going on in politics... (full sample) | | | | | | |
| ...a great deal | 14.1% | 0.7% | 12.8% | 15.6% | 1.259 | 562 |
| quite a lot | 24.0% | 0.8% | 22.4% | 25.6% | 1.206 | 951 |
| some | 31.0% | 0.9% | 29.1% | 32.9% | 1.279 | 1,216 |
| not very much | 20.1% | 0.8% | 18.6% | 21.7% | 1.250 | 743 |
| or, none at all? | 10.8% | 0.7% | 9.6% | 12.2% | 1.327 | 407 |
| How satisfied or dissatisfied are you with local doctors or GPs... (one quarter of sample) | | | | | | |
| ...satisfied | 63.0% | 1.9% | 59.1% | 66.7% | 1.246 | 617 |
| ...neither | 12.6% | 1.3% | 10.2% | 15.3% | 1.211 | 115 |
| ...dissatisfied? | 24.1% | 1.6% | 21.0% | 27.4% | 1.186 | 238 |

Attitudinal variables (self-completion)

| | % (p) | Complex s.e. of p | 95% CI lower | 95% CI upper | DEFT | Base |
| --- | --- | --- | --- | --- | --- | --- |
| Censorship of films and magazines is necessary to uphold moral standards (full sample) | | | | | | |
| Agree | 51.6% | 1.2% | 49.2% | 54.0% | 1.349 | 1660 |
| Neither agree nor disagree | 24.1% | 0.9% | 22.2% | 26.0% | 1.225 | 698 |
| Disagree | 22.6% | 1.0% | 20.6% | 24.7% | 1.364 | 660 |
| Do unmarried couples who live together have the same legal rights as married couples? (three-quarters of sample) | | | | | | |
| Definitely do | 14.2% | 0.9% | 12.6% | 16.0% | 1.211 | 330 |
| Probably do | 33.1% | 1.2% | 30.8% | 35.5% | 1.216 | 745 |
| Probably do not | 22.7% | 1.0% | 20.8% | 24.8% | 1.186 | 526 |
| Definitely do not | 16.8% | 0.9% | 15.1% | 18.6% | 1.142 | 444 |
| If you were to consider your life in general, how happy or unhappy would you say you are? (half sample) | | | | | | |
| Happy | 88.9% | 1.0% | 86.8% | 90.7% | 1.235 | 1380 |
| Not happy | 9.2% | 0.9% | 7.5% | 11.2% | 1.263 | 142 |
The table shows that most of the questions asked of all sample members have a 95% confidence interval of around plus or minus two to three percentage points of the survey estimate. This means that we can be 95% confident that the true population percentage lies within two to three percentage points (in either direction) of the percentage we report.
Variables closely related to the geographic location of the respondent (for example, whether they live in a big city, a small town or a village) typically vary more between sample points. Consequently, the design effects calculated for these variables in a clustered sample are greater than those calculated for variables less strongly associated with area. Sampling errors for percentages based only on respondents to one version of the questionnaire, or on subgroups within the sample, are also larger than they would have been had the questions been asked of everyone.
Analysis techniques
Regression
Regression analysis aims to summarise the relationship between a ‘dependent’ variable and one or more ‘independent’ variables. It shows how well we can estimate a respondent’s score on the dependent variable from knowledge of their scores on the independent variables. It is often undertaken to support a claim that the phenomena measured by the independent variables cause the phenomenon measured by the dependent variable. However, the causal ordering, if any, between the variables cannot be verified or falsified by the technique. Causality can only be inferred through special experimental designs or through assumptions made by the analyst.
All regression analysis assumes that the relationship between the dependent and each of the independent variables takes a particular form. In linear regression, it is assumed that the relationship can be adequately summarised by a straight line. This means that a one-unit increase in the value of an independent variable is assumed to have the same impact on the value of the dependent variable, on average, irrespective of the previous values of those variables.
Strictly speaking the technique assumes that both the dependent and the independent variables are measured on an interval-level scale, although it may sometimes still be applied even where this is not the case. For example, one can use an ordinal variable (e.g. a Likert scale) as a dependent variable if one is willing to assume that there is an underlying interval-level scale and the difference between the observed ordinal scale and the underlying interval scale is due to random measurement error. Often the answers to a number of Likert-type questions are averaged to give a dependent variable that is more like a continuous variable. Categorical or nominal data can
be used as independent variables by converting them into dummy or binary variables; these are variables where the only valid scores are 0 and 1, with 1 signifying membership of a particular category and 0 otherwise.
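Dummy coding can be sketched as follows; in practice one category is usually dropped as the reference to avoid perfect collinearity among the dummies (the party labels are illustrative):

```python
def to_dummies(value, categories):
    """Expand one nominal value into 0/1 dummy variables, one per category."""
    return [1 if value == c else 0 for c in categories]

print(to_dummies("Labour", ["Conservative", "Labour", "None"]))  # [0, 1, 0]
```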
The assumptions of linear regression cause particular difficulties where the dependent variable is binary. The assumption that the relationship between the dependent and the independent variables is a straight line means that it can produce estimated values for the dependent variable of less than 0 or greater than 1. In this case it may be more appropriate to assume that the relationship between the dependent and the independent variables takes the form of an S-curve, where the impact on the dependent variable of a one-point increase in an independent variable becomes progressively less the closer the value of the dependent variable approaches 0 or 1. Logistic regression is an alternative form of regression which fits such an S-curve rather than a straight line. The technique can also be adapted to analyse multinomial non-interval-level dependent variables, that is, variables which classify respondents into more than two categories.
The two statistical scores most commonly reported from the results of regression analyses are:
A measure of variance explained: This summarises how well all the independent variables combined can account for the variation in respondents’ scores in the dependent variable. The higher the measure, the more accurately we are able in general to estimate the correct value of each respondent’s score on the dependent variable from knowledge of their scores on the independent variables.
A parameter estimate: This shows how much the dependent variable will change on average, given a one-unit change in the independent variable (while holding all other independent variables in the model constant). The parameter estimate has a positive sign if an increase in the value of the independent variable results in an increase in the value of the dependent variable. It has a negative sign if an increase in the value of the independent variable results in a decrease in the value of the dependent variable. If the parameter estimates are standardised, it is possible to compare the relative impact of different independent variables; those variables with the largest standardised estimates can be said to have the biggest impact on the value of the dependent variable.
Regression also tests for the statistical significance of parameter estimates. A parameter estimate is said to be significant at the 5% level if the range of the values encompassed by its 95% confidence interval (see also section on sampling errors) are either all positive or all negative. This means that there is less than a 5% chance that the association we have found between the dependent variable and the independent variable is simply the result of sampling error and does not reflect a relationship that actually exists in the general population.
Factor analysis
Factor analysis is a statistical technique which aims to identify whether there are one or more apparent sources of commonality to the answers given by respondents to a set of questions. It ascertains the smallest number of factors (or dimensions) which can most economically summarise all of the variation found in the set of questions being analysed. Factors are established where respondents who gave a particular answer to one question in the set tended to give the same answer as each other to one or more of the other questions in the set. The technique is most useful when a relatively small number of factors are able to account for a relatively large proportion of the variance in all of the questions in the set.
The technique produces a factor loading for each question (or variable) on each factor. Where questions have a high loading on the same factor, respondents who gave a particular answer to one of these questions tended to give a similar answer to the other questions. The technique is most commonly used in attitudinal research to try to identify the underlying ideological dimensions which apparently structure attitudes towards the subject in question.
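The intuition can be illustrated for just two items: the eigenvalues of their 2×2 correlation matrix are 1 ± r, so a single dominant factor emerges exactly when the items are strongly correlated. A toy sketch, not a full factor analysis:

```python
def correlation(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def two_item_eigenvalues(x, y):
    """Eigenvalues of the correlation matrix [[1, r], [r, 1]]: 1 + r and 1 - r."""
    r = correlation(x, y)
    return 1 + r, 1 - r

# Perfectly aligned answers: all the variance loads on one factor
print(two_item_eigenvalues([1, 2, 3], [2, 4, 6]))  # (2.0, 0.0)
```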
Table and figure conventions

The following conventions are used for tables and figures throughout the report.
1. Data in the tables are from the 2018 British Social Attitudes survey unless otherwise indicated.
2. Tables are percentaged as indicated by the percentage signs.
3. In tables, ‘*’ indicates less than 0.5% but greater than zero, and ‘–’ indicates zero.
4. When findings based on the responses of fewer than 100 respondents are reported in the text, reference is made to the small base size. These findings are excluded from line charts, but included in tables.
5. Percentages equal to or greater than 0.5 have been rounded up (e.g. 0.5% = 1%; 36.5% = 37%).
6. In many tables the proportions of respondents answering “Don’t know” or not giving an answer are not shown. This, together with the effects of rounding and weighting, means that percentages will not always add up to 100%.
7. The self-completion questionnaire was not completed by all respondents to the main questionnaire (see Technical details). Percentage responses to the self-completion questionnaire are based on all those who completed it.
8. The unweighted bases shown in the tables (the number of respondents who answered the question) are printed in small italics.
9. In time series line charts, survey readings are indicated by data markers. While the line between data markers indicates an overall pattern, where there is no data marker the position of the line cannot be taken as an accurate reading for that year.
International Social Survey Programme

The International Social Survey Programme (ISSP) is run by a group of research organisations in different countries, each of which undertakes to field annually an agreed module of questions on a chosen topic area. Since 1985, an ISSP module has been included in one of the BSA self-completion questionnaires. Each module is chosen for repetition at intervals to allow comparisons both between countries (membership currently stands at 42) and over time. In 2018, the chosen subject was Religion, and the module was carried on the A and B versions of the self-completion questionnaire (variables ruhappy – attathst). Further information on the ISSP is available on its website: www.issp.org.
Notes
1. Until 1991 all British Social Attitudes samples were drawn from the Electoral Register (ER). However, following concern that this sampling frame might be deficient in its coverage of certain population subgroups, a ‘splicing’ experiment was conducted in 1991. We are grateful to the Market Research Development Fund for contributing towards the costs of this experiment. Its purpose was to investigate whether a switch to PAF would disrupt the time-series – for instance, by lowering response rates or affecting the distribution of responses to particular questions. In the event, it was concluded that the change from ER to PAF was unlikely to affect time trends in any noticeable ways, and that no adjustment factors were necessary. Since significant differences in efficiency exist between PAF and ER, and because we considered it untenable to continue to use a frame that is known to be biased, we decided to adopt PAF as the sampling frame for future British Social Attitudes surveys. For details of the PAF/ER ‘splicing’ experiment, see Lynn and Taylor (1995).
2. This includes households not containing any adults aged 18 or over, vacant dwelling units, derelict dwelling units, non-resident addresses and other deadwood.
3. In 1993 it was decided to mount a split-sample experiment designed to test the applicability of Computer-Assisted Personal Interviewing (CAPI) to the British Social Attitudes survey series. CAPI has been used increasingly over the past decade as an alternative to traditional interviewing techniques. As the name implies, CAPI involves the use of a laptop computer during the interview, with the interviewer entering responses directly into the computer. One of the advantages of CAPI is that it significantly reduces both the amount of time spent on data processing and the number of coding and editing errors. There was, however, concern that a different interviewing technique might alter the distribution of responses and so affect the year-on-year consistency of British Social Attitudes data. Following the experiment, it was decided to change over to CAPI completely in 1994 (the self-completion questionnaire still being administered in the conventional way). The results of the experiment are discussed in the British Social Attitudes 11th Report (Lynn and Purdon, 1994).
4. Interview times recorded as less than 20 minutes were excluded, as these timings were likely to be errors.
5. An experiment was conducted on the 1991 British Social Attitudes survey (Jowell et al., 1992) which showed that sending advance letters to sampled addresses before fieldwork begins has very little impact on response rates. However, interviewers do find that an advance letter helps them to introduce the survey on the doorstep, and a majority of respondents have said that they preferred some advance notice. For these reasons, advance letters have been used on the British Social Attitudes surveys since 1991.
6. Because of methodological experiments on scale development, the exact items detailed in this section have not been asked on all versions of the questionnaire each year.
7. In 1994 only, this item was replaced by: Ordinary people get their fair share of the nation’s wealth [Wealth1].
8. In constructing the scale, a decision had to be taken on how to treat missing values (“Don’t know” and “Not answered”). Respondents who had more than two missing values on the left–right scale and more than three missing values on the libertarian–authoritarian and welfarism scales were excluded from that scale. For respondents with fewer missing values, “Don’t know” was recoded to the midpoint of the scale and “Not answered” was recoded to the scale mean for that respondent on their valid items.
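The recoding described in this note can be sketched as follows (the string codes “DK” and “NA” are illustrative stand-ins for the survey’s missing-value codes):

```python
MIDPOINT = 3  # "neither agree nor disagree"

def recode(value, respondent_valid_mean):
    """Note-8 treatment of missing values for a scale item:
    "Don't know" -> scale midpoint; "Not answered" -> the respondent's
    mean score on their valid items; valid scores pass through unchanged."""
    if value == "DK":
        return MIDPOINT
    if value == "NA":
        return respondent_valid_mean
    return value

print(recode("DK", 2.5), recode("NA", 2.5), recode(4, 2.5))  # 3 2.5 4
```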
References
DeVellis, R.F. (2003), Scale Development: Theory and Applications, 2nd edition, Applied Social Research Methods Series, 26, Thousand Oaks, California: Sage
Evans, G., Heath, A. and Lalljee, M. (1996), ‘Measuring Left–Right and Libertarian–Authoritarian Values in the British Electorate’, The British Journal of Sociology, 47(1): 93–112
Jowell, R., Brook, L., Prior, G. and Taylor, B. (1992), British Social Attitudes: the 9th Report, Aldershot: Dartmouth
Lynn, P. and Purdon, S. (1994), ‘Time-series and lap-tops: the change to computer- assisted interviewing’, in Jowell, R., Curtice, J., Brook, L. and Ahrendt, D. (eds.), British Social Attitudes: the 11th Report, Aldershot: Dartmouth
Lynn, P. and Taylor, B. (1995), ‘On the bias and variance of samples of individuals: a comparison of the Electoral Registers and Postcode Address File as sampling frames’, The Statistician, 44: 173–194
Spector, P. (1992), Summated Rating Scale Construction: An Introduction, Quantitative Applications in the Social Sciences, 82, Newbury Park, California: Sage
Stafford, R. and Thomson, K. (2006), British Social Attitudes and Young People’s Social Attitudes surveys 2003: Technical Report, London: The National Centre for Social Research
Publication details

Curtice, J., Clery, E., Perry, J., Phillips, M. and Rahim, N. (eds.) (2019), British Social Attitudes: The 36th Report, London: The National Centre for Social Research
© The National Centre for Social Research 2019
First published 2019
You may print out, download and save this publication for your non-commercial use. Otherwise, and apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to The National Centre for Social Research.
The National Centre for Social Research 35 Northampton Square London EC1V 0AX
ISBN: 978-1-5272-4448-1