Reports, News of 5 March 2019

Quality Control in the IAB-BAMF-SOEP Survey of Refugees

Following the identification of interviews in the first wave of the IAB-BAMF-SOEP Survey of Refugees in 2016 that were not conducted in line with the standards of the IAB-BAMF-SOEP group (news and documentation), the project partners and the fieldwork institute substantially enhanced and reinforced their quality control and quality assurance processes. In addition to improvements in fieldwork monitoring by the fieldwork institute and in standard procedures for monitoring statistical anomalies, a new procedure has been developed to identify statistical anomalies in interviewer data (Kosyakova et al. 2019).

Using this new statistical procedure, three further interviewers showing statistical anomalies have been identified. All affected interviews were deleted from the dataset (version v.34) prior to distribution.

These enhanced monitoring procedures identified statistical anomalies for two further interviewers in addition to the previously identified interviewer who had conducted interviews in the first wave of the study (see Kosyakova et al. 2019). Although it has not been possible to determine conclusively whether these interviewers failed to follow the proper procedures and standards of the IAB-BAMF-SOEP group in all of their interviews, the project partners have decided, in consultation with the survey research institute Kantar Public, to delete all interviews conducted by interviewers suspected of having failed to follow proper procedures. This means that 47 additional household interviews and 62 additional individual interviews have been deleted from the first wave. This leaves 4,465 respondents and 3,273 household interviews for analysis of the first wave of the IAB-BAMF-SOEP Survey of Refugees in version v.34 of the data.

Impacts on the Net Sample

In sum, these deletions have a minimal impact on the results. The 62 deleted individual interviews are equivalent to around 1.4 percent of the total sample (N = 4,527). This means that possible deviations in univariate statistics are bounded at roughly this level. Furthermore, the responses in these interviews are distributed in a relatively unsystematic way, so changes are likely to be negligible. This can be seen in the values for a range of variables, such as the distribution by age, gender, employment status, German proficiency, and completion of education and training (e.g., according to the ISCED classification). As Table 1 shows, the changes are small, mostly in the first decimal place and at most 1.3 percentage points. All values are weighted.
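The arithmetic behind this bound can be sketched as follows. This is a minimal illustration, not part of the project's own tooling: the function names are hypothetical, the category size of 2,000 is an invented example, and the bound shown holds for unweighted shares only (for weighted shares, the corresponding bound is the weight share of the deleted cases).

```python
def max_unweighted_shift(n_total: int, n_removed: int) -> float:
    """Upper bound on the change in any unweighted category share
    when n_removed of n_total cases are dropped."""
    return n_removed / n_total


def share_shift(k: int, j: int, n_total: int, n_removed: int) -> float:
    """Actual change in a share when j of the k category members
    are among the n_removed dropped cases."""
    before = k / n_total
    after = (k - j) / (n_total - n_removed)
    return after - before


N_V33 = 4527    # individual interviews before deletion (from the text)
N_DELETED = 62  # deleted individual interviews (from the text)

bound = max_unweighted_shift(N_V33, N_DELETED)
print(f"remaining respondents: {N_V33 - N_DELETED}")
print(f"deleted share (bound on unweighted shifts): {bound:.2%}")

# Hypothetical category with 2,000 members, none of whom were deleted.
example = share_shift(k=2000, j=0, n_total=N_V33, n_removed=N_DELETED)
print(f"example shift: {example:.2%}")

# Worst case: every remaining respondent belongs to the category.
worst = share_shift(k=N_V33 - N_DELETED, j=0, n_total=N_V33, n_removed=N_DELETED)
print(f"worst-case shift: {worst:.2%}")
```

The worst case is reached only when all deleted cases fall outside (or all inside) a category that covers nearly the whole sample, so realistic shifts are usually well below the bound.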

Table 1: Selected characteristics of respondents by data version in percentages (weighted)

 

| | v.33.1 | v.34 |
|---|---|---|
| Gender | | |
| Male | 73.6 | 73.3 |
| Female | 26.4 | 26.7 |
| Total | 100 | 100 |
| Observations | 4527 | 4465 |
| Age (grouped) | | |
| 18-29 | 54.5 | 54.6 |
| 30-59 | 43.2 | 43.2 |
| 60-83 | 2.3 | 2.2 |
| Total | 100 | 100 |
| Observations | 4525 | 4463 |
| Employment | | |
| Part-time | 2.9 | 2.8 |
| Apprenticeship, training | 1.2 | 1.2 |
| Marginal | 2.3 | 2.3 |
| Not employed | 88.1 | 88.0 |
| Internship | 2.5 | 2.6 |
| Total | 100 | 100 |
| Observations | 4527 | 4465 |
| Additive Index of German language proficiency (reading, speaking, writing) | | |
| Excellent | 24.4 | 25.3 |
| Medium | 19.4 | 19.8 |
| Poor | 56.2 | 54.9 |
| Total | 100 | 100 |
| Observations | 4523 | 4461 |
| Schooling according to the ISCED 11 classification | | |
| In school | 0.8 | 0.8 |
| Primary | 36.8 | 36.2 |
| Lower Secondary | 20.6 | 19.9 |
| Upper Secondary | 20.7 | 21.1 |
| Post-Secondary | 3.4 | 3.4 |
| Bachelor | 16.7 | 17.3 |
| Promotion | 1.1 | 1.1 |
| Total | 100 | 100 |
| Observations | 4167 | 4131 |
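To make the size of the changes in Table 1 easy to check, the following sketch computes the difference between the two data versions for each row and reports the largest one. It only re-uses the percentages printed in the table; the variable and function names are hypothetical, and the short group labels are abbreviations introduced here for readability.

```python
# (label, v.33.1, v.34) rows copied from Table 1;
# "Total" and "Observations" rows are omitted.
TABLE_1 = {
    "Gender": [("Male", 73.6, 73.3), ("Female", 26.4, 26.7)],
    "Age": [("18-29", 54.5, 54.6), ("30-59", 43.2, 43.2), ("60-83", 2.3, 2.2)],
    "Employment": [
        ("Part-time", 2.9, 2.8),
        ("Apprenticeship, training", 1.2, 1.2),
        ("Marginal", 2.3, 2.3),
        ("Not employed", 88.1, 88.0),
        ("Internship", 2.5, 2.6),
    ],
    "German proficiency": [
        ("Excellent", 24.4, 25.3),
        ("Medium", 19.4, 19.8),
        ("Poor", 56.2, 54.9),
    ],
    "Schooling (ISCED 11)": [
        ("In school", 0.8, 0.8),
        ("Primary", 36.8, 36.2),
        ("Lower Secondary", 20.6, 19.9),
        ("Upper Secondary", 20.7, 21.1),
        ("Post-Secondary", 3.4, 3.4),
        ("Bachelor", 16.7, 17.3),
        ("Promotion", 1.1, 1.1),
    ],
}


def changes(table):
    """Return (group, label, v.34 minus v.33.1) for every row,
    rounded to one decimal to match the table's precision."""
    return [
        (group, label, round(new - old, 1))
        for group, rows in table.items()
        for label, old, new in rows
    ]


diffs = changes(TABLE_1)
largest = max(diffs, key=lambda d: abs(d[2]))

for group, label, delta in diffs:
    print(f"{group:22s} {label:26s} {delta:+.1f}")
print("largest change:", largest)
```

Because the published percentages are themselves rounded to one decimal, each computed difference can be off by up to 0.1 percentage points relative to the unrounded data.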

Regional Impacts

With regard to regional biases, we can currently rule out the possibility that the deletions described above have compromised the random character of the sample. First, the design and nonresponse weights have been adjusted accordingly. Second, we have determined that none of the interviewers served any one Primary Sampling Unit (PSU, a regional cluster) exclusively. The regions affected therefore still have sufficient numbers of households both for analysis of the initial wave of the survey and for analysis of further survey waves. In total, seven PSUs are affected, with household interviews having been carried out in six of these. These PSUs are all in Bavaria. The deleted households make up around 1 percent of all household interviews conducted in this state (541).

The Second (2017) Wave of the Data

The enhanced quality control processes were used in the second wave of data collection from the beginning. As a result, any interviews that were not conducted according to proper procedures were identified at a very early stage and deleted (see Kosyakova et al. 2019). Households whose data were deleted only in the second wave, because interviews may not have been conducted according to proper procedures, are treated as temporary dropouts and were contacted again in the third wave in survey year 2018. The weighting factors for the first and second waves have been adjusted accordingly. By taking these steps, we have ensured the high quality and unrestricted usability of the data.