DOI 10.5684/soep.iab-soep-mig.2013

Release of the IAB-SOEP Migration Sample 2013


Title: IAB-SOEP Migration Sample 2013


Publication date: 17.10.2014


Principal investigators: Herbert Brücker, Jürgen Schupp

Co-PIs: Martin Kroh, Ingrid Tucci, Jan Goebel, Simone Bartsch, Elisabeth Liebau, Parvati Trübswetter

Affiliated Staff Members for providing the Scientific Use File: Elisabeth Bügelmayer, Klaudia Erhardt, Markus Grabka, Marco Giesselmann, Peter Krause, David Richter, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Carsten Schröder, Knut Wenzig

If you publish work based on these data, you must cite the following reference:

Herbert Brücker, Martin Kroh, Simone Bartsch, Jan Goebel, Simon Kühne, Elisabeth Liebau, Parvati Trübswetter, Ingrid Tucci & Jürgen Schupp (2014): The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents. SOEP Survey Paper 216, Series C. Berlin, Nuremberg: DIW Berlin.

Comprehensive documentation is available at (

Study information

The IAB-SOEP Migration Sample is a joint project of the Institute for Employment Research (IAB) and the Socio-Economic Panel (SOEP) at the German Institute for Economic Research (DIW Berlin). The project attempts to overcome limitations of previous datasets by drawing a sample that takes into account changes in the structure of migration to Germany since 1995. The dataset is an additional sample for the SOEPcore study and is therefore completely harmonized with the SOEP and integrated into SOEP v30 (identical questionnaire with additional questions on the respondent's migration situation). The study opens up new perspectives for migration research and gives insights into the living situations of new immigrants to Germany.

Data collector: TNS Infratest Sozialforschung GmbH.

Collection period: 2013


Data access

All SOEP users with a valid contract can order the data from the IAB-SOEP Migration Sample in the usual ways (SOEPhotline, website) without needing to sign any additional data distribution contracts. The dataset is available free of charge upon request via personalized secure download. New users can find information on the SOEP application process here.

The normal application process at the IAB RDC is necessary.

Data description

Data structure

The data structure is closely related to the structure used in the SOEPcore study. Each wave (survey year) is identified by letters of the alphabet: the first wave in 1984 is wave "A", 1985 is wave "B", 2009 is wave "Z", followed by wave "BA" in 2010 and so on. Survey year 2013 is therefore identified by the wave prefix "BD". For each year of SOEP data there are separate data files for households (e.g. BDH), for individual respondents (e.g. BDP and BDP_MIG), and for children (e.g. BDKIND), all based on interview information. These observations make up the net population: each of these files contains one record per interview that could be conducted. Additional data files with a limited number of variables, based on the address log, constitute the gross number of households and persons, i.e. all households and their members that were eligible for an interview in a given year. Please see the following table for an overview of all available datasets:

| dataset | label | period | analysis unit |
|---|---|---|---|
| ppfad | Individual Tracking File | | P |
| hpfad | Household Tracking File | | H |
| bdp_mig | Integrated personal and biographical questionnaire (specific for the Migration Sample) | 2013 | P |
| migspell | Migration biography in spell format | | P |
| bdp | Personal questionnaire | 2013 | P |
| bdh | Household questionnaire | 2013 | H |
| bdkind | Data on children (from HH-Questionnaire) | 2013 | P |
| bdpgen | Generated Individual Data | 2013 | P |
| bdpkal | Individual Calendar | 2013 | P |
| bdhgen | Generated Household Data | 2013 | H |
| mihinc | Multiple imputed data on monthly household income | | H |
| pflege | Persons needing care within the household | | P |
| bdhbrutto | Gross Household Data | 2013 | H |
| bdpbrutto | Gross Individual Data | 2013 | P |
| biobirth | Generated biographical information: Birth Biography of Female Respondents | | P |
| biobrthm | Generated biographical information: Birth Biography of Male Respondents | | P |
| biocouplm | Generated biographical information: couple history, monthly | | P |
| biocouply | Generated biographical information: couple history, annual | | P |
| bioimmig | Generated biographical information: Generated and Status Variables for Foreigners | | P |
| biomarsm | Generated biographical information: marital history files, monthly | | P |
| biomarsy | Generated biographical information: marital history files, annual | | P |
| bioparen | Generated biographical information: Biography Information for the Parents of SOEP-Respondents | | P |
| biosib | Generated biographical information: Information on siblings | | P |
| biosoc | Generated biographical information: Retrospective Data on Youth and Socialization | | P |
| biotwin | Generated biographical information: Twins in the SOEP | | P |
| pbiospe | Generated biographical information: Activity Biography | | P |
| cirdef | Random Groups | | H |
| design | Survey Design | | H |
| kidlong | Data on children | | P |
| lifespell | Spell Information on the Pre- and Post-Survey History of SOEP-Respondents | | P |
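
The wave-prefix scheme described under "Data structure" can be illustrated with a short Python sketch. It is not part of the official SOEP tooling: the function names `wave_prefix` and `dataset_name` are hypothetical, and the sketch assumes that the "BA" series simply continues letter by letter after 2010, which the text implies with "and so on" but does not spell out beyond 2013.

```python
import string

FIRST_WAVE_YEAR = 1984  # wave "A" according to the description above


def wave_prefix(year: int) -> str:
    """Return the wave prefix for a survey year (hypothetical helper).

    1984..2009 map to "A".."Z"; from 2010 on the prefix is "B" plus a
    letter ("BA", "BB", ...). Continuing this up to "BZ" is an assumption.
    """
    offset = year - FIRST_WAVE_YEAR
    letters = string.ascii_uppercase
    if offset < 0:
        raise ValueError("no wave before 1984")
    if offset < 26:
        return letters[offset]
    if offset < 52:
        return "B" + letters[offset - 26]
    raise ValueError("prefix scheme beyond 'BZ' is not described in the text")


def dataset_name(year: int, suffix: str) -> str:
    """Build a wave-specific dataset name such as 'bdp' or 'bdhgen'."""
    return wave_prefix(year).lower() + suffix


print(wave_prefix(2013))           # BD
print(dataset_name(2013, "p"))     # bdp
print(dataset_name(2013, "kind"))  # bdkind
```

Only the wave-specific files in the table (those with a period of 2013) follow this naming pattern; files such as ppfad or biobirth have fixed names.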

Missing conventions

Survey variables may be missing, i.e. lack a valid code or value, for different reasons. In the SOEP, negative values are not valid for any variable; they are used instead to code the different reasons for missing information. Missing values fall into two groups: they may originate in the respondent's answer or in the survey design. The following codes apply:

-1 no answer / don’t know
-2 does not apply
-3 implausible value
-4 Inadmissible multiple response
-5 Not included in this version of the questionnaire
-6 Version of questionnaire with modified filtering
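
Because every negative value denotes missing information, analyses should separate these codes from valid values before computing statistics. The following Python sketch shows one way to do so. The helper names `split_missing` and `valid_mean` and the example values are hypothetical and not taken from the data; the code labels are those listed above.

```python
# Missing-value codes as listed in the documentation above.
MISSING_CODES = {
    -1: "no answer / don't know",
    -2: "does not apply",
    -3: "implausible value",
    -4: "inadmissible multiple response",
    -5: "not included in this version of the questionnaire",
    -6: "version of questionnaire with modified filtering",
}


def split_missing(value):
    """Return (valid_value, missing_reason); exactly one of them is None."""
    if value < 0:
        # Negative values are never valid; codes not listed above are flagged.
        return None, MISSING_CODES.get(value, "undocumented negative code")
    return value, None


def valid_mean(values):
    """Mean over valid (non-negative) values only; None if there are none."""
    valid = [v for v in values if v >= 0]
    return sum(valid) / len(valid) if valid else None


# Made-up example values for illustration only.
answers = [1200, -1, 1800, -5, -2]
for a in answers:
    print(a, split_missing(a))
print(valid_mean(answers))
```

Averaging the raw values without this step would treat the negative codes as real numbers and bias the result downward.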

With the extension of the SOEP in recent years, entirely new samples, such as the IAB-SOEP Migration Sample, have been added to the core. In these samples, some questions are left out completely, e.g. to shorten the questionnaire or because the focus of the sample is different. In such cases, the variable is set to "-5 Not included in this version of the questionnaire" for the entire subsample.

With the use of CAPI (computer-assisted personal interviewing), recent developments include an integrated person and biography questionnaire, i.e. the biography part and the regular part of the questionnaire are administered as one. Some of the questions in the biography part are repeated in the regular part. In PAPI (paper-and-pencil) mode, the respondent answers the same question twice, whereas CAPI can route the respondent past a question that has already been asked. These cases are very rare; when they occur, they receive the code "-6 Version of questionnaire with modified filtering".

Additional documents

Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013 (SOEP Survey Paper 261)

How to Generate Spell Data from Data in "Wide" Format based on the migration biographies of the IAB-SOEP Migration Sample (SOEP Survey Paper 228)

Methodenbericht zum IAB-SOEP-Migrationssample 2013 (in German)