SOEP-Core v32 - Dataset Information

The German Socio-Economic Panel (SOEP) study is a wide-ranging, representative longitudinal study of private households, located at the German Institute for Economic Research, DIW Berlin. Every year, nearly 15,000 households and more than 25,000 persons are surveyed by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the Eastern and Western German states, foreigners, and immigrants to Germany. The panel was started in 1984. Topics include household composition, occupational biographies, employment, earnings, health, and satisfaction indicators.

As early as June 1990, even before the Economic, Social and Monetary Union, SOEP expanded to include the states of the former German Democratic Republic (GDR), thus seizing the rare opportunity to observe the transformation of an entire society. Immigrant samples were added in 1994/95 and 2013/2015 to account for the changes that took place in German society. Further new samples were added in 1998, 2000, 2002, 2006, 2009, 2010, 2011, and 2012.

Since Version 31 (10.5684/soep.v31), the SOEP has included the complete data from “Familien in Deutschland” (Families in Germany, FiD), which was retrospectively integrated into the SOEP and made available in user-friendly form to all SOEP users. The FiD survey was carried out in parallel to the SOEP as a so-called “SOEP-related study” from 2010 to 2013. The survey is constantly being adapted and developed in response to current social developments. The international version contains 95% of all cases surveyed (see 10.5684/soep.v32i).

Dataset Information

Title: Socio-Economic Panel (SOEP), data from 1984-2015

DOI: 10.5684/soep.v32
Collection period: 1984-2015
Publication date: December 14, 2016
Principal investigators: Jürgen Schupp, Jan Goebel, Martin Kroh, Carsten Schröder, Charlotte Bartels, Klaudia Erhardt, Alexandra Fedorets, Marco Giesselmann, Markus Grabka, Peter Krause, Simon Kühne, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig

Data collector: TNS Infratest Sozialforschung GmbH.

Population: Persons living in private households in Germany.

Selection method: All SOEP samples are multi-stage random samples that are regionally clustered. The respondents (households) are selected by random walk.

Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. As a rule, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household questionnaire covering housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

Data set information:

Number of units: 113,840
Number of variables: 61,902 in 413 data sets
Data format: STATA, SPSS, SAS, CSV

MD5 fingerprints

Distribution format | zip file | all files
Stata bilingual | 76ac7dacaf663934cc1511b08e3510d3 | TXT, 18.11 KB
Stata German | 088df0e5419d73f66d25b37ff6a41488 | TXT, 18.11 KB
Stata English | d7f41baeeaffcbd8083f431a60860de5 | TXT, 18.11 KB
SPSS German | 829d6893bc7ae8bb636158f13cc5dd6b | TXT, 18.11 KB
SPSS English | 48f1d9ef7314c38084f1ebe436b6ff39 | TXT, 18.11 KB
SAS German | cac3ec7ccaa11e4484dbc4ed3eec8fcc | TXT, 20.23 KB
SAS English | 52c77d38b5fd8d57ab2575591ac698ae | TXT, 20.23 KB
CSV | dc3497a6af769753cd7c31a8ff93f3b0 | TXT, 18.11 KB
GGKBOU | 0dd7977b178723ce512499d4bf7bb578 | TXT, 140 bytes
GGKBOU English | d6aa5c401047e51f7877818ffc202453 | TXT, 140 bytes

Teaching version | zip file
Stata German (teaching) | 020aa097295e452dd884992cd980a6ab
Stata English (teaching) | 4f08c9219fa5beb1df09bdcb2278b339
SPSS German (teaching) | 08c54677b3640b725ddca5fe5fcab5e9
SPSS English (teaching) | dd0c9903f53587be4b444dd228af0a6a
SAS German (teaching) | 8fd89a198861ff7ae6f9d73413fd2d75
SAS English (teaching) | 190d3260961cae7d9fcbcecf1ff4b8c9
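
A downloaded archive can be checked against the fingerprints listed above before it is unpacked. The following is a minimal sketch using Python's standard library; the local file name `soep_v32_stata_en.zip` is a hypothetical placeholder, and the function names are chosen here for illustration only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path, expected_md5):
    """Compare a file's MD5 digest with a published fingerprint (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local file name; the fingerprint is the "Stata English" entry.
    archive = Path("soep_v32_stata_en.zip")
    if archive.exists():
        ok = matches_fingerprint(archive, "d7f41baeeaffcbd8083f431a60860de5")
        print("fingerprint matches" if ok else "fingerprint MISMATCH")
```

A mismatch usually points to an incomplete or corrupted download, in which case the file should be fetched again.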

Publications:

  • Gert G. Wagner, Joachim R. Frick, and Jürgen Schupp (2007): The German Socio-Economic Panel Study (SOEP) - Scope, Evolution and Enhancements, Schmollers Jahrbuch (Journal of Applied Social Science Studies) 127 (1), 139-169.
  • Jürgen Schupp (2009): 25 Jahre Sozio-oekonomisches Panel - Ein Infrastrukturprojekt der empirischen Sozial- und Wirtschaftsforschung in Deutschland, Zeitschrift für Soziologie 38 (5), 350-357.
  • Gert G. Wagner, Jan Göbel, Peter Krause, Rainer Pischner, and Ingo Sieber (2008): Das Sozio-oekonomische Panel (SOEP): Multidisziplinäres Haushaltspanel und Kohortenstudie für Deutschland - Eine Einführung (für neue Datennutzer) mit einem Ausblick (für erfahrene Anwender), AStA Wirtschafts- und Sozialstatistisches Archiv 2 (4), 301-328.

SOEP-Core soep.v32.1

  • BIOCOUPLY and BIOMARSY: In the first version of the data delivery, incorrect data were uploaded by mistake for these two datasets. This version contains the correct datasets.
  • NACE in BFP and BFPGEN: A user reported implausible values for the variables BFP55_NACE and NACE15, which contain information on the industry of the current job. In this version, the information has been updated after a bug in the script was fixed.
  • Scale shift in BFP: In the v32 data release, the scales in BFP on the probability of specific events occurring in working life, which in previous years had been coded from 0-100 at 10-point intervals, were given on a scale from 0-10 for the CAPI and CAWI interviews. This inconsistency was corrected in the update by adapting the scales to the previously used coding: the scales bfp4201, bfp4202, bfp4203, bfp7201, bfp7202, and bfp7203 were multiplied by 10 where bfpinta = 9 or 10; in addition, one case in bfp7201 was changed from 4 to 40 where bfpinta = 8.
  • einstieg_artk and einstieg_pbio: Since data version 32, SOEP has offered two additional labor market entry variables as part of the BIOJOB file. They were constructed on the basis of employment history information accurate to the year and month. They refer to a generic, uniform definition of the first survey period after the transition from the educational system to the labor market. The construction of these variables is documented in detail in SOEP Survey Paper 429; a short version of the description is also available in the BIOJOB documentation. (SOEP Survey Paper 418)
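
The scale correction in BFP described above can be illustrated with a short sketch. It is only relevant for scripts that still read the original v32 files; v32.1 already contains the corrected values, so applying it there would scale the answers twice. The function name and the dict-per-record layout are assumptions made for illustration, as is the rule that negative values are missing codes and stay unchanged. The single manually corrected case (4 to 40 where bfpinta = 8) is not reproduced, because the note does not identify it.

```python
# Variables and interview modes named in the v32.1 release note.
SCALE_VARS = ("bfp4201", "bfp4202", "bfp4203", "bfp7201", "bfp7202", "bfp7203")
RESCALED_MODES = {9, 10}  # values of bfpinta for which the 0-10 scale was used


def harmonize_scales(record):
    """Return a copy of one BFP record with 0-10 answers mapped to the 0-100 coding."""
    fixed = dict(record)
    if fixed.get("bfpinta") in RESCALED_MODES:
        for name in SCALE_VARS:
            value = fixed.get(name)
            # Assumption: negative values are missing codes and are left untouched.
            if value is not None and value >= 0:
                fixed[name] = value * 10
    return fixed


if __name__ == "__main__":
    # Invented example record, not real SOEP data.
    print(harmonize_scales({"bfpinta": 9, "bfp4201": 5, "bfp7203": -2}))
```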

SOEP-Core soep.v32

The new data release (1984–2015) "SOEP.v32" provides, for the most recent survey year 2015, the usual wave-specific data files BFPBRUTTO, BFP, BFPEQUIV, BFP_MIG, BFPKAL, BFPGEN, BFPAGE17, BFHBRUTTO, BFH, BFHGEN, BFKIND, and BEPLUECKE as well as the updated files with a longitudinal component (PFAD files, biography files, spell data, and weighting factors).

1. New migrant subsample (M2)

In 2013, we conducted the first IAB-SOEP Migration Sample in partnership with the Institute for Employment Research (IAB) in Nuremberg (for an overview of M1, see SOEP Survey Paper 216). The households from the second IAB-SOEP Migration Sample, surveyed in 2015, are now also included in the SOEP data. The target population of the second IAB-SOEP Migration Sample consists of immigrants who arrived in Germany between 2010 and 2013. Migrants from the new EU member states in Eastern Europe dominate this group. This focus will make it possible to better describe the dynamic recent evolution of immigration to Germany. Sample M2 consists of 1,096 households and was, like sample M1, drawn from register data of the Federal Employment Agency.

Record Linkage

Please note that data from both samples can be linked with administrative employment and income data: Survey respondents are asked to provide explicit consent to record linkage. Because the linked dataset contains social data, these weakly anonymized data are accessible only on site at the Research Data Center of the German Federal Employment Agency at the IAB (FDZ IAB). Researchers can access FDZ IAB data through a guest visit to the IAB or through remote data processing, which is also arranged with the IAB. The linked data will soon be available to external researchers. Requests for data access should be directed to FDZ IAB, since a contract with IAB for data use is required.

For more information, see the FDZ IAB website.

2. Weighting

  • In version v32 of the SOEP data, the new migrant subsample, M2, has been integrated into the SOEP weighting framework. As is our usual practice when a new sample is integrated into the SOEP, we make different weighting factors available for the first wave. The standard weights (bfhhrf/bfphrf) allow researchers to draw inferences about the underlying population of residents in Germany based on all SOEP samples. The variables bfhhrfam1/bfphrfam1 allow the same inferences, but using only data from the old Samples A to M1. Comparisons between the two sets of weights thus enable researchers to gauge the influence of the recent enlargement of the SOEP on population estimates. Weights specific to the recent enlargement M2, bfhhrfm2/bfphrfm2, allow researchers to draw inferences about the target population of immigrants to Germany between 2010 and 2013.
  • The adjustment of weights to census margins on the individual level has been updated since 1984 so that now the number of women and men in each age group (five-year categories) is given as the margin. Up to now, two separate margins were used for sex and age group.
  • Upon request, we now provide weighting factors for survey years 2010 to 2013 (waves BA to BD) excluding Samples L1 to L3. Due to differences in the survey instruments used with Samples L1 to L3 in the corresponding waves as part of the "Familien in Deutschland" (Families in Germany) survey, a need for weighting may arise when variables are to be analyzed that were not surveyed in the other samples.
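
The comparison between the two sets of person weights described above can be sketched as a pair of weighted means. This is a minimal illustration with invented toy rows, not real SOEP data; the outcome variable `y`, the `sample` key, and the assumption that M2 respondents carry a weight of 0 in bfphrfam1 are hypothetical.

```python
def weighted_mean(rows, value_key, weight_key):
    """Weighted mean of value_key, skipping rows with missing or non-positive weights."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        weight = row.get(weight_key)
        value = row.get(value_key)
        if weight is None or value is None or weight <= 0:
            continue
        numerator += weight * value
        denominator += weight
    if denominator == 0:
        raise ValueError("no rows with a positive weight")
    return numerator / denominator


# Invented toy rows; weights and outcomes are placeholders, not SOEP values.
rows = [
    {"sample": "A", "y": 1.0, "bfphrf": 900.0, "bfphrfam1": 1000.0},
    {"sample": "A", "y": 0.0, "bfphrf": 900.0, "bfphrfam1": 1000.0},
    {"sample": "M2", "y": 1.0, "bfphrf": 200.0, "bfphrfam1": 0.0},
]

estimate_all = weighted_mean(rows, "y", "bfphrf")        # all samples, including M2
estimate_old = weighted_mean(rows, "y", "bfphrfam1")     # Samples A to M1 only
print(estimate_all - estimate_old)  # influence of the M2 enlargement on the estimate
```

The difference between the two estimates gives a rough indication of how much the new subsample shifts a population estimate; it says nothing about sampling variance.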

3. Changed datasets or variables

  • MIGSPELL: With the integration of the data from 2013 (BD) to 2015 (BF), larger changes in the number and coding of the MIGSPELL variables were necessary, since in particular the status upon entry to Germany was surveyed in the individual waves with differing degrees of specificity. In addition, an improved procedure was introduced for imputation of missing data. A detailed description of the new version of MIGSPELL can be found in the SOEP 2015 documentation on Biography and Life History Data (coming soon).
  • Variables connected to occupations:
    - The variable names have changed and should now be more informative; the name of the coding scheme is now part of the variable name, e.g., isco88.
    - The occupational codes (KldB92, ISCO-88) now comply better with official standards (e.g., variables with suffixes _kldb92 or _isco88 in $P files).
    - In $PGEN there are now also variables using the coding schemes for KldB2010 and ISCO-08.
    - The code for generating the derived prestige scales has been redesigned, e.g., egp88_12 for egp class based on ISCO-88 in the year 2012.
  • BIOIMMIG: The variable biwfam ("Already Had Family In Country") was recoded incorrectly in the generated dataset for the migration samples in 2013 and 2014. This was corrected in the current data release.
  • Survey Year: With Version 32, variables referring to the survey year are referred to consistently as syear. Previously there were a few variables with names like erhebj and svyyear.
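
Scripts that read files from both earlier releases and v32 may need to map the older survey-year names to syear. The sketch below assumes that column names are available as a plain list of strings; the function name and the example column list are hypothetical, and only the two legacy names mentioned above are covered.

```python
# Legacy survey-year variable names mentioned in the release note.
LEGACY_YEAR_NAMES = ("erhebj", "svyyear")


def harmonize_year_column(columns):
    """Return the column names with legacy survey-year names replaced by 'syear'."""
    renamed = ["syear" if name.lower() in LEGACY_YEAR_NAMES else name for name in columns]
    if renamed.count("syear") > 1:
        # Two year columns in one file would be ambiguous, so fail loudly.
        raise ValueError("more than one survey-year column after renaming")
    return renamed


if __name__ == "__main__":
    # Hypothetical column list from a pre-v32 file.
    print(harmonize_year_column(["persnr", "erhebj", "income"]))
```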

4. New datasets or variables

  • BIOIMMIG: Additional variable for the main reason for migrating to Germany (only available since 2014).
  • PFLEGE: A new variable, appraisal, with the label “officially assessed as in need of care”.
  • $PEQUIV: Six new variables:
    - ichsu$$: Child support, caregiver alimony
    - fchsu$$: Imputation flag child support, caregiver alimony
    - ispou$$: Divorce alimony
    - fspou$$: Imputation flag divorce alimony
    - irie1$$: Riester pension plan
    - irie2$$: Riester widow pension plan
  • PPFAD: Person-related meta dataset
    - Some immigration variables (GERMBORN, CORIGIN, and IMMIYEAR) previously contained a -3 for all respondents in Sample G who were not asked to state their country of birth and year of immigration. Since respondents from other samples (e.g., A) were also not directly asked to provide this information and were coded -2, the coding of missing values was not consistent across samples. This inconsistency was corrected in the new update (v32).
    - Respondents who immigrated in the year 1949 (when the Federal Republic of Germany was founded) were previously considered not to have been born in Germany due to a coding error. This has been fixed in the updated version, and now, in accordance with the German Microcensus, all persons who immigrated before 1950 (after 1949) are considered to have been born in Germany. This also led to a change in the value label of IMMIYEAR.
    - More information was considered in the updated version of MIGINFO, leading to changes in the values.


Individual (PAPI) 2015: Field-de Var-de Var-en
Household 2015: Field-de Var-de Var-en
Biography (PAPI) 2015: Field-de Var-de Var-en
Youth (16-17 year-olds) 2015: Field-de Var-de Var-en
Pre-Teen (11-12 year-olds) 2015: Field-de
Mother and Child (Newborns) 2015: Field-de
Mother and Child (2-3-year-olds) 2015: Field-de
Mother and Child (5-6-year-olds) 2015: Field-de
Parents and Child (7-8-year-olds) 2015: Field-de
Mother and Child (9-10-year-olds) 2015: Field-de
Deceased Individual 2015: Field-de

All sample-specific questionnaires for this year and all questionnaires from previous years are available on this site.

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) Documentation on ISCED Generation Using the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2

3) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

4) The Request for Record Linkage in the IAB-SOEP Migration Sample

5) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

6) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

7) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

8) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

9) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

10) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

11) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

12) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

13) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

14) SOEP Scales Manual (updated for SOEP-Core v32.1)

15) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

18) Multi-Itemskalen im SOEP Jugendfragebogen

19) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

20) Documentation of ISCED Generation Based on the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2 and IAB-BAMF-SOEP Survey of Refugees M3/M4 until 2017

21) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

22) SOEP 2013 – Documentation of Generated Person-Level Long-Term Care Variables in PFLEGE

23) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

24) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

25) SOEP-Core v34: Codebook for the EU-SILC-Like Panel for Germany Based on the SOEP

26) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata: Case studies based on EU-SILC (2004) and SOEP (2002)

All documentation, with filter options, can be found on this page.