SOEP-Core v30 (data 1984-2013)

The German Socio-Economic Panel (SOEP) study is a wide-ranging, representative longitudinal study of private households, located at the German Institute for Economic Research, DIW Berlin. The panel was started in 1984. Every year, nearly 11,000 households and more than 20,000 persons are surveyed by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the old and new German states, foreigners, and recent immigrants to Germany. Topics include household composition, occupational biographies, employment, earnings, health, and satisfaction indicators. As early as June 1990, even before the Economic, Social and Monetary Union, SOEP expanded to include the states of the former German Democratic Republic (GDR), thus seizing the rare opportunity to observe the transformation of an entire society. In 1994/95, an immigrant sample was added to account for the changes that had taken place in German society. Further new samples were added in 1998, 2000, 2002, 2006, 2009, 2011, 2012, and 2013. The survey is constantly being adapted and developed in response to current social developments. The international version contains 95% of all cases surveyed.

Dataset Information

Title: German Socio-Economic Panel (SOEP), data of the years 1984-2013

DOI: 10.5684/soep.v30
Collection period: 1984-2013
Publication date: May 11, 2015
Principal investigators: Jürgen Schupp, Martin Kroh, Jan Goebel, Carsten Schröder, Elisabeth Bügelmayer, Marco Giesselmann, Markus Grabka, Peter Krause, Simon Kühne, Elisabeth Liebau, David Richter, Rainer Siegers, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Ingrid Tucci, Knut Wenzig

Data collector: TNS Infratest Sozialforschung GmbH.

Population: Persons living in private households in Germany.

Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk.

Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also covers some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

 Data set information:

Number of units: 87,095
Number of variables: 52,997 in 389 datasets
Data format: STATA, SPSS, SAS, CSV
MD5 fingerprints of the data sets

Stata English and German | TXT, 17.09 KB
Stata English | TXT, 17.1 KB
Stata German | TXT, 17.1 KB
SPSS English | TXT, 17.1 KB
SPSS German | TXT, 17.1 KB
SAS English | TXT, 19 KB
SAS German | TXT, 19 KB
CSV | TXT, 17.1 KB
GGKBOU | TXT, 140 Byte

Teaching versions:
Stata English (50% version) | TXT, 17.1 KB
Stata German (50% version) | TXT, 17.1 KB
SPSS English (50% version) | TXT, 17.1 KB
SPSS German (50% version) | TXT, 17.1 KB
SAS English (50% version) | TXT, 18.95 KB
SAS German (50% version) | TXT, 18.95 KB
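
The fingerprint files listed above can be used to check that a downloaded data file arrived intact. The following is a minimal sketch in Python, assuming each fingerprint is a standard hexadecimal MD5 digest; the function names are illustrative, and the layout of the fingerprint TXT files is not described here, so parsing them is left out.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path, expected_hex):
    """Compare a file's MD5 digest with a published fingerprint.

    The comparison ignores case and surrounding whitespace, because
    published digests are sometimes written in upper case.
    """
    return md5_of_file(path) == expected_hex.strip().lower()
```

A call such as `matches_fingerprint("some_file.dta", digest_from_txt)` would return `True` only if the local file is byte-identical to the published one; both arguments here are placeholders.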

Publications:

  • Jan Goebel, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, Jürgen Schupp. 2018. The German Socio-Economic Panel Study (SOEP). Jahrbücher für Nationalökonomie und Statistik / Journal of Economics and Statistics (online first), doi: 10.1515/jbnst-2018-0022
  • Schupp, Jürgen (2009): 25 Jahre Sozio-oekonomisches Panel - Ein Infrastrukturprojekt der empirischen Sozial- und Wirtschaftsforschung in Deutschland, Zeitschrift für Soziologie 38 (5),  350-357.
  • Gert G. Wagner, Jan Göbel, Peter Krause, Rainer Pischner, and Ingo Sieber (2008) Das Sozio-oekonomische Panel (SOEP): Multidisziplinäres Haushaltspanel und Kohortenstudie für Deutschland - Eine Einführung (für neue Datennutzer) mit einem Ausblick (für erfahrene Anwender), AStA Wirtschafts- und Sozialstatistisches Archiv 2 (4), 301-328.

Publications using this file should refer to the above DOI and cite one of the following references:

  • Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik (Journal of Economics and Statistics) 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)
  • Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
  • Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)

SOEP v30 (Original data set)

SOEP v30i (International Scientific Use File, 95%)

SOEP v30beta

SOEP v30ibeta (International Scientific Use File, 95%)

The new data distribution (1984–2013) “SOEP v30” provides, for the most recent survey year 2013, the usual wave-specific data files BDPBRUTTO, BDP, BDPKAL, BDPGEN, BDPAGE17, BDHBRUTTO, BDH, BDHGEN, BDKIND, and BCPLUECKE as well as the updated files with a longitudinal component (PFAD files, biography files, spell data, and weighting factors). Additional new samples, datasets, or variables are listed below:

1. Cross-Sectional Weights 2013

With the figures now available from the official statistical agencies, we are able to provide the finalized weighting variables in this version of the data (doi:10.5684/soep.v30). As is always the case in years with refresher and enlargement samples, we provide weights for the old and new samples, both separately and together. These different sets of weights are designed to make it easier for users to study how the integration of a new sample affects the analysis of specific research topics.

Please also note that the government census carried out in 2011 replaced the projected population figures, which had been regularly updated on the basis of the last census in 1987, with current population figures from the Federal Statistical Office. This means that the post-stratification of SOEP weights for wave BD in data release v30 is based on a version of the 2013 Microcensus that takes the 2011 census into account for the first time. It is therefore possible that changes in weighted analyses of the SOEP between 2012 (BC) and 2013 (BD) are the result of the government statistics switching over to the more recent census. The correction is evident in the fact that the estimated total number of individuals living in private households in Germany fell from 81 million in 2012 to less than 80 million in 2013.
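
To illustrate why a change in the post-stratification targets shifts weighted results, the sketch below shows how cross-sectional weights are typically used: the sum of the weights estimates the population total, and a weighted mean estimates a population share. The function names and the toy numbers are illustrative assumptions, not SOEP variable names or SOEP values.

```python
def estimated_population(weights):
    """Estimate the population total as the sum of cross-sectional weights.

    Each weight is read as the number of people in the population
    that one respondent represents.
    """
    return sum(weights)


def weighted_mean(values, weights):
    """Weighted mean of a variable, ignoring respondents with zero weight."""
    total = sum(w for w in weights if w > 0)
    if total == 0:
        raise ValueError("no positive weights")
    return sum(v * w for v, w in zip(values, weights) if w > 0) / total


# Toy data (hypothetical): three respondents and an indicator variable.
toy_weights = [1000.0, 2500.0, 1500.0]
toy_indicator = [1, 0, 1]

population = estimated_population(toy_weights)   # 5000.0
share = weighted_mean(toy_indicator, toy_weights)  # 0.5
```

If the weights of one wave are calibrated to a smaller population total than those of the previous wave, estimated totals fall even when the respondents' answers are unchanged, which is the effect described above for 2012 and 2013.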

Given the retrospective revision of the 2011 and 2012 Microcensus data to account for the census results, our next data release (soep.v31) will include retrospectively revised weighting variables for the 2011 and 2012 survey data.

If you have any comments on the weighting variables, we would be happy to hear from you (mkroh@diw.de).

2. New IAB-SOEP Migration Sample (Sample M)

The new IAB-SOEP Migration Sample (Sample M) is a joint project with the Institute for Employment Research (IAB). It is provided both as part of the normal SOEP distribution (see, for example, the variable psample in the dataset ppfad) and as a separate study including only Sample M households (doi:10.5684/soep.iab-soep-mig.2013).

The new sample takes into account changes in the structure of migration to Germany since 1995. It covers not only direct immigration but also the “second generation,” the children of immigrants. The new sample opens up new perspectives for migration research and provides insights into the lives of new immigrants to Germany. The new sample has the following key features:

  1. The IAB-SOEP Migration Sample substantially increases the sample size for research on migration and the lives of immigrants in Germany: 4,964 persons residing in 2,723 households participated in the first wave of the survey. Moreover, since the survey is included in the regular SOEP as subsample “M”, including migrants from the other SOEP samples in analyses may increase the number of observations further.
  2. The questionnaire used with the new migration sample covers respondents’ entire migration biography. Migration episodes to countries other than Germany are covered as well. This is an important extension over previous SOEP surveys of immigrants’ personal biographies. For the first time, we can now track whether important events in individual biographies occurred in the respondent’s home country, in Germany, or in other destination countries. This also takes into account that migration is no longer a one-time event that lasts for a lifetime but that individual biographies are becoming increasingly “transnational,” often with several migration episodes taking place during an individual’s lifetime and involving personal ties in different countries. We created a user-friendly spell data set, called MIGSPELL, for working with these data.
  3. Following recent advances in the research on migration and immigration, the IAB-SOEP Migration Sample considers numerous new sets of questions that were not previously considered in the SOEP or other household surveys in Germany, at least not in the necessary depth. Examples of such question blocks are: earnings and the labor force and occupational status before migration; migration decisions in the family and partnership context; and the purposes and channels of transferring remittances.

3. New datasets / variables

3.1. MIGSPELL

For the comprehensively surveyed migration biography, we have created a user-friendly spell data set. Detailed documentation will be available in the biographical data documentation of the SOEP.
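
As a rough illustration of what a spell data set makes easy, the sketch below stores one row per migration episode and sums the years each person spent in each country. The record layout (person identifier, country, begin year, end year) and all values are hypothetical; the actual MIGSPELL variable names and coding are given in the SOEP biographical documentation.

```python
# Hypothetical spell records: (person_id, country, begin_year, end_year).
# These are invented examples, not MIGSPELL data or MIGSPELL column names.
spells = [
    (1, "Turkey", 1970, 1992),
    (1, "Netherlands", 1992, 1995),
    (1, "Germany", 1995, 2013),
    (2, "Poland", 1980, 2004),
    (2, "Germany", 2004, 2013),
]


def years_by_country(spell_rows):
    """Sum the duration of all spells per person and country."""
    totals = {}
    for person_id, country, begin, end in spell_rows:
        if end < begin:
            raise ValueError(f"spell ends before it begins: {person_id}, {country}")
        per_person = totals.setdefault(person_id, {})
        per_person[country] = per_person.get(country, 0) + (end - begin)
    return totals


def number_of_moves(spell_rows, person_id):
    """Count country changes: one fewer than the person's number of spells."""
    count = sum(1 for row in spell_rows if row[0] == person_id)
    return max(count - 1, 0)
```

With one row per episode, questions such as "how long did a respondent live in a third country before arriving in Germany" reduce to simple sums over rows.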

3.2. BDP_MIG

The original data from the Sample M specific survey instrument are included in the dataset BDP_MIG, which combines the individual and the biographical questionnaire. The variables are also included in the other standard or generated datasets:
  • Variables equivalent to variables in the individual questionnaire of other samples are included in the dataset BDP
  • Variables equivalent to variables in the biography questionnaire of other samples are included in the respective biography dataset (e.g. BIOMARSM)
  • The comprehensively surveyed migration biography can be found in the new dataset MIGSPELL.

3.3. JOBEND$$

Since a number of changes occurred in the categories for reasons for job dismissal, a new longitudinally consistent variable (JOBEND$$) is now offered in the $PGEN data sets.
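
Names such as $PGEN and JOBEND$$ contain placeholders for the survey wave. The sketch below expands them under two assumptions that should be checked against the SOEP documentation: that `$` stands for a wave prefix counted in letters from A for 1984, continuing with BA after Z (which agrees with BC for 2012 and BD for 2013 as used in this text), and that `$$` stands for the two-digit survey year. The helper names are illustrative.

```python
import string

FIRST_YEAR = 1984  # start of the panel, taken from the text


def wave_prefix(year):
    """Return the assumed wave prefix for a survey year (A = 1984, BA = 2010)."""
    offset = year - FIRST_YEAR
    letters = string.ascii_uppercase
    if offset < 0:
        raise ValueError("year precedes the start of the panel")
    if offset < 26:
        return letters[offset]
    if offset < 52:
        return "B" + letters[offset - 26]
    raise ValueError("prefix scheme beyond BZ is not covered by this sketch")


def expand_name(template, year):
    """Replace '$$' with the two-digit year, then '$' with the wave prefix."""
    two_digit_year = f"{year % 100:02d}"
    return template.replace("$$", two_digit_year).replace("$", wave_prefix(year))
```

Under these assumptions, `expand_name("$PGEN", 2013)` yields `BDPGEN`, matching the wave-specific file name listed for 2013, and `expand_name("JOBEND$$", 2013)` yields `JOBEND13`.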

3.4. New additional occupation codes

The data on occupations in the individual questionnaire are now additionally coded using KldB2010 and partly also ISCO-08. The following variables are included in the dataset BDP:

Varname | Variable Label
bdp38_kldb2010 | Current Occupational Classification (KldB2010)
bdp38_isco08 | Current Occupational Classification (ISCO-08)
bdp81_kldb2010 | Current Occupational Classification Secondary Employment (KldB2010)
bdp81_isco08 | Current Occupational Classification Secondary Employment (ISCO-08)
bdp9005_trainkldb2010 | Vocational Training / Education Degree Prev. Yr. (KldB2010)

However, variables of derived scales (e.g. prestige scores in $PGEN) are still based on ISCO-88.

3.5. Grip strength data for 2012

GRIPSTR update: The data on grip strength from the survey year 2012 is now included in the GRIPSTR dataset.

3.6. Wealth data for 2012

PWEALTH and HWEALTH updated: In the year 2012, all individuals aged 17 and over were again surveyed on wealth, just as they were in 2002 and 2007. These “raw” data were already part of the standard data distribution for Wave 29 and will be included in the upcoming data distribution in a file containing the data for 2002, 2007, and 2012 in “long format”—the file PWEALTH for individual data, HWEALTH with data aggregated according to household context. Values that are missing due to item or partial unit non-response (e.g., missing interviews with individual household members in interviewed households) will be subjected to multiple imputations in complex procedures taking longitudinal information into account.

3.7. BIOEDU now part of the regular data distribution

After it became impossible to update the beta version of this data set in version 29, the data have now been updated and incorporated into the regular data distribution. The information from the new IAB-SOEP Migration Sample was also integrated.

3.8. INTERVIEWER dataset

The dataset comprises demographic and employment information about interviewers, aggregated data on the interviewers’ fieldwork in each wave, as well as personal details that they provided in the two interviewer surveys of 2006 and 2012. In the process of creating the INTERVIEWER dataset, all interviewer indicators (INTID) in all of the SOEP datasets were checked thoroughly and in some cases revised.

4. Revisions and Bug fixes

4.1. Corrections in BILZTCH$$ and BILZTEV$$

Until now, the variables BILZTCH$$ and BILZTEV$$ lacked information on a number of waves. As a result, false values were ascribed to the variables in a number of cases: a total of 638 cases previously classified as consistent proved to be inconsistent increases in educational level, and 2,582 cases previously classified as inconsistent proved to be consistent.

4.2. Corrections in DUEBSTD

In addition to the generation of overtime work for 1984 and 1985, overtime work has now been generated for 1987 as well. For these years, overtime hours result from the difference between contractually agreed working hours and the number of hours actually worked per week.
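
The difference described above can be sketched as follows. The function and argument names are illustrative. Two points are assumptions of this sketch rather than statements about how DUEBSTD is generated: negative inputs are treated as missing codes, and cases where fewer hours were worked than agreed are set to zero overtime.

```python
def overtime_hours(agreed_hours, actual_hours):
    """Weekly overtime as actual minus contractually agreed working hours.

    Returns None when either input is missing (None or a negative code).
    Flooring the result at zero is an assumption of this sketch.
    """
    if agreed_hours is None or actual_hours is None:
        return None
    if agreed_hours < 0 or actual_hours < 0:
        return None
    return max(0.0, actual_hours - agreed_hours)
```

For example, a respondent with 38.5 agreed hours and 42 actual hours per week would be assigned 3.5 overtime hours.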

4.3. Revisions of marital and relationship status

$FAMSTD: As a result of a new process for generating BIOMARSM/Y and BIOCOUPLM/Y, two changes occurred in $FAMSTD. First, since 2010 the question on marital status has included the categories “registered same-sex partnership, living together” and “registered same-sex partnership, not living together”. These two categories are also included in $FAMSTD as values “7” and “8”. Second, all spells of BIOMARSM/Y in the category “widowed or divorced” have been set to “not valid” in $FAMSTD. These changes were also applied to previous waves. The variable $FAMSTD is set to -3 if information is implausible, to -5 if persons were not interviewed, and to -1 if persons did not answer the question.
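
The codes named above can be handled explicitly before analysis. The sketch below covers only the values mentioned in this section (-1, -3, -5, 7, and 8); all other values are passed through unlabeled because their meaning is not given here. The function name is illustrative.

```python
# Codes taken from the description of $FAMSTD above.
MISSING_REASONS = {
    -1: "no answer",
    -3: "implausible",
    -5: "not interviewed",
}
SAME_SEX_PARTNERSHIP = {
    7: "registered same-sex partnership, living together",
    8: "registered same-sex partnership, not living together",
}


def describe_famstd(value):
    """Classify a $FAMSTD value as ('missing', reason), ('valid', label),
    or ('other', None) for codes not described in this section."""
    if value in MISSING_REASONS:
        return ("missing", MISSING_REASONS[value])
    if value in SAME_SEX_PARTNERSHIP:
        return ("valid", SAME_SEX_PARTNERSHIP[value])
    return ("other", None)
```

Separating the three missing codes keeps item non-response (-1) distinct from persons who were not interviewed at all (-5), which matters when computing response rates.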

BIOCOUPLM/Y: In generating BIOCOUPLM, the current relationship status and reported changes in the family situation are taken into account. Although the questionnaire asks for such events on a monthly basis, numerous changes in relationship status are not reported as events. In the new version of BIOCOUPLM, we have therefore included a censor variable called “events”, which indicates whether the exact month of an event is known or whether the beginning or end of a spell reflects the month of the interview due to the lack of reported events. Finally, a new category “added spell” has been introduced into the variable remark, which lets you distinguish between spells that have been edited (value 2) and spells that have been added (value 3). For further information, please see the new documentation on BIOMARSM/Y. The variable SPELLTYP is set to -3 if information is implausible.


BIOMARSM/Y: Because BIOMARSM is derived from the new version of BIOCOUPLM, we have copied the category “married, separated” from BIOCOUPLM. It reflects the time between a reported separation and divorce or the death of the spouse. Most of these spells of BIOCOUPLM were set to “married” in BIOMARSM, but for those spells without a reported end, event spells were set to “married, separated” and the end of the spells to missing. Parallel spells from the category “divorced or widowed” were added, whereas the outset of those spells was set to missing. Finally, a new category “added spell” has been introduced into the variable remark, which lets you distinguish between spells that have been edited (value 2) and spells that have been added (value 3). For further information, please see the new documentation on BIOCOUPLM/Y. The variable SPELLTYP is set to -3 if information is implausible.
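
The remark values described above make it possible to check how much of a spell file was edited or added during data preparation. A minimal sketch, assuming each spell is available as a record with a `remark` field (the record layout and the example rows are hypothetical); values other than 2 and 3 are grouped as "other" because their meaning is not given in this section.

```python
from collections import Counter

REMARK_EDITED = 2  # value given in the text for edited spells
REMARK_ADDED = 3   # value given in the text for added spells


def spell_origin(remark):
    """Map a remark value to 'edited', 'added', or 'other'."""
    if remark == REMARK_EDITED:
        return "edited"
    if remark == REMARK_ADDED:
        return "added"
    return "other"


def count_origins(spell_rows):
    """Count spells by origin, reading the 'remark' field of each record."""
    return Counter(spell_origin(row["remark"]) for row in spell_rows)


# Hypothetical records for illustration only.
example_spells = [{"remark": 1}, {"remark": 2}, {"remark": 3}, {"remark": 3}]
origins = count_origins(example_spells)
```

A sensitivity check could then rerun an analysis without the added spells and compare the results.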

4.4. $regtyp: Conversion to urban / rural area

The new typology of the German BBSR describes settlement structure and allows categorization into four types of regions. However, using these four categories would make it possible to identify specific administrative districts (Landkreise) in the federal states of Saxony, Mecklenburg-Western Pomerania, and Baden-Württemberg. We must therefore use a condensed two-category classification: urban and rural areas.


Individual (PAPI) 2013: Field-de
Household (PAPI) 2013: Field-de
Biography (PAPI) 2013: Field-de
Catch-up Individual 2013: Field-de
Youth (16-17 year-olds) 2013: Field-de
Mother and Child (Newborns) 2013: Field-de
Mother and Child (2-3-year-olds) 2013: Field-de
Mother and Child (5-6-year-olds) 2013: Field-de
Parents and Child (7-8-year-olds) 2013: Field-de
Mother and Child (9-10-year-olds) 2013: Field-de
Deceased Individual 2013: Field-de
Your Life abroad 2013: Field-de

Please find all sample specific questionnaires of this year and all questionnaires of previous years on this site

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

3) The Request for Record Linkage in the IAB-SOEP Migration Sample

4) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

8) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

13) SOEP Scales Manual (updated for SOEP-Core v32.1)

14) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Multi-Itemskalen im SOEP Jugendfragebogen

18) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

19) Documentation of ISCED Generation Based on the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2 and IAB-BAMF-SOEP Survey of Refugees M3/M4 until 2017

20) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

21) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

22) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

All documentation for filtering can be found on this page
