SOEP-Core v28 (data 1984-2011)

The German Socio-Economic Panel Study (SOEP) is a wide-ranging, representative longitudinal study of private households, located at the German Institute for Economic Research, DIW Berlin. Every year, nearly 11,000 households and more than 20,000 persons are sampled by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the Old and New German States, Foreigners, and recent Immigrants to Germany. The Panel was started in 1984. Topics include household composition, occupational biographies, employment, earnings, health and satisfaction indicators. As early as June 1990, even before the Economic, Social and Monetary Union, SOEP expanded to include the states of the former German Democratic Republic (GDR), thus seizing the rare opportunity to observe the transformation of an entire society. An immigrant sample was added in 1994/95 to account for the changes that had taken place in German society. Further new samples were added in 1998, 2000, 2002, 2006, 2009 and 2011. The survey is constantly being adapted and developed in response to current social developments. The international version contains 95% of all cases surveyed.

Dataset Information

Title: German Socio-Economic Panel Study (SOEP), data of the years 1984-2011

DOI: 10.5684/soep.v28
Collection period: 1984-2011
Publication date: Nov. 12, 2012
Principal investigators: Jürgen Schupp, Martin Kroh, Jan Goebel, Silke Anger, Marco Giesselmann, Markus Grabka, Peter Krause, Elisabeth Liebau, Henning Lohmann, David Richter, Christian Schmitt, Daniel Schnitzlein, Juliana Werneburg, Frauke Peter, Ingrid Tucci

Data collector: TNS Infratest Sozialforschung GmbH

Population: Persons living in private households in Germany

Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk.

Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. It also covers some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

 Data set information:

 Number of units: 74,137
 Number of variables: 47,933 in 360 data sets
 Data format: STATA, SPSS, SAS, CSV
 MD5 fingerprints of the data sets

Stata German | TXT, 16.16 KB
Stata English | TXT, 16.16 KB
Stata German+Engl. | TXT, 16.16 KB
SPSS German | TXT, 16.16 KB
SPSS English | TXT, 16.16 KB
SAS German | TXT, 17.91 KB
SAS English | TXT, 17.91 KB
CSV | TXT, 16.16 KB
PanelWhiz | TXT, 16.29 KB

 

Publications:

  • Jan Goebel, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, Jürgen Schupp. 2018. The German Socio-Economic Panel Study (SOEP). Jahrbücher für Nationalökonomie und Statistik / Journal of Economics and Statistics (online first), doi: 10.1515/jbnst-2018-0022
  • Gert G. Wagner, Jan Göbel, Peter Krause, Rainer Pischner, and Ingo Sieber (2008) Das Sozio-oekonomische Panel (SOEP): Multidisziplinäres Haushaltspanel und Kohortenstudie für Deutschland - Eine Einführung (für neue Datennutzer) mit einem Ausblick (für erfahrene Anwender), AStA Wirtschafts- und Sozialstatistisches Archiv 2, No. 4, 301-328
  • Schupp, Jürgen (2009): 25 Jahre Sozio-oekonomisches Panel - Ein Infrastrukturprojekt der empirischen Sozial- und Wirtschaftsforschung in Deutschland, Zeitschrift für Soziologie 38(5), pp. 350-357.

Publications using this file should refer to the above DOI and cite one of the following references:

  • Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik (Journal of Economics and Statistics) 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)
  • Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
  • Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)

SOEP v28 (Original Dataset)

SOEP v28i (International Version)

SOEP v28.1 (Bug Update Dec. 2012)

1. New additional missing codes

With the integration of sample J in 2011, the biographical questionnaire was moved from the second to the first wave and combined with the individual questionnaire in an integrated survey. As a result, the survey instrument differs slightly between the old samples A-H and the supplementary sample J.

The following additional missing codes have been introduced to the survey data to document these possible differences:

-4 "Inadmissible multiple response"
-5 "Not included in this version of the questionnaire"
-6 "Version of questionnaire with modified filtering"
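When analyzing the data, these codes usually need to be separated from valid answers. The following is a minimal sketch of one way to do that in plain Python; the helper name `split_missing` and the sample values are hypothetical, and only the three codes listed above are covered (other negative codes would pass through unchanged).

```python
# Labels copied from the list above; everything else is illustrative.
NEW_MISSING_CODES = {
    -4: "Inadmissible multiple response",
    -5: "Not included in this version of the questionnaire",
    -6: "Version of questionnaire with modified filtering",
}


def split_missing(value, codes=NEW_MISSING_CODES):
    """Return (valid_value, missing_label); exactly one of the two is None."""
    if value in codes:
        return None, codes[value]
    return value, None


# Hypothetical raw answers for one variable.
raw = [3, -5, 7, -4, -6]

# Keep only the answers that do not carry one of the new missing codes.
valid = [v for v, label in map(split_missing, raw) if label is None]
```

Keeping the label alongside the value makes it possible to tabulate why an answer is missing, for example to see how many cases stem from the questionnaire differences between samples A-H and J.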

2. Sample I now part of our new Innovation Sample

The SOEP Innovation Sample has been launched now and includes, inter alia, sample I. Sample I is therefore no longer part of the main survey as of 2011. See SOEP-IS on our website for further information about the Innovation Sample and the possibility of including your own questions.

3. New and renamed datasets

3.1 BIOCOUPLM
BIOCOUPLM provides spell data on partnership histories from the first to last personal interview of a respondent. Spells are measured on a monthly basis.

3.2 BIOCOUPLY
BIOCOUPLY provides spell data on partnership histories. It contains annual information on partnership status since the respondent’s year of birth, including available retrospective data and annually updated information.

3.3 BIOSIB (beta version)
The new file BIOSIB provides information on siblings living in the SOEP households. The dataset contains the person numbers of all siblings in an observed family. It includes information on their gender, their year of birth, and on the relationship between the observed siblings.
BIOSIB is included as a beta version in the current data release. Please do not hesitate to send both positive and negative feedback or suggestions to Daniel Schnitzlein (dschnitzlein@diw.de).

3.4 BIOEDU
The BIOEDU dataset contains details on educational transitions beginning with entry into childcare up to tertiary education in a consistently structured form.

3.5 BIOAGE long
In the new integrated bioage long dataset (BIOAGEL), data are presented in “long” format, i.e. this dataset will contain information from BIOAGE01, BIOAGE03, BIOAGE06, as well as BIOAGE08a and BIOAGE08b.

3.6 TRUST
Dataset on the Economic Behavior Experiment on Trust and Trustworthiness in the 2003, 2004, & 2005 SOEP Survey

This experiment to measure trust is based on the investment game introduced by Berg et al. (1995), a one-shot game for two players or movers who anonymously interact with each other. The first mover receives an endowment of 10 points and can transfer zero to ten points to the second mover. Every point that is transferred is doubled by the experimenters. The second mover is also given an endowment of ten points. After receiving points from the first mover, he/she decides on how much of the endowment to transfer back to the first mover (zero to ten points). As with the first mover's transfer, the back-transfer by the second mover is doubled by the experimenters. After the second mover's decision, the game ends and the subjects are paid their income in euros (one point equals one euro) by check sent a few days later.

A fundamental component of the game is that the participants actually receive money in accordance with the fixed payout function, i.e., all the decisions always have monetary consequences. This version of the game was developed by Fehr, Fischbacher, Schupp, von Rosenbladt & Wagner (2002).
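The rules described above imply a simple payoff calculation. The sketch below derives it from that description only (endowment of ten points each, transfers of zero to ten points, every transferred point doubled); it is not taken from the experiment's documentation, and the function and constant names are hypothetical.

```python
ENDOWMENT = 10   # points given to each mover
MULTIPLIER = 2   # every transferred point is doubled by the experimenters


def trust_game_payoffs(sent, returned):
    """Return (first_mover_points, second_mover_points).

    sent:     points the first mover transfers to the second mover
    returned: points the second mover transfers back from the endowment
    """
    for name, amount in (("sent", sent), ("returned", returned)):
        if not 0 <= amount <= ENDOWMENT:
            raise ValueError(f"{name} must be between 0 and {ENDOWMENT}")

    # Each mover keeps what was not transferred and receives
    # twice what the other mover transferred.
    first = ENDOWMENT - sent + MULTIPLIER * returned
    second = ENDOWMENT - returned + MULTIPLIER * sent
    return first, second
```

Under these assumptions, mutual full transfers leave both movers with 20 points, while a full transfer that is not reciprocated leaves the first mover with 0 and the second with 30. With one point equal to one euro, the points translate directly into the amount paid out.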

The combination of representative survey and behavioral experiment was used in the SOEP main surveys in 2003, 2004, and 2005, with only minor modifications. Of the 1,432 original participants in 2003, 1,202 also took part in the experiment in 2004 and 2005.

The data are available in long format in the "TRUST" dataset. Consequently, this dataset contains information from each of the three waves in which the behavioral experiment was conducted.

3.7 TIMEPREF
Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

In this experiment on economic behavior, respondents were asked to decide how they would like to receive €200 in prize money: if they would rather receive it immediately by check, or if they would prefer to wait and receive a larger amount later—that is, with interest. By splitting the sample (N = 1,503 persons) into random subsamples (splits), it was possible to vary both the time horizon and the implied interest rate to test possible incentive effects on the choice between a low payoff in the short term and a high payoff in the long term. The scientific director of the project was Prof. Dr. Armin Falk, CENs, University of Bonn.

4. New or revised variables

4.1 $HBRUTTO dataset

REGTYP$$:
The $HBRUTTO dataset will include a new variable to distinguish between urban, suburban and rural regions. This is based on the spatial categories of counties (as of December 31, 2009) used by the Federal Institute for Research on Building, Urban Affairs and Spatial Development (BBSR). The following spatial structure characteristics are used to define the categories:

  • Share of county’s population in large or medium-sized cities
  • Population density of the county
  • Population density of the county without taking large or medium-sized cities into consideration

Thus, three categories can be defined:

  1. Urban regions (Cities with at least 100,000 inhabitants and counties with at least 50% of the population living in large or medium-sized cities and with a population density of at least 150 inhabitants/km²; and counties with a population density not including large or medium-sized cities of at least 150 inhabitants/km²)
  2. Regions undergoing urbanization (Counties with at least 50% of the population living in large or medium-sized cities but a population density of below 150 inhabitants/km², and counties with less than 50% of the population living in large or medium-sized cities, and with a population density (excluding large or medium-sized cities) of at least 100 inhabitants/km²)
  3. Rural regions (counties with less than 50% of the population living in large or medium-sized cities and population density (excluding large or medium-sized cities) of below 100 inhabitants/km²).
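The three categories above can be read as a decision rule. The following sketch applies the criteria in the order they are listed; it is an illustration rather than the BBSR's or SOEP's actual procedure. The function name, argument names, and the returned labels are hypothetical, and the numeric codes stored in REGTYP$$ are not reproduced here.

```python
def classify_region(city_share, density, density_excl_cities,
                    is_city_100k=False):
    """Classify a county as 'urban', 'urbanizing', or 'rural'.

    city_share:          share (0..1) of the county's population living
                         in large or medium-sized cities
    density:             inhabitants per km² in the county
    density_excl_cities: inhabitants per km² without large or
                         medium-sized cities
    is_city_100k:        True for a city with at least 100,000 inhabitants
    """
    # Category 1: urban regions
    if is_city_100k:
        return "urban"
    if city_share >= 0.5 and density >= 150:
        return "urban"
    if density_excl_cities >= 150:
        return "urban"

    # Category 2: regions undergoing urbanization
    if city_share >= 0.5:              # density is below 150 here
        return "urbanizing"
    if density_excl_cities >= 100:     # city_share is below 0.5 here
        return "urbanizing"

    # Category 3: rural regions
    return "rural"
```

Checking the urban criteria first is an assumption that resolves the one case the wording leaves open: a county that meets the 150 inhabitants/km² threshold outside its cities is treated as urban regardless of its other characteristics.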

 

4.2 $PGEN dataset

BILZTCH$$ / BILZTEV$$:
BILZTCH$$ indicates whether the respondent's answers imply a decrease in years of education or training ($BILZEIT) since the last observation, or an increase since the last year that is inconsistent with the additional information on recently completed education or training.
BILZTEV$$ is a flag variable that indicates whether the respondent showed an inconsistent change in $BILZEIT, either upwards or downwards, over the entire observation period.

$VEBZEIT and $UEBSTD

To be consistent with the FID dataset, the missing values of the variables $VEBZEIT and $UEBSTD were slightly recoded, as the missing value –2 is now assigned to self-employed individuals. In previous waves, self-employed persons had the missing value –3 (implausible answer).

For $UEBSTD, the value –3 (implausible answer) is assigned to all individuals who report more than ten hours of weekly overtime and who also have either an agreed working time of more than 80 hours a week ($VEBZEIT is implausible, value –3) or an actual working time of more than 80 hours a week ($TATZEIT is implausible, value –3).
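The recoding of $UEBSTD described above can be sketched as a small function. This is an illustration under stated assumptions, not the generation code used by the SOEP team: it assumes the inputs are the raw reported weekly hours, that the self-employed check takes precedence, and all names are hypothetical.

```python
SELF_EMPLOYED_CODE = -2   # now assigned to self-employed individuals
IMPLAUSIBLE_CODE = -3     # implausible answer
MAX_WEEKLY_HOURS = 80     # threshold for agreed and actual working time
MAX_OVERTIME = 10         # weekly overtime above this is checked


def recode_overtime(overtime, agreed_hours, actual_hours,
                    self_employed=False):
    """Return the recoded weekly overtime value for one respondent."""
    if self_employed:
        return SELF_EMPLOYED_CODE

    # More than ten hours of overtime is only flagged when the agreed
    # or the actual weekly working time is itself above 80 hours.
    hours_implausible = (agreed_hours > MAX_WEEKLY_HOURS
                         or actual_hours > MAX_WEEKLY_HOURS)
    if overtime > MAX_OVERTIME and hours_implausible:
        return IMPLAUSIBLE_CODE

    return overtime
```

Note that high overtime alone does not trigger the implausible code in this reading; both conditions have to hold.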

4.3 BIOPAREN dataset

Seven new variables have been added to BIOPAREN:
VAORT11 and MAORT11 indicate the father's and mother's current place of residence.
GESCHW, GESCHWUP, NUMS, NUMB and TWIN provide information on siblings. The variable GESCHW indicates whether the respondent had ever had any siblings at the time of the interview. GESCHWUP gives the year in which the sibling information was collected. NUMB and NUMS provide the number of brothers and sisters the respondent reports, and TWIN indicates whether any of these are twin siblings of the respondent (and of which type).

1984-2011 (Wave BB)

Dec. 19, 2012

BIOCOUPLM, BIOCOUPLY, BIOMARSM, BIOMARSY
In some cases, reports of a past divorce were not taken into account in the data generation process. In addition, the reported year of death of a former partner was in some cases overwritten by the respective current year of the interview. This affected not only the start and end date of some spells but also missing information and validation checks.

$FAMSTD
The mistakenly overwritten information in the generation of BIOCOUPL$ affected validation checks. A majority of the formerly missing information is now available. However, the number of implausible answers has also risen in the process.

An update for all corrected files can be downloaded by means of a personalized link. Please contact soepmail@diw.de to obtain your link. 

Please note: If you use one of the provided bugfixes in your analyses we recommend citing it as follows:
English:
Socio-Economic Panel (SOEP), data for years 1984-2011, version 28.1, SOEP, 2012.
German:
Sozio-oekonomisches Panel (SOEP), Daten für die Jahre 1984-2011, Version 28.1, SOEP, 2012.
Short Version:
SOEP v28.1 


Survey Instruments 2011: Field-de

Please find all sample specific questionnaires of this year and all questionnaires of previous years on this site

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

3) The Request for Record Linkage in the IAB-SOEP Migration Sample

4) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

8) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

13) SOEP Scales Manual (updated for SOEP-Core v32.1)

14) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Multi-Itemskalen im SOEP Jugendfragebogen

18) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

19) Documentation of ISCED Generation Based on the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2 and IAB-BAMF-SOEP Survey of Refugees M3/M4 until 2017

20) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

21) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

22) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

All documentation for filtering can be found on this page
