SOEP-Core v35 (data 1984-2018)

The Socio-Economic Panel (SOEP) is a representative, multi-cohort survey that has been running since 1984. Every year, individuals in households throughout Germany are surveyed by our survey institute on behalf of DIW Berlin. These respondents provide information on topics such as their income, employment history, education, and health. Because the same people are surveyed every year, it is possible to track long-term psychological, economic, societal, and social developments. To keep pace with changes in society, random samples are added regularly and the survey is adapted accordingly. The international version of the data contains 95% of the total sample (see “Other editions”, soep-core.v35i).

Dataset Information

Title: Socio-Economic Panel (SOEP), data from 1984-2018

DOI: 10.5684/soep-core.v35
Collection period: 1984-2018
Publication date: 01.11.2019
Principal investigators: Stefan Liebig, Jan Goebel, Martin Kroh, Carsten Schröder, Markus Grabka, Jürgen Schupp, Charlotte Bartels, Alexandra Fedorets, Andreas Franken, Jannes Jacobsen, Selin Kara, Peter Krause, Hannes Kröger, Maria Metzing, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig, Stefan Zimmermann

Contributor: Kantar Deutschland GmbH (Data Collector)

Population: Persons living in private households in Germany

Number of households: 18,682

Number of individuals: 31,997 + 3,971 children

Special samples: Citizens of the GDR (1990), Immigration/Migration (1994/95, 2013, 2015), Refugees (since 2016). See the chapter SOEP-Samples in Detail on the SOEPcompanion for a description of all our samples.

Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk.

Collection Mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

Citation of the Data Set: Socio-Economic Panel (SOEP), data for years 1984-2018, version 35, SOEP, 2019, doi:10.5684/soep-core.v35.

SOEP-Core - Reference Articles

Publications using this file should refer to the above DOI and cite the following references:

Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik (Journal of Economics and Statistics) 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)

If you do not exclude the cases of the migration samples in your analysis, then please also cite the following reference:

Herbert Brücker, Martin Kroh, Simone Bartsch, Jan Goebel, Simon Kühne, Elisabeth Liebau, Parvati Trübswetter, Ingrid Tucci & Jürgen Schupp. 2014. The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents. SOEP Survey Paper 216, Series C. Berlin, Nürnberg: DIW Berlin.

If you do not exclude the cases of the refugee samples in your analysis, please also cite:

Herbert Brücker, Yuliya Kosyakova, Nina Rother, Sabine Zinn, Elisabeth Liebau, Wenke Gider, Silvia Schwanhäuser, & Manuel Siegert. 2025. Exploring Integration and Migration Dynamics: The Research Potentials of a Large-Scale Longitudinal Household Study of Refugees in Germany. European Sociological Review. https://doi.org/10.1093/esr/jcaf032.

If you use data from the SOEP-LEE2 surveys, please also cite:

Wenzel Matiaske, Torben Dall Schmidt, Christoph Halbmeier, Martina Maas, Doris Holtmann, Carsten Schröder, Tamara Böhm, Stefan Liebig, and Alexander S. Kritikos. 2023. SOEP-LEE2: Linking Surveys on Employees to Employers in Germany. Jahrbücher für Nationalökonomie und Statistik Data Observer, 1–14. https://doi.org/10.1515/jbnst-2023-0031.

If you would like to refer to the data more specifically, please also cite:

Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)
Jacobsen, Jannes, Magdalena Krieger, Felicitas Schikora, and Jürgen Schupp. 2021. Growing Potentials for Migration Research using the German Socio-Economic Panel Study. Jahrbücher für Nationalökonomie und Statistik 241 (4), 527-549. (https://doi.org/10.1515/jbnst-2021-0001)
Fedorets, Alexandra, Stefan Kirchner, Jule Adriaans, and Oliver Giering. 2022. Data on Digital Transformation in the German Socio-Economic Panel. Jahrbücher für Nationalökonomie und Statistik 242 (5-6), 691-705. (https://doi.org/10.1515/jbnst-2021-0056)

SOEP-Core v35 (2018) - Other Editions

For the SOEP-Core data 1984-2018 (v35) - Wave A to BI - we provide the following versions:

soep-core.v35

soep-core.v35i (International Scientific Use Version, 95%)

soep-core.v35t (Teaching version)

These datasets are included in SOEP v35 but are also available as individual datasets upon request:

soep.ddr18 (Living in the GDR)

soep.iab-soep-mig.2018 (Migration samples)

soep.iab-bamf-soep-mig.2018 (Refugee samples)

SOEP-Core v35 (2018) - Changes in the Dataset

SOEP-Core soep-core.v35

1. New sample in the main SOEP study

The new refresher sample, Subsample O, contains 1,000 new households. These were selected in cooperation with BBSR using a new sampling design based on regional data in areas where the “Soziale Stadt” (social city) urban development project is being carried out. Based on the digital data available on the boundaries of the “Soziale Stadt” areas, it was possible to create a new variable going back to the year 2000 that shows whether or not a household’s address is within an area covered by the project (see Variable Description below under 4.4).

2. Modifications in our new main data format, SOEPlong

We have made the following important changes over and above our normal annual updates:

PKAL: Integration of the $PKALOST datasets
PL/PKAL: Calendar strings are now all stored in PL, and monthly variables in PKAL
PLUECKEL: Introduction of RYEAR and correction of SYEAR, which was RYEAR up to now
PBRUTTO: If a variable was not part of the year-specific gross file, the missing code has been changed to -8 and is no longer -2.
VPL: The case numbers for past years have increased since cases without a SOEP respondent are no longer deleted
KIDLONG: The harmonization concept has been adapted to the concept used with other datasets; more variables from $KIND datasets have been included (more information under 4)
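The PBRUTTO change in the list above means that scripts which only checked for -2 may now miss the -8 cases. The following is a minimal Python sketch on toy records, not SOEP tooling: the helper `to_missing` and the variable name `example_var` are hypothetical, and treating every negative code as missing is an assumption about what the analysis needs.

```python
from typing import Optional

# Codes named in the change note above; other negative codes may exist.
NOT_IN_GROSS_FILE = -8   # used from v35 on
DOES_NOT_APPLY = -2      # used for this case before v35


def to_missing(value: int) -> Optional[int]:
    """Return None for any negative code, otherwise the value unchanged."""
    return None if value < 0 else value


# Toy rows; "example_var" is a hypothetical variable name.
rows = [
    {"pid": 101, "syear": 2017, "example_var": 3},
    {"pid": 102, "syear": 2017, "example_var": NOT_IN_GROSS_FILE},
    {"pid": 103, "syear": 2017, "example_var": DOES_NOT_APPLY},
]

# Recode a copy of each row so the original codes stay available.
cleaned = [{**r, "example_var": to_missing(r["example_var"])} for r in rows]

# Count how often the variable was simply not part of the gross file.
n_not_in_file = sum(r["example_var"] == NOT_IN_GROSS_FILE for r in rows)

print(cleaned)
print(n_not_in_file)
```

Keeping the original rows untouched makes it possible to distinguish "not part of the gross file" from "does not apply" later, which a blanket recode would lose.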

3. New in SOEPhelp

SOEPhelp now includes links between topics and variables from the metadata. The data overview (command: soephelp (without variable)) lists all the topics in the dataset and tells which variables belong to which topic.
The variable overview (command: soephelp [variable]) lists the topics covered by the variable (and the relationships among topics and subtopics). The topic labels are linked to Paneldata.org.
SOEPhelp now has a search tool! If you type in the command: soephelp, search (SEARCH TERM) [verbose], you will get a list of the variables for which your SEARCH TERM is contained either in the question or one of the answer options. The variables are provided in list form and saved in r (for returns). The option “verbose” describes the variables in more detail.
More information on SOEPhelp

4. New Datasets and Variables

4.1. Early Childhood

New dataset BCBFK “Early Childhood” with geographically detailed information about the places where the respondents grew up. Because of the detailed regional data, the dataset is only available through the RDC SOEP. The corresponding field report and questionnaire are available as SOEP Survey Paper 766 (in German).

4.2. Your Life in the GDR

New dataset DDR18 “Your Life in the GDR”; the corresponding questionnaire is available as Survey Paper 676.

4.3. Biography follow-up survey

The variables from the biography follow-up survey on migration status have been integrated into the dataset BILELA or BIOL.

4.4. New variable SOCURBAN in dataset HBRUTTO

SOCURBAN: Household address is in an area where the “Soziale Stadt” (social city) urban development project is being carried out (as of July 2017) (Yes/No)

4.5. New variables in dataset EQUIV

ILIB1$$: pensions for liberal professions
ILIB2$$: widow / orphans pensions for liberal professions

4.6. New variables in dataset BIOJOB

In 2018, respondents received new survey instruments concerning job classifications and prestige scores. This information is provided in the following new variables: STBA10, ISCO08, EGP08, ISEI08, MPS08, and SIOPS08. The corresponding variables STBA, EGP, ISEI, MPS, and SIOPS from older versions of BIOJOB have been renamed to STBA92, EGP88, ISEI88, MPS92, and SIOPS88.
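Analysis scripts written against older versions of BIOJOB will need the new names. As a minimal sketch (the helper `update_names` and the example variable list are hypothetical), the renaming listed above can be written as a lookup table in Python:

```python
# Old BIOJOB names mapped to their v35 names, as listed in 4.6.
RENAMED = {
    "STBA": "STBA92",
    "EGP": "EGP88",
    "ISEI": "ISEI88",
    "MPS": "MPS92",
    "SIOPS": "SIOPS88",
}


def update_names(variables):
    """Translate pre-v35 variable names; names without a mapping pass through."""
    return [RENAMED.get(name, name) for name in variables]


# Hypothetical variable list taken from an older analysis script.
old_script_vars = ["PID", "ISEI", "SIOPS", "EGP"]
new_script_vars = update_names(old_script_vars)
print(new_script_vars)
```

The table covers only the renames stated above; the variables based on the new classifications (for example ISEI08) are new and have no older counterpart to map from.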

5. Changes to datasets or individual variables

5.1. Weighting variable PHRF in the dataset PPATHL

There are slight changes concerning the poststratification of the weighting variables starting in 2013. The changes relate to the year of immigration. Previously, respondents who immigrated before 1955 were treated as migrants; they now constitute a distinct category of their own, along with recent immigrants and German-born respondents. The reason is that it is not possible to define ethnic Germans consistently between the Mikrozensus and the SOEP.

5.2. Variables representing occupational codes

Since 2013, open-ended questions on occupations have been coded in ISCO-08 and KldB 2010. This is the first year in which the old classifications ISCO-88 and KldB 92 are no longer available. We have therefore introduced new prestige scores based on the new classifications and discontinued the old scores.
Calendar strings have been moved from $PKAL to $P or standardized.

5.3. Educational variables

Up to soep.v34, the basic generated educational variables were generated annually and were cumulated over time. Due to the availability of SOEPlong, we have substantially revised the tools used for generating variables to always consider all available educational variables for each year.
In addition to the fact that all variables are now generated based entirely on SOEPlong files, we have also made two additional modifications:
First, the main educational variables now also take into account inconsistencies over time, in contrast to the educational variables in PGEN prior to soep.v34.
Second, the variable “Amount of Education or Training in Years” ($$BILZEIT) has been slightly modified. To account for occupational training (for non-university degrees), we have slightly adjusted the years of education for “civil servants” and “others”.

5.4. Dataset KIDLONG

Errors in the integration of variables were corrected, variables were split into versioned variables, and harmonized variables were constructed. As a result, the number of variables has increased from 110 (v.34) to 267 (v.35).
Missing variables from the $KIND datasets were incorporated into KIDLONG.
Corrected version of BHKIND was incorporated into KIDLONG.
KIDLONG now adheres to the classic harmonization concept.

5.5. Dataset BHKIND

Flag variable added to identify child questionnaires that were not completed (BHKFLAG).
Missing observations were added: 15,032 (v.34) to 15,504 (v.35).
Errors in the integration of variables were corrected and missing variables were incorporated into BHKIND: 85 variables (v.34); 129 variables (v.35).
All variables were renamed and now follow the SOEP naming conventions.

5.6. Dataset BIKIND

Flag variable added to identify child questionnaires that were not completed (BIKFLAG)
All variables now follow the SOEP naming conventions.

5.7. Variable PARID in the dataset PPATHL

Partnerships of respondents with net codes between 40 and 49 were dissolved and will be coded -2 “does not apply” in the future.

5.8. Variable HGOWNER in the dataset HGEN

In samples M3-M5 in 2017, several missing values in the variable HGOWNER were replaced with the information that a household is living in a shelter or housing for refugees.

5.9. Dataset INTERVIEWER

The year 2016 now contains information from Samples L2-M4.
The variable on the length of the interview (LENGTHINT) was eliminated and replaced by three variables, each of which gives the average length of one questionnaire (LENGTHINT- H / P / J).
The youth surveys, which were previously counted in the number of interviews per person (AMOUNTINTP), now have their own variable (AMOUNTINTJ).

5.10. Dataset BIOAGE17

Previous versions of BIOAGE17 contained the identifiers of the respondent’s mother (BYMNR) and father (BYVNR). These are no longer part of BIOAGE17; the parents’ identifiers are found in BIOPAREN (MNR and VNR) and can easily be merged with BIOAGE17.
Desired occupation variables ISCO88 have been replaced by ISCO08. The same is true for BYKLAS: The old 1992 version has been replaced by the 2010 version.
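Merging the parents' identifiers from BIOPAREN into BIOAGE17, as described in this section, amounts to a left join on the person identifier. The sketch below uses toy records in Python. It assumes that both datasets share a key named `pid`, which is not stated above, and the identifier values are invented.

```python
# Toy stand-ins for BIOAGE17 and BIOPAREN; the real files hold many more
# variables. Assumption: both datasets share the person identifier "pid".
bioage17 = [
    {"pid": 201, "syear": 2018},
    {"pid": 202, "syear": 2018},
]
bioparen = [
    {"pid": 201, "mnr": 11, "vnr": 12},
    {"pid": 203, "mnr": 31, "vnr": 32},
]

# Index BIOPAREN by pid so each youth row finds its parents in one lookup.
parents_by_pid = {row["pid"]: row for row in bioparen}

merged = []
for youth in bioage17:
    parents = parents_by_pid.get(youth["pid"], {})
    merged.append({
        **youth,
        "mnr": parents.get("mnr"),  # None when no BIOPAREN row exists
        "vnr": parents.get("vnr"),
    })

print(merged)
```

A left join keeps every BIOAGE17 respondent, so youths without a matching BIOPAREN row stay in the data with empty parent identifiers rather than being dropped.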

5.11. Dataset BIOAGEL

The internal distinction between BIOAGE 8a and 8b, or between 81 and 82, has been eliminated, meaning that the dataset BIOAGEL now contains one line per child and respondent for questionnaires about 7-8-year-old children. As a result, when each parent completed a questionnaire on a child in a given year, there are two lines for that child (one line per parent). These can be identified by the different PIDE (PID of the respondent).
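To find the children for whom both parents completed a questionnaire in the same year, one can group the rows by child and year and count the distinct PIDE values. A minimal Python sketch on toy rows follows. PIDE is the variable described above, while the child identifier `pid` and the year variable `syear` are assumed names.

```python
from collections import defaultdict

# Toy BIOAGEL-like rows. "pide" is the responding parent's PID as described
# above; "pid" (child) and "syear" (survey year) are assumed key names.
rows = [
    {"pid": 301, "syear": 2018, "pide": 901},
    {"pid": 301, "syear": 2018, "pide": 902},
    {"pid": 302, "syear": 2018, "pide": 903},
]

# Collect the distinct responding parents for every child-year.
respondents = defaultdict(set)
for row in rows:
    respondents[(row["pid"], row["syear"])].add(row["pide"])

# Child-years with more than one responding parent.
both_parents = sorted(
    key for key, pides in respondents.items() if len(pides) > 1
)
print(both_parents)
```

Analyses that need one line per child would then have to pick one of the two rows for these cases, for example by a rule on PIDE chosen by the analyst.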

SOEP-Core v35 (2018) - Known Bugs/Fixes

(as of April 2020)

Dataset: bioage; variable clref
We detected a label error in the dataset bioage in the variable clref that could be misleading when analyzing the data. The labels for values [1] and [2] need to be switched.

stata [de]

label def clref ///
1 "[1] Ja, sowohl spez. Klasse als auch Regelunterricht" ///
2 "[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder", modify

stata [en]

label def clref ///
1 "[1] Yes, both special class and regular classes" ///
2 "[2] Yes, only special class for refugee children", modify

spss [de]

add value labels clref 1 '[1] Ja, sowohl spez. Klasse als auch Regelunterricht' 2 '[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder' .

spss [en]

add value labels clref 1 '[1] Yes, both special class and regular classes' 2 '[2] Yes, only special class for refugee children' .

Survey Instruments

Field-de Var-de Var-en
Individual (PAPI) 2018: Field-de Field-de Var-de Var-en
Household (PAPI) 2018: Field-de Field-de Var-de Var-en
Biography (PAPI) 2018: Field-de,en Var-de Var-en
Catch-up Individual (PAPI) 2018: Field-de Var-de Var-en
Youth (16-17-year-olds, PAPI) 2018: Field-de Var-de Var-en
Early Youth (13-14-year-olds, PAPI) 2018: Field-de Var-de Var-en
Pre-teen (11-12-year-olds, PAPI) 2018: Field-de Var-en Var-en
Mother and Child (Newborns, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (2-3-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (5-6-year-olds, PAPI) 2018: Field-de Var-de Var-en
Parents and Child (7-8-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (9-10-year-olds, PAPI) 2018: Field-de Var-de Var-en
Deceased Individual (PAPI) 2018: Field-de Var-de Var-en
Grip Strength 2018: Field-de

Please find all sample-specific questionnaires of this year and all questionnaires of previous years on this site.

Documentation of the datasets

1) SOEP-Core – 2018: Report of Survey Methodology and Fieldwork

2) SOEP-Core v35 – Documentation of Sample Sizes and Panel Attrition in the German Socio-Economic Panel (SOEP) (1984 until 2018)

3) SOEP-Core – 2018: Sampling, Nonresponse, and Weighting in the Sample O

4) SOEP-Core v35 – PPATHL: Person-Related Meta-Dataset

5) SOEP-Core v35 – Biographical Information in the Meta File PPATH (Month of Birth, Immigration Variables, Living in East or West Germany in 1989)

6) SOEP-Core v35 – HPATHL: Household-Related Meta-Dataset

7) SOEP-Core v35 – PBRUTTO: Person-Related Gross File

8) SOEP-Core v35 – HBRUTTO: Household-Related Gross File

9) SOEP-Core v35 – PGEN: Person-Related Status and Generated Variables

10) SOEP-Core v35 – HGEN: Household-Related Status and Generated Variables

11) SOEP-Core v35 – Codebook for the $PEQUIV File 1984-2018: CNEF Variables with Extended Income Information for the SOEP

12) SOEP-Core v35 – BIOIMMIG

13) SOEP-Core v35 – HEALTH

14) SOEP-Core v35 – BIOPAREN: Biography Information for the Parents of SOEP-Respondents

15) SOEP-Core v35 – BIOAGEL & BIOPUPIL: Generated Variables from the "Mother & Child", "Parent", "Pre-Teen", and "Early Youth" Questionnaires

16) SOEP-Core v35 – BIOSIB: Information on Siblings in the SOEP

17) SOEP-Core v35 – The Couple History Files BIOCOUPLM and BIOCOUPLY, and Marital History Files BIOMARSM and BIOMARSY

18) SOEP-Core v35 – BIOAGE17: The Youth Questionnaire

19) SOEP-Core v35 – BIOSOC: Retrospective Data on Youth and Socialization

20) SOEP-Core v35 – BIOJOB: Detailed Information on First and Last Job

21) SOEP-Core v35 – BIOEDU: Data on Educational Participation and Transitions

22) SOEP-Core v35 – BIORESID: Variables on Occupancy and Second Residence

23) SOEP-Core v35 – BIOBIRTH: A Data Set on the Birth Biography of Male and Female Respondents

24) SOEP-Core v35 – BIOTWIN: TWINS in the SOEP

25) SOEP-Core v35 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

26) SOEP-Core – 2018: Documentation of the Interviewer Dataset (1984 until 2018)

27) SOEP-Core v35 – INTERVIEWER

28) SOEP-Core v35 – LIFESPELL: Information on the Pre- and Post-Survey History of SOEP-Respondents

29) SOEP-Core v35 – MIGSPELL and REFUGSPELL: The Migration-Biographies

30) SOEP-Core v35 – Activity Biography in the Files PBIOSPE and ARTKALEN

31) SOEP-Core v35: Codebook for the EU-SILC-Like Panel for Germany Based on the SOEP

Documentation

1) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

2) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

All documentation for filtering can be found on this page

Report on survey methods

1) SOEP-Core – 2018: Report of Survey Methodology and Fieldwork
