
SOEP-Core v35 (data 1984-2018)

The Socio-Economic Panel (SOEP) is a representative, multi-cohort survey that has been running since 1984. Every year, individuals in households throughout Germany are surveyed by our survey institute on behalf of DIW Berlin. These respondents provide information on topics such as their income, employment history, education, and health. Because the same people are surveyed every year, it is possible to track long-term psychological, economic, societal, and social developments. To keep pace with changes in society, random samples are added regularly and the survey is adapted accordingly. The international version of the data contains 95% of the total sample (see “Other editions”, soep-core.v35i).

Dataset Information

Title: Socio-Economic Panel (SOEP), data from 1984-2018

DOI: 10.5684/soep-core.v35
Collection period: 1984-2018
Publication date: 01.11.2019
Principal investigators: Stefan Liebig, Jan Goebel, Martin Kroh, Carsten Schröder, Markus Grabka, Jürgen Schupp, Charlotte Bartels, Alexandra Fedorets, Andreas Franken, Jannes Jacobsen, Selin Kara, Peter Krause, Hannes Kröger, Maria Metzing, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig, Stefan Zimmermann

Contributor: Kantar Deutschland GmbH (Data Collector)

Population: Persons living in private households in Germany

Number of households: 18,682

Number of individuals: 31,997 + 3,971 children

Special samples: Citizens of the GDR (1990), Immigration/Migration (1994/95, 2013, 2015), Refugees (since 2016). See the chapter SOEP-Samples in Detail on the SOEPcompanion for a description of all our samples.

Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk.

Collection Mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. Additionally, one person (head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

Citation of the Data Set: Socio-Economic Panel (SOEP), data for years 1984-2018, version 35, SOEP, 2019, doi:10.5684/soep-core.v35.

Publications using this file should refer to the above DOI and cite the following references:

  • Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik (Journal of Economics and Statistics) 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)

If you do not exclude the cases of the migration samples in your analysis, then please also cite the following reference:

  • Herbert Brücker, Martin Kroh, Simone Bartsch, Jan Goebel, Simon Kühne, Elisabeth Liebau, Parvati Trübswetter, Ingrid Tucci & Jürgen Schupp (2014): The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents. SOEP Survey Paper 216, Series C. Berlin, Nürnberg: DIW Berlin.

If you do not exclude the cases of the refugee samples in your analysis, please also cite: IAB-BAMF-SOEP survey of refugees (M3-M5), data for the years 2016-2021,

  • Herbert Brücker, Nina Rother, Jürgen Schupp. 2017. IAB-BAMF-SOEP Befragung von Geflüchteten 2016. Studiendesign, Feldergebnisse sowie Analysen zu schulischer wie beruflicher Qualifikation, Sprachkenntnissen sowie kognitiven Potenzialen. IAB Forschungsbericht 13/2017.

If you use data from the SOEP-LEE2 surveys, please also cite:

  • Matiaske, W., Schmidt, T. D., Halbmeier, C., Maas, M., Holtmann, D., Schröder, C., Böhm, T., Liebig, S., and Kritikos, A. S. (2023). SOEP-LEE2: Linking Surveys on Employees to Employers in Germany. Journal of Economics and Statistics Data Observer, 1–14. https://doi.org/10.1515/jbnst-2023-0031.

If you would like to refer to the SOEP more specifically for your field of research, please also cite:

  • Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
  • Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)
  • Jacobsen, Jannes, Magdalena Krieger, Felicitas Schikora, and Jürgen Schupp. 2021. Growing Potentials for Migration Research using the German Socio-Economic Panel Study. Jahrbücher für Nationalökonomie und Statistik 241 (4), 527-549. (https://doi.org/10.1515/jbnst-2021-0001)
  • Fedorets, Alexandra, Stefan Kirchner, Jule Adriaans, and Oliver Giering. 2022. Data on Digital Transformation in the German Socio-Economic Panel. Jahrbücher für Nationalökonomie und Statistik 242 (5-6), 691-705. (https://doi.org/10.1515/jbnst-2021-0056)

For the SOEP-Core data 1984-2018 (v35) - Wave A to BI - we provide the following versions:

soep-core.v35

soep-core.v35i (International Scientific Use Version, 95%)

soep-core.v35t (Teaching version)

These datasets are included in SOEP v35, but are also available as individual datasets upon request:

soep.ddr18 (Living in the GDR)

soep.iab-soep-mig.2018 (Migration samples)

soep.iab-bamf-soep-mig.2018 (Refugee samples)

SOEP-Core soep-core.v35

1. New sample in the main SOEP study

The new refresher sample, Subsample O, contains 1,000 new households. These were selected in cooperation with BBSR using a new sampling design based on regional data in areas where the “Soziale Stadt” (social city) urban development project is being carried out. Based on the digital data available on the boundaries of the “Soziale Stadt” areas, it was possible to create a new variable going back to the year 2000 that shows whether or not a household’s address is within an area covered by the project (see Variable Description below under 4.4).

2. Modifications in our new main data format, SOEPlong

We have made the following important changes over and above our normal annual updates:

  • PKAL: Integration of the $PKALOST datasets
  • PL/PKAL: All calendar strings are now stored in PL, and monthly variables in PKAL
  • PLUECKEL: Introduction of RYEAR and correction of SYEAR, which was RYEAR up to now
  • PBRUTTO: If a variable was not part of the year-specific gross file, the missing code has been changed to -8 and is no longer -2.
  • VPL: The case numbers for past years have increased since cases without a SOEP respondent are no longer deleted
  • KIDLONG: The harmonization concept has been adapted to the concept used with other datasets; more variables from $KIND datasets have been included (more information under 4)
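The PBRUTTO change above matters for any code that filters on missing codes. The following is a minimal Python sketch, assuming that negative values mark missing data and that records have already been read into plain Python values. The label texts and function names are illustrative and not part of the SOEP release; only the codes -8 and -2 are taken from these notes.

```python
# Missing codes named in these release notes. SOEP data contain further
# negative codes, which are deliberately left unlabeled in this sketch.
MISSING_LABELS = {
    -2: "does not apply",
    -8: "variable not part of that year's gross file",
}


def describe_value(value):
    """Return (is_missing, note) for one coded value.

    Assumption: every negative code denotes a missing value.
    """
    if value >= 0:
        return False, "valid"
    return True, MISSING_LABELS.get(value, "other missing code")


def valid_values(values):
    """Keep only substantive (non-negative) codes."""
    return [v for v in values if v >= 0]
```

Scripts written for earlier releases that test for -2 to detect variables absent from a given year's gross file would need to test for -8 instead.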

3. New in SOEPhelp

  • SOEPhelp now includes links between topics and variables from the metadata. The data overview (command: soephelp (without variable)) lists all the topics in the dataset and tells which variables belong to which topic.
  • The variable overview (command: soephelp [variable]) lists the topics covered by the variable (and the relationships among topics and subtopics). The topic labels are linked to Paneldata.org.
  • SOEPhelp now has a search tool! If you type in the command: soephelp, search (SEARCH TERM) [verbose], you will get a list of the variables for which your SEARCH TERM is contained either in the question or one of the answer options. The variables are provided in list form and saved in r (for returns). The option “verbose” describes the variables in more detail.
  • More information on SOEPhelp

4. New Datasets and Variables

4.1. Early Childhood

  • New dataset BCBFK “Early Childhood” with geographically detailed information about the places where the respondents grew up. Because of the detailed regional data, the dataset is only available through the RDC SOEP. The corresponding field report and questionnaire are available as SOEP Survey Paper 766 (in German).

4.2. Your Life in the GDR

  • New dataset DDR18 “Your Life in the GDR”. The corresponding questionnaire is available as Survey Paper 676.

4.3. Biography follow-up survey

  • The variables from the biography follow-up survey on migration status have been integrated into the dataset BILELA or BIOL.

4.4. New variable SOCURBAN in dataset HBRUTTO

  • SOCURBAN: Household address is in an area where the “Soziale Stadt” (social city) urban development project is being carried out (as of July 2017) (Yes/No)

4.5. New variables in dataset EQUIV

  • ILIB1$$: pensions for liberal professions
  • ILIB2$$: widow / orphans pensions for liberal professions

4.6. New variables in dataset BIOJOB

  • In 2018, respondents received new survey instruments concerning job classifications and prestige scores. This information is provided in the following new variables: STBA10, ISCO08, EGP08, ISEI08, MPS08, and SIOPS08. The corresponding variables STBA, EGP, ISEI, MPS, and SIOPS from older versions of BIOJOB have been renamed to STBA92, EGP88, ISEI88, MPS92, and SIOPS88.
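Analysis scripts written against older versions of BIOJOB will still refer to the old variable names. As a hedged illustration, the renaming listed above can be expressed as a lookup table in Python. The function name and the example variable list are hypothetical; only the old and new names come from the note above.

```python
# Old BIOJOB names (left) and their v35 names (right), as listed above.
BIOJOB_RENAMES = {
    "STBA": "STBA92",
    "EGP": "EGP88",
    "ISEI": "ISEI88",
    "MPS": "MPS92",
    "SIOPS": "SIOPS88",
}


def update_variable_names(names):
    """Map pre-v35 BIOJOB names to v35 names; other names pass through."""
    return [BIOJOB_RENAMES.get(name, name) for name in names]


# Example: a variable list taken from an older analysis script.
old_list = ["PID", "ISEI", "SIOPS", "ISCO08"]
new_list = update_variable_names(old_list)
```

Names that are not in the table, including the new 2008/2010-based variables, are returned unchanged.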

5. Changes to datasets or individual variables

5.1. Weighting variable PHRF in the dataset PPATHL

  • There are slight changes concerning the poststratification of the weighting variables starting in 2013. The changes relate to the year of immigration. Previously, respondents who immigrated before 1955 were treated as migrants; they now constitute a distinct category of their own, along with recent immigrants and German-born respondents. The reason is that it is not possible to define ethnic Germans consistently between the Mikrozensus and the SOEP.

5.2. Variables representing occupational codes

  • Since 2013, open-ended questions on occupations have been coded in ISCO-08 and KldB 2010. This is the first year in which the old classifications ISCO-88 and KldB 92 are no longer available. We have therefore introduced new prestige scores based on the new classifications and discontinued the old scores.
  • Calendar strings have been moved from $PKAL to $P or standardized.

5.3. Educational variables

  • Up to soep.v34, the basic generated educational variables were generated annually and were cumulated over time. Due to the availability of SOEPlong, we have substantially revised the tools used for generating variables to always consider all available educational variables for each year.
  • In addition to the fact that all variables are now generated based entirely on SOEPlong files, we have also made two additional modifications:
  • First, the main educational variables now also take into account inconsistencies over time, in contrast to the educational variables in PGEN prior to soep.v34.
  • Second, variable “Amount of Education or Training in Years” ($$BILZEIT) has been slightly modified. To consider occupational training (for non-university degrees), we have adjusted the years of education for “civil servants” and “others” slightly.

5.4. Dataset KIDLONG

  • Errors in the integration of variables were corrected, variables were split up into versioned variables, and harmonized variables were constructed. As a result, the number of variables has increased from 110 (v.34) to 267 (v.35).
  • Missing variables from the $KIND datasets were incorporated into KIDLONG.
  • Corrected version of BHKIND was incorporated into KIDLONG.
  • KIDLONG now adheres to the classic harmonization concept.

5.5. Dataset BHKIND

  • Flag variable added to identify child questionnaires that were not completed (BHKFLAG)
  • Missing observations were added: 15,032 (v.34) to 15,504 (v.35).
  • Errors in the integration of variables were corrected and missing variables were incorporated into BHKIND: 85 variables (v.34) to 129 variables (v.35).
  • All variables were renamed and now follow the SOEP naming conventions.

5.6. Dataset BIKIND

  • Flag variable added to identify child questionnaires that were not completed (BIKFLAG)
  • All variables now follow the SOEP naming conventions.

5.7. Variable PARID in the dataset PPATHL

  • Partnerships of respondents with net codes between 40 and 49 were dissolved and will be coded -2 “does not apply” in the future.
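To make the PARID rule concrete, here is a small Python sketch of the recode. It assumes that "between 40 and 49" is inclusive and that each person-year record is a dictionary. The key names `netcode` and `parid` are placeholders chosen for this example, not confirmed variable names. The sketch only changes the respondent's own record and does not touch the partner's record.

```python
DOES_NOT_APPLY = -2  # code named in the note above


def recode_parid(records, netcode_key="netcode", parid_key="parid"):
    """Return copies of the records with the partner ID set to -2
    wherever the net code lies in the range 40..49 (assumed inclusive)."""
    recoded = []
    for rec in records:
        rec = dict(rec)  # copy, so the input records stay unchanged
        if 40 <= rec[netcode_key] <= 49:
            rec[parid_key] = DOES_NOT_APPLY
        recoded.append(rec)
    return recoded


# Toy records with made-up identifiers.
rows = [
    {"pid": 101, "netcode": 10, "parid": 102},
    {"pid": 201, "netcode": 45, "parid": 202},
]
result = recode_parid(rows)
```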

5.8. Variable HGOWNER in the dataset HGEN

  • In samples M3-M5 in 2017, several missing values in the variable HGOWNER were replaced with the information that a household is living in a shelter or housing for refugees.

5.9. Dataset INTERVIEWER

  • The year 2016 now contains information from Samples L2-M4.
  • The variable on the length of the interview (LENGTHINT) was eliminated and replaced by three variables, which each just give the average length of one questionnaire (LENGTHINT- H / P / J).
  • The youth surveys, which were previously counted in the number of interviews per person (AMOUNTINTP), now have their own variable (AMOUNTINTJ).

5.10. Dataset BIOAGE17

  • Previous versions of BIOAGE17 contained the identifiers of the respondent’s mother (BYMNR) and father (BYVNR). The identifiers of the parents are now found in BIOPAREN (MNR and VNR) and can be easily merged with BIOAGE17.
  • Desired occupation variables ISCO88 have been replaced by ISCO08. The same is true for BYKLAS: The old 1992 version has been replaced by the 2010 version.
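The parent identifiers held in BIOPAREN (MNR and VNR) can be attached to BIOAGE17 records with an ordinary left join on the person identifier. The Python sketch below assumes both datasets have been read into lists of dictionaries and share a key called `pid`. That key name, the lower-case spelling of `mnr` and `vnr`, and the toy values are assumptions made for illustration.

```python
def attach_parent_ids(bioage17_rows, bioparen_rows, key="pid"):
    """Left-join mother (mnr) and father (vnr) identifiers from
    BIOPAREN-style rows onto BIOAGE17-style rows."""
    parents = {row[key]: row for row in bioparen_rows}
    merged = []
    for row in bioage17_rows:
        match = parents.get(row[key], {})
        # Unmatched respondents keep their row and get None for both IDs.
        merged.append({**row, "mnr": match.get("mnr"), "vnr": match.get("vnr")})
    return merged


# Toy data with made-up identifiers.
bioage17 = [{"pid": 301}, {"pid": 302}]
bioparen = [{"pid": 301, "mnr": 11, "vnr": 12}]
merged = attach_parent_ids(bioage17, bioparen)
```

A left join keeps every BIOAGE17 respondent even when no BIOPAREN record matches, which makes missing parent information easy to spot.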

5.11. Dataset BIOAGEL

  • The internal distinction between BIOAGE 8a and 8b, or between 81 and 82, has been eliminated, meaning that the dataset BIOAGEL now contains one line per child and respondent for questionnaires about 7-8-year-old children. As a result, when each parent completed a questionnaire on a child in a given year, there are two lines for that child (one line per parent). These can be identified by the different PIDE (PID of the respondent).
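Because a child can now appear twice in the same year, it can help to check which child-years carry reports from more than one respondent before aggregating. The following Python sketch groups rows by child and survey year and collects the respondents' PIDE values. The key names `pid` (child), `syear` (survey year), and `pide`, as well as the toy values, are assumptions for this example.

```python
from collections import defaultdict


def respondents_per_child_year(rows):
    """Map (child pid, survey year) to the list of responding parents' PIDE."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["pid"], row["syear"])].append(row["pide"])
    return dict(groups)


def child_years_with_two_reports(rows):
    """Return the (pid, syear) pairs reported on by more than one respondent."""
    groups = respondents_per_child_year(rows)
    return sorted(k for k, pides in groups.items() if len(set(pides)) > 1)


# Toy rows with made-up identifiers: child 501 has two reports in 2018.
rows = [
    {"pid": 501, "syear": 2018, "pide": 41},
    {"pid": 501, "syear": 2018, "pide": 42},
    {"pid": 502, "syear": 2018, "pide": 43},
]
doubled = child_years_with_two_reports(rows)
```

How to handle such pairs (keep both lines, pick one parent, or average) depends on the analysis; the sketch only identifies them.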

(as of April 2020)

Dataset: bioage; variable clref
We detected a label error in the dataset bioage in the variable clref that could be misleading when analyzing the data. The labels for values [1] and [2] need to be switched.

stata [de]

label def clref ///
1 "[1] Ja, sowohl spez. Klasse als auch Regelunterricht" ///
2 "[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder", modify

stata [en]

label def clref ///
1 "[1] Yes, both special class and regular classes" ///
2 "[2] Yes, only special class for refugee children", modify

spss [de]

add value labels clref 1 '[1] Ja, sowohl spez. Klasse als auch Regelunterricht' 2 '[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder' .

spss [en]

add value labels clref 1 '[1] Yes, both special class and regular classes' 2 '[2] Yes, only special class for refugee children' .


Individual (PAPI) 2018: Field-de Field-en Var-de Var-en
Household (PAPI) 2018: Field-de Field-en Var-de Var-en
Biography (PAPI) 2018: Field-de,en Var-de Var-en
Catch-up Individual (PAPI) 2018: Field-de Var-de Var-en
Youth (16-17-year-olds, PAPI) 2018: Field-de Var-de Var-en
Early Youth (13-14-year-olds, PAPI) 2018: Field-de Var-de Var-en
Pre-teen (11-12-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (Newborns, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (2-3-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (5-6-year-olds, PAPI) 2018: Field-de Var-de Var-en
Parents and Child (7-8-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (9-10-year-olds, PAPI) 2018: Field-de Var-de Var-en
Deceased Individual (PAPI) 2018: Field-de Var-de Var-en
Grip Strength 2018: Field-de

Please find all sample specific questionnaires of this year and all questionnaires of previous years on this site

1) SOEP-Core v35 – Documentation of Sample Sizes and Panel Attrition in the German Socio-Economic Panel (SOEP)(1984 until 2018)

2) SOEP-Core – 2018: Sampling, Nonresponse, and Weighting in the Sample O

3) SOEP-Core v35 – PPATHL: Person-Related Meta-Dataset

4) SOEP-Core v35 – Biographical Information in the Meta File PPATH (Month of Birth, Immigration Variables, Living in East or West Germany in 1989)

5) SOEP-Core v35 – HPATHL: Household-Related Meta-Dataset

6) SOEP-Core v35 – PBRUTTO: Person-Related Gross File

7) SOEP-Core v35 – HBRUTTO: Household-Related Gross File

8) SOEP-Core v35 – PGEN: Person-Related Status and Generated Variables

9) SOEP-Core v35 – HGEN: Household-Related Status and Generated Variables

10) SOEP-Core v35 – Codebook for the $PEQUIV File 1984-2018: CNEF Variables with Extended Income Information for the SOEP

11) SOEP-Core v35 – BIOIMMIG

12) SOEP-Core v35 – HEALTH

13) SOEP-Core v35 – BIOPAREN: Biography Information for the Parents of SOEP-Respondents

14) SOEP-Core v35 – BIOAGEL & BIOPUPIL: Generated Variables from the "Mother & Child", "Parent", "Pre-Teen", and "Early Youth" Questionnaires

15) SOEP-Core v35 – BIOSIB: Information on Siblings in the SOEP

16) SOEP-Core v35 – The Couple History Files BIOCOUPLM and BIOCOUPLY, and Marital History Files BIOMARSM and BIOMARSY

17) SOEP-Core v35 – BIOAGE17: The Youth Questionnaire

18) SOEP-Core v35 – BIOSOC: Retrospective Data on Youth and Socialization

19) SOEP-Core v35 – BIOJOB: Detailed Information on First and Last Job

20) SOEP-Core v35 – BIOEDU: Data on Educational Participation and Transitions

21) SOEP-Core v35 – BIORESID: Variables on Occupancy and Second Residence

22) SOEP-Core v35 – BIOBIRTH: A Data Set on the Birth Biography of Male and Female Respondents

23) SOEP-Core v35 – BIOTWIN: TWINS in the SOEP

24) SOEP-Core v35 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

25) SOEP-Core – 2018: Documentation of the Interviewer Dataset (1984 until 2018)

26) SOEP-Core v35 – INTERVIEWER

27) SOEP-Core v35 – LIFESPELL: Information on the Pre- and Post-Survey History of SOEP-Respondents

28) SOEP-Core v35 – MIGSPELL and REFUGSPELL: The Migration-Biographies

29) SOEP-Core v35 – Activity Biography in the Files PBIOSPE and ARTKALEN

30) SOEP-Core v35: Codebook for the EU-SILC-Like Panel for Germany Based on the SOEP

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

3) The Request for Record Linkage in the IAB-SOEP Migration Sample

4) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

8) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

13) SOEP Scales Manual (updated for SOEP-Core v32.1)

14) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Multi-Itemskalen im SOEP Jugendfragebogen

18) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

19) Dokumentation zum Entwicklungsprozess des Moduls „Einstellungen zu sozialer Ungleichheit“ im SOEP (v38)

20) SOEP-CoV: Project and Data Documentation

21) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

22) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

23) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

24) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

25) SOEP-Core v36: Codebook for the EU-SILC-like panel for Germany based on the SOEP

All documentation for filtering can be found on this page
