SOEP-Core Version 35

The Socio-Economic Panel (SOEP) is a representative, multi-cohort survey that has been running since 1984. Every year, individuals in households throughout Germany are surveyed by our survey institute on behalf of DIW Berlin. These respondents provide information on topics such as their income, employment history, education, and health. Because the same people are surveyed every year, it is possible to track long-term psychological, economic, societal, and social developments. To keep pace with changes in society, random samples are added regularly and the survey is adapted accordingly. The international version of the data contains 95% of the total sample (see “Other versions of this release”, 10.5684/soep.v35i).

Dataset Information

Title: Socio-Economic Panel (SOEP), data from 1984-2018

DOI: 10.5684/soep-core.v35
Collection period: 1984-2018
Publication date: 01.11.2019
Principal investigators: Stefan Liebig, Jan Goebel, Martin Kroh, Carsten Schröder, Markus Grabka, Jürgen Schupp, Charlotte Bartels, Alexandra Fedorets, Andreas Franken, Jannes Jacobsen, Selin Kara, Peter Krause, Hannes Kröger, Maria Metzing, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig, Stefan Zimmermann

Contributor: Kantar Deutschland GmbH (Data Collector)

Population: Persons living in private households in Germany

Number of households: 18,682

Number of individuals: 31,997 + 3,971 children

Special samples: Citizens of the GDR (1990), Immigration/Migration (1994/95, 2013, 2015), Refugees (since 2016). See the chapter SOEP-Samples in Detail on the SOEPcompanion for a description of all our samples.

Collection Mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household questionnaire covering housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

Citation of the Data Set: Socio-Economic Panel (SOEP), data for years 1984-2018, version 35, SOEP, 2019, doi:10.5684/soep.v35.

For the SOEP-Core data 1984-2018 (v35) - Wave A to BI - we provide the following versions:

soep-core.v35

soep-core.v35i (International Scientific Use Version, 95%)

soep-core.v35t (Teaching version)

These datasets are included in SOEP v35 but are also available as individual datasets upon request:

soep.ddr18 (Living in the GDR)

soep.iab-soep-mig.2018 (Migration samples)

soep.iab-bamf-soep-mig.2018 (Refugee samples)

SOEP-Core soep-core.v35

1. New sample in the main SOEP study

The new refresher sample, Subsample O, contains 1,000 new households. These were selected in cooperation with BBSR using a new sampling design based on regional data in areas where the “Soziale Stadt” (social city) urban development project is being carried out. Based on the digital data available on the boundaries of the “Soziale Stadt” areas, it was possible to create a new variable going back to the year 2000 that shows whether or not a household’s address is within an area covered by the project (see Variable Description below under 4.4).

2. Modifications in our new main data format, SOEPlong

We have made the following important changes over and above our normal annual updates:

  • PKAL: Integration of the $PKALOST datasets
  • PL/PKAL: Calendar strings are now all stored in PL, and monthly variables in PKAL
  • PLUECKEL: Introduction of RYEAR and correction of SYEAR, which was RYEAR up to now
  • PBRUTTO: If a variable was not part of the year-specific gross file, the missing code has been changed to -8 and is no longer -2.
  • VPL: The case numbers for past years have increased since cases without a SOEP respondent are no longer deleted
  • KIDLONG: The harmonization concept has been adapted to the concept used with other datasets; more variables from $KIND datasets have been included (more information under 4)
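The PBRUTTO change listed above means that -8 and -2 now carry different meanings, so analysis code should not lump them together. The following is a minimal Python sketch, not an official SOEP tool. The record layout and the variable name example_var are hypothetical, only the two codes mentioned in these release notes are labeled, and treating every other negative value as missing is an assumption to check against the documentation of the dataset you use.

```python
# Hypothetical sketch: keep the two missing codes named in these notes apart.
# Only -2 and -8 are taken from the release notes; other negative codes are
# flagged as missing but deliberately left without a specific label.
MISSING_LABELS = {
    -2: "does not apply",
    -8: "variable not part of that year's gross file",
}


def classify(value):
    """Return (is_missing, label) for a single coded value."""
    if value in MISSING_LABELS:
        return True, MISSING_LABELS[value]
    if isinstance(value, (int, float)) and value < 0:
        # Assumption: any other negative value is also a missing code.
        return True, "other missing code"
    return False, None


# Toy rows with invented values; 'example_var' is not a real SOEP variable.
rows = [
    {"pid": 101, "syear": 2017, "example_var": 3},
    {"pid": 101, "syear": 1995, "example_var": -8},
    {"pid": 102, "syear": 2017, "example_var": -2},
]

summary = [(r["syear"], classify(r["example_var"])) for r in rows]
```

Keeping the label next to the flag makes it possible to report how many values are missing because the variable was not collected in a given year, as opposed to not applicable to the respondent.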

3. New in SOEPhelp

  • SOEPhelp now includes links between topics and variables from the metadata. The data overview (command: soephelp (without variable)) lists all the topics in the dataset and tells which variables belong to which topic.
  • The variable overview (command: soephelp [variable]) lists the topics covered by the variable (and the relationships among topics and subtopics). The topic labels are linked to Paneldata.org.
  • SOEPhelp now has a search tool! If you type in the command: soephelp, search (SEARCH TERM) [verbose], you will get a list of the variables for which your SEARCH TERM is contained either in the question or one of the answer options. The variables are provided in list form and saved in r (for returns). The option “verbose” describes the variables in more detail.
  • More information on SOEPhelp

4. New Datasets and Variables

4.1. Early Childhood

  • New dataset BCBFK “Early Childhood” with geographically detailed information about the places where the respondents grew up. Because of the detailed regional data, the dataset is available only through the RDC SOEP. The corresponding field report and questionnaire are available as SOEP Survey Paper 766 (PDF, 1.28 MB) (in German).

4.2. Your Life in the GDR

  • New dataset DDR18 “Your Life in the GDR”. The corresponding questionnaire is available as Survey Paper 676.

4.3. Biography follow-up survey

  • The variables from the biography follow-up survey on migration status have been integrated into the dataset BILELA or BIOL.

4.4. New variable SOCURBAN in dataset HBRUTTO

  • SOCURBAN: Household address is in an area where the “Soziale Stadt” (social city) urban development project is being carried out (as of July 2017) (Yes/No)

4.5. New variables in dataset EQUIV

  • ILIB1$$: pensions for liberal professions
  • ILIB2$$: widow / orphans pensions for liberal professions

4.6. New variables in dataset BIOJOB

  • In 2018, respondents received new survey instruments concerning job classifications and prestige scores. This information is provided in the following new variables: STBA10, ISCO08, EGP08, ISEI08, MPS08, and SIOPS08. The corresponding variables STBA, EGP, ISEI, MPS, and SIOPS from older versions of BIOJOB have been renamed to STBA92, EGP88, ISEI88, MPS92, and SIOPS88.
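Scripts written against older versions of BIOJOB will still request the old names. As a minimal sketch (in Python, using only the renaming stated above), a lookup table can translate an old variable list. The helper name update_variable_list and the example request are hypothetical, and the case-insensitive matching is an assumption of this sketch.

```python
# Old BIOJOB names -> names used from v35 on (taken from the note above).
RENAMED = {
    "STBA": "STBA92",
    "EGP": "EGP88",
    "ISEI": "ISEI88",
    "MPS": "MPS92",
    "SIOPS": "SIOPS88",
}


def update_variable_list(names):
    """Translate pre-v35 BIOJOB variable names to their v35 names.

    Matching ignores case; names without an entry in RENAMED are
    returned unchanged.
    """
    return [RENAMED.get(name.upper(), name) for name in names]


# 'PID' stands in for any variable that was not renamed.
old_request = ["PID", "ISEI", "siops"]
new_request = update_variable_list(old_request)
```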

5. Changes to datasets or individual variables

5.1. Weighting variable PHRF in the dataset PPATHL

  • There are slight changes concerning the poststratification of the weighting variables starting in 2013. The changes relate to the year of immigration. Previously, respondents who immigrated before 1955 were treated as migrants; they now constitute a distinct category of their own, along with recent immigrants and German-born respondents. The reason is that it is not possible to define ethnic Germans consistently between the Mikrozensus and the SOEP.

5.2. Variables representing occupational codes

  • Since 2013, open-ended questions on occupations have been coded in ISCO-08 and KldB 2010. This is the first year in which the old classifications ISCO-88 and KldB 92 are no longer available. We have therefore introduced new prestige scores based on the new classifications and discontinued the old scores.
  • Calendar strings have been moved from $PKAL to $P or standardized.

5.3. Educational variables

  • Up to soep.v34, the basic generated educational variables were generated annually and were cumulated over time. Due to the availability of SOEPlong, we have substantially revised the tools used for generating variables to always consider all available educational variables for each year.
  • In addition to the fact that all variables are now generated based entirely on SOEPlong files, we have also made two additional modifications:
  • First, the main educational variables now also take into account inconsistencies over time, in contrast to the educational variables in PGEN prior to soep.v34.
  • Second, variable “Amount of Education or Training in Years” ($$BILZEIT) has been slightly modified. To consider occupational training (for non-university degrees), we have adjusted the years of education for “civil servants” and “others” slightly.

5.4. Dataset KIDLONG

  • Errors in the integration of variables were corrected, variables were split into versioned variables, and harmonized variables were constructed. As a result, the number of variables has increased from 110 (v.34) to 267 (v.35).
  • Missing variables from the $KIND datasets were incorporated into KIDLONG.
  • Corrected version of BHKIND was incorporated into KIDLONG.
  • KIDLONG now adheres to the classic harmonization concept.

5.5. Dataset BHKIND

  • Flag variable added to identify child questionnaires that were not completed (BHKFLAG)
  • Missing observations were added: 15,032 (v.34) to 15,504 (v.35).
  • Errors in the integration of variables were corrected and missing variables were incorporated into BHKIND: 85 variables (v.34); 129 variables (v.35).
  • All variables were renamed and now follow the SOEP naming conventions.

5.6. Dataset BIKIND

  • Flag variable added to identify child questionnaires that were not completed (BIKFLAG)
  • All variables now follow the SOEP naming conventions.

5.7. Variable PARID in the dataset PPATHL

  • Partnerships of respondents with net codes between 40 and 49 were dissolved and will be coded -2 “does not apply” in the future.

5.8. Variable HGOWNER in the dataset HGEN

  • In samples M3-M5 in 2017, several missing values in the variable HGOWNER were replaced with the information that a household is living in a shelter or housing for refugees.

5.9. Dataset INTERVIEWER

  • The year 2016 now contains information from Samples L2-M4.
  • The variable on the length of the interview (LENGTHINT) was eliminated and replaced by three variables, each of which gives the average length of one questionnaire (LENGTHINT- H / P / J).
  • The youth surveys, which were previously counted in the number of interviews per person (AMOUNTINTP), now have their own variable (AMOUNTINTJ).

5.10. Dataset BIOAGE17

  • Previous versions of BIOAGE17 contained the identifier of the respondent’s mother (BYMNR) and father (BYVNR). The identifiers of the parents are found in BIOPAREN (MNR and VNR) and can be easily merged with BIOAGE17.
  • Desired occupation variables ISCO88 have been replaced by ISCO08. The same is true for BYKLAS: The old 1992 version has been replaced by the 2010 version.
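Merging the parents' identifiers from BIOPAREN back into BIOAGE17 could look like the following sketch. It uses plain Python dictionaries with invented toy records rather than the real files. The join key pid is an assumption (check the person identifier in your copy of both datasets), while MNR and VNR are the BIOPAREN variables named above.

```python
# Toy stand-ins for the two datasets; all values are invented.
bioage17 = [
    {"pid": 501, "syear": 2018},
    {"pid": 502, "syear": 2018},
]
bioparen = [
    {"pid": 501, "mnr": 301, "vnr": 302},
    {"pid": 503, "mnr": 305, "vnr": 306},
]


def add_parent_ids(youth_rows, parent_rows, key="pid"):
    """Left-join mnr/vnr from parent_rows onto youth_rows by `key`.

    Youth rows without a match keep None for both identifiers.
    """
    lookup = {row[key]: row for row in parent_rows}
    merged = []
    for row in youth_rows:
        match = lookup.get(row[key], {})
        merged.append({**row, "mnr": match.get("mnr"), "vnr": match.get("vnr")})
    return merged


merged = add_parent_ids(bioage17, bioparen)
```

A left join keeps every BIOAGE17 row, so respondents without an entry in the parent data remain visible instead of being dropped silently.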

5.11. Dataset BIOAGEL

  • The internal distinction between BIOAGE 8a and 8b, or between 81 and 82, has been eliminated, meaning that the dataset BIOAGEL now contains one line per child and respondent for questionnaires about 7-8-year-old children. As a result, when each parent completed a questionnaire on a child in a given year, there are two lines for that child (one line per parent). These can be identified by the different PIDE (PID of the respondent).
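Because a child can now appear twice in the same year, it may help to check for such cases before aggregating. The sketch below (Python, toy data) groups rows by child and survey year and collects the responding parents. PIDE is the respondent identifier named above. Using pid for the child and syear for the survey year is an assumption of this sketch, and all values are invented.

```python
from collections import defaultdict

# Toy BIOAGEL-like rows; 'pid' is assumed to identify the child,
# 'pide' the responding parent.
rows = [
    {"pid": 701, "syear": 2018, "pide": 201},
    {"pid": 701, "syear": 2018, "pide": 202},
    {"pid": 702, "syear": 2018, "pide": 203},
]


def respondents_per_child(records):
    """Group responding parents by (child pid, survey year)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["pid"], rec["syear"])].append(rec["pide"])
    return dict(groups)


groups = respondents_per_child(rows)

# Child-years for which more than one parent completed a questionnaire.
multi_parent = {k: v for k, v in groups.items() if len(v) > 1}
```

Whether to keep both lines or select one respondent per child depends on the analysis. The sketch only makes the duplicates visible.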

(as of April 2020)

Dataset: bioage; variable clref
We detected a label error in the data set bioage in the variable clref that could be misleading when analyzing the data. The labels for values [1] and [2] need to be switched.

stata [de]

label def clref ///
1 "[1] Ja, sowohl spez. Klasse als auch Regelunterricht" ///
2 "[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder", modify

stata [en]

label def clref ///
1 "[1] Yes, both special class and regular classes" ///
2 "[2] Yes, only special class for refugee children", modify

spss [de]

add value labels clref 1 '[1] Ja, sowohl spez. Klasse als auch Regelunterricht' 2 '[2] Ja, ausschliessl. spez. Klasse fuer gefluechtete Kinder' .

spss [en]

add value labels clref 1 '[1] Yes, both special class and regular classes' 2 '[2] Yes, only special class for refugee children' .


Individual (PAPI) 2018: Field-de,en Var-de Var-en
Household (PAPI) 2018: Field-de,en Var-de Var-en
Biography (PAPI) 2018: Field-de,en Var-de Var-en
Catch-up Individual (PAPI) 2018: Field-de Var-de Var-en
Youth (16-17-year-olds, PAPI) 2018: Field-de Var-de Var-en
Early Youth (13-14-year-olds, PAPI) 2018: Field-de Var-de Var-en
Pre-teen (11-12-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (Newborns, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (2-3-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (5-6-year-olds, PAPI) 2018: Field-de Var-de Var-en
Parents and Child (7-8-year-olds, PAPI) 2018: Field-de Var-de Var-en
Mother and Child (9-10-year-olds, PAPI) 2018: Field-de Var-de Var-en
Deceased Individual (PAPI) 2018: Field-de Var-de Var-en
Grip Strength 2018: Field-de

Please find all sample-specific questionnaires for this year and all questionnaires from previous years on this site.

1) SOEP-Core v35 – Documentation of Sample Sizes and Panel Attrition in the German Socio-Economic Panel (SOEP) (1984 until 2018)

2) SOEP-Core – 2018: Sampling, Nonresponse, and Weighting in the Sample O

3) SOEP-Core v35 – PPATHL: Person-Related Meta-Dataset

4) SOEP-Core v35 – HPATHL: Household-Related Meta-Dataset

5) SOEP-Core v35 – PBRUTTO: Person-Related Gross File

6) SOEP-Core v35 – HBRUTTO: Household-Related Gross File

7) SOEP-Core v35 – PGEN: Person-Related Status and Generated Variables

8) SOEP-Core v35 – HGEN: Household-Related Status and Generated Variables

9) SOEP-Core v35 – Codebook for the $PEQUIV File 1984-2018: CNEF Variables with Extended Income Information for the SOEP

10) SOEP-Core v35 – BIOIMMIG

11) SOEP-Core v35 – HEALTH

12) SOEP-Core v35 – BIOAGEL & BIOPUPIL: Generated Variables from the "Mother & Child", "Parent", "Pre-Teen", and "Early Youth" Questionnaires

13) SOEP-Core v35 – The Couple History Files BIOCOUPLM and BIOCOUPLY, and Marital History Files BIOMARSM and BIOMARSY

14) SOEP-Core v35 – BIOAGE17: The Youth Questionnaire

15) SOEP-Core v35 – BIOSOC: Retrospective Data on Youth and Socialization

16) SOEP-Core v35 – BIOJOB: Detailed Information on First and Last Job

17) SOEP-Core v35 – BIOEDU: Data on Educational Participation and Transitions

18) SOEP-Core v35 – BIORESID: Variables on Occupancy and Second Residence

19) SOEP-Core v35 – BIOBIRTH: A Data Set on the Birth Biography of Male and Female Respondents

20) SOEP-Core v35 – BIOTWIN: TWINS in the SOEP

21) SOEP-Core – 2018: Documentation of the Interviewer Dataset (1984 until 2018)

22) SOEP-Core v35 – INTERVIEWER

23) SOEP-Core v35 – LIFESPELL: Information on the Pre- and Post-Survey History of SOEP-Respondents

24) SOEP-Core v35 – MIGSPELL and REFUGSPELL: The Migration-Biographies

25) SOEP-Core v35 – Activity Biography in the Files PBIOSPE and ARTKALEN

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) Documentation on ISCED Generation Using the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2

3) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

4) The Request for Record Linkage in the IAB-SOEP Migration Sample

5) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

6) SOEP 2007 – Editing und multiple Imputation der Vermögensinformation 2002 und 2007 im SOEP

7) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

8) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

9) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

10) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

11) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

12) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

13) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

14) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

15) SOEP Scales Manual (updated for SOEP-Core v32.1)

16) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

17) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

18) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

19) Multi-Itemskalen im SOEP Jugendfragebogen

20) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

21) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

22) SOEP 2013 – Documentation of Generated Person-Level Long-Term Care Variables in PFLEGE

23) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

24) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

25) SOEP-Core v34: Codebook for the EU-SILC-Like Panel for Germany Based on the SOEP

26) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

All documentation, with filter options, can be found on this page.