The Socio-Economic Panel (SOEP) is a representative, multi-cohort survey that has been running since 1984. Every year, individuals in households throughout Germany are surveyed by our survey institute on behalf of DIW Berlin. These respondents provide information on topics such as their income, employment history, education, and health. Because the same people are surveyed every year, it is possible to track long-term psychological, economic, societal, and social developments. To keep pace with changes in society, random samples are added regularly and the survey is adapted accordingly.
Title: Socio-Economic Panel (SOEP), data from 1984-2020, EU Edition
DOI: 10.5684/soep.core.v37eu
Collection period: 1984-2020
Publication date: 2022-04-08
Principal investigators: Stefan Liebig, Jan Goebel, Markus Grabka, Carsten Schröder, Sabine Zinn, Charlotte Bartels, Andreas Franken, Martin Gerike, Sascha-Christopher Geschke, Florian Griese, Selin Kara, Johannes König, Peter Krause, Hannes Kröger, Elisabeth Liebau, Jana Nebelin, Marvin Petrenz, David Richter, Jürgen Schupp, Rainer Siegers, Hans Walter Steinhauer, Knut Wenzig, Stefan Zimmermann
Contributor: Kantar Deutschland GmbH (Data Collector)
Population: Persons living in private households in Germany
Number of households: 19,032
Number of individuals: 32,050 + 3,476 children
Special samples: Citizens of the GDR (1990), Immigration/Migration (1994/95, 2013, 2015), Refugees (since 2016). See the chapter SOEP-Samples in Detail on the SOEPcompanion for a description of all our samples.
Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk or register sample.
Collection Mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 12 years and over. In addition, one person (the head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 12 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).
Citation of the Data Set: Socio-Economic Panel (SOEP), data for years 1984-2020, SOEP-Core v37, EU Edition, 2022, doi:10.5684/soep.core.v37eu
If you do not exclude observations from the Migration Samples in your analysis, please also cite as follows:
IAB-SOEP Migration Samples (M1, M2), data of the years 2013-2020, DOI: 10.5684/soep.iab-soep-mig.2020
If you do not exclude observations from the Refugee Samples in your analysis, please also cite as follows:
IAB-BAMF-SOEP Survey of Refugees (M3-M5), data of the years 2016-2020, DOI: 10.5684/soep.iab-bamf-soep-mig.2020
Publications using this file should refer to the above DOI and cite the following references:
If you do not exclude the cases of the migration samples in your analysis, then please also cite the following reference:
If you do not exclude the cases of the refugee samples in your analysis, please also cite: IAB-BAMF-SOEP survey of refugees (M3-M5), data for the years 2016-2021,
If you use data from the SOEP-LEE2 surveys, please also cite:
If you would like to refer more specifically, please also cite:
For the SOEP-Core data 1984-2020 (v37), waves A to BK, we provide the following editions:
soep.core.v37eu (EU Edition, 100%)
soep.core.v37i (International Scientific Use Version, 95%)
soep.core.v37t (Teaching Edition, 50%)
soep.core.v37at (Add-on: Area types)
soep.core.v37pr (Add-on: Planning regions)
soep.core.v37r (Remote Edition)
soep.core.v37o (Onsite Edition)
For detailed information on the different data editions, see SOEPcompanion.
These datasets are included in SOEP v37, but are also available as individual data sets upon request:
soep.iab-soep-mig.2020 (Migration Sample)
soep.iab-bamf-soep-mig.2020 (Refugee Sample)
New Sample M6
The 2020 boost sample M6 supplements the samples of the IAB-BAMF-SOEP Survey of Refugees by 1,141 households. To recruit these households, a random sample was drawn from the Central Register of Foreigners.
The sample consists of two main groups. The first group consists of persons who entered Germany between January 2013 and the end of December 2016, filed an asylum application, and whose last change of asylum status took place between 2013 and the end of 2016 (refreshment). The second group consists of persons who entered Germany between January 2013 and the end of June 2019, filed an asylum application, and whose last change of asylum status took place between 2017 and the end of June 2019 (enlargement).
New Sample M7
The 2020 boost sample M7 supplements the samples of the IAB-SOEP Migration Survey by 783 households. As with the M1 and M2 samples, register data of the Federal Employment Agency were used as a sampling frame. Information is collected on households with recent migrants from Poland, Romania, and Bulgaria between January 2016 and December 2018.
New Sample M8
The 2020 boost sample M8 supplements the samples of the IAB-SOEP Migration Survey by 1,096 households. Register data of the Federal Employment Agency were used to identify the population of third-country nationals who applied to work in Germany as professionals ("Fachkräfte") based on the Residence Act (Zuwanderungsgesetz) and were granted a permit between January 2019 and January 2020. The sample also provides a basis for evaluating the "Fachkräfteeinwanderungsgesetz", which came into effect on March 1, 2020.
Dataset COV, COV_BRUTTO, COV_CONTACT - Datasets of SOEP-CoV study
All three datasets are associated with the 9 tranches of the SOEP-CoV study in 2020, the SOEP-CoV wave in 2021, and the COVID-19 special interviews in 2020 from the IAB-BAMF-SOEP Survey of Refugees in Germany. More information about the project can be found online on the SOEP-CoV homepage or in the references below, which we recommend citing if the data are used.
Kühne, S., Kroh, M., Liebig, S. & S. Zinn, (2020): The Need for Household Panel Surveys in Times of Crisis: The Case of SOEP-CoV. In: Survey Research Methods 14(2): 195-203.
Siegers, R., Steinhauer, H.-W. & S. Zinn, (2021): Weighting the SOEP-CoV study 2020. No. 989. SOEP Survey Papers.
Dataset bkbiorki - Dataset of RKI-SOEP study
The dataset addresses the following questions: "How many people have already been infected with the coronavirus, SARS-CoV-2? How many infections have gone undetected?" More information about the project can be found online at Nationwide Antibody Study “Living in Germany - Corona Monitoring” (RKI-SOEP).
bkbiorki contains results of PCR and DBS tests, as well as survey content. Data is available on request.
With v34, we introduced a new directory structure by merging our formerly independently delivered data formats SOEP-wide and SOEP-long. In the top-level (or root) directory, you find all "SOEPlong datasets" (pl, ppfadl, hl, pgen, hgen, etc.) as well as all of the biographical or spell datasets (bioparen, artkalen, etc.).
The raw directory provides the datasets in their original wide and cross-sectional SOEP format. What's new is that we offer identifiers identical to the names in the long data (PERSNR becomes PID, $HHNRAKT becomes HID) and an additional survey year variable (SYEAR), so that users can easily merge variables from both data formats.
In order to ensure consistency of data and also to not alienate new users, these traditional "old" ID variables (PERSNR, HHNR, $HHNR, HHNRAKT) will no longer be delivered starting this year. Please use the new identifiers (PID, HID, CID, SYEAR).
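As a rough illustration of how the shared identifiers allow variables from both formats to be combined, the following sketch joins two small sets of records on PID and SYEAR. Only the identifier names PID, HID, and SYEAR come from the text above. The record values, the variables `long_var` and `raw_var`, and the helper `merge_on_keys` are hypothetical, and a real analysis would normally rely on a statistics package rather than plain Python.

```python
# Hypothetical records: values and the variables long_var / raw_var are
# placeholders for illustration, not actual SOEP content.
long_rows = [
    {"PID": 101, "HID": 11, "SYEAR": 2019, "long_var": 7},
    {"PID": 101, "HID": 11, "SYEAR": 2020, "long_var": 8},
    {"PID": 202, "HID": 22, "SYEAR": 2020, "long_var": 5},
]
raw_rows = [
    {"PID": 101, "HID": 11, "SYEAR": 2020, "raw_var": 1},
    {"PID": 202, "HID": 22, "SYEAR": 2020, "raw_var": 2},
]


def merge_on_keys(left, right, keys=("PID", "SYEAR")):
    """Left-join two lists of records on the shared identifier columns."""
    # Index the right-hand records by their key tuple for fast lookup.
    index = {tuple(r[k] for k in keys): r for r in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        # Values from the left record win if a column exists in both.
        merged.append({**match, **row})
    return merged


merged = merge_on_keys(long_rows, raw_rows)
for row in merged:
    print(row)
```

Rows from the long data without a counterpart in the raw data (here the 2019 record) are kept and simply lack the raw variable.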
Dataset PBRUTTO
Dataset BIOL
Dataset PL
Dataset JUGENDL
Dataset HBRUTTO
Dataset HBRUTT
Dataset KIDLONG
Dataset BIOBIRTH
BIOBIRTH now contains one row for each person who has ever lived in a SOEP household and therefore represents the population of PPFAD. In v36, by contrast, BIOBIRTH provided fertility information on every woman and man who had ever provided at least one successful SOEP Biography interview.
Precise identification of individuals and their information quality, as well as the respondent's status, is possible via the variable biovalid and the new variable bioinfo. These variables indicate whether individual-level information is based merely on information derived from household composition and family relations, or on biographical questionnaire data. This should help data users to better assess how trustworthy a piece of information is. Birth information of persons without a completed SOEP Biography interview can only be inferred and is estimated via household composition and family relations. The variables kidpnr[nn], kidgeb[nn], kidmon[nn], and kidsex[nn] were extended from a maximum of 15 possible children to 19 possible children. The information in kidmon[nn] is based on the slightly revised, newly generated variable kidmon[nn] from PPFAD.
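To select the extended set of child variables programmatically, the column names can be generated rather than typed out. The sketch below assumes that [nn] stands for a two-digit, zero-padded index starting at 01. That naming convention is an assumption and should be checked against the delivered dataset. The helper `child_columns` is hypothetical.

```python
STEMS = ("kidpnr", "kidgeb", "kidmon", "kidsex")
MAX_CHILDREN = 19  # raised from 15 in this release


def child_columns(stem, max_children=MAX_CHILDREN):
    """Build the per-child variable names for one stem.

    Assumption: [nn] is a two-digit, zero-padded index starting at 01.
    """
    return [f"{stem}{n:02d}" for n in range(1, max_children + 1)]


columns = {stem: child_columns(stem) for stem in STEMS}
print(columns["kidpnr"][:3], "...", columns["kidpnr"][-1])
```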
Dataset HGEN
New variable hgeqpfire introduced. The variable indicates whether a household has a fireplace or ceramic tiled stove.
Dataset PFLEGE
Two variables are no longer part of the file PFLEGE: MULTGRAD and WERPFLGT. Instead, 6 new dummy variables were included. These new variables describe by whom care is provided for a person in need of care in a household:
Dataset PEQUIV
The dataset PEQUIV contains two new variables. These are:
Dataset INTERVIEWER
Variable educ_i (surveyed education of interviewer) was recoded and incorrect value labels were corrected.
old (wrong) values:
[1] Secondary School Degree - Sekundarschulabschluss
[2] Intermediate School Degree - Mittlerer Schulabschluss
[3] Upper Secondary School Degree - Abschluss der Sekundarstufe II
[4] Left university without degree - Hochschule ohne Abschluss verlassen
[5] Graduate degree - Hochschulabschluss
new (correct) values:
[1] No School Degree - Ohne Abschluss
[2] Secondary School Degree (GDR: 8th grade) - Hauptschulabschluss (DDR: 8. Klasse)
[3] Intermediate School Degree (GDR: 10th grade) - Mittlere Reife (DDR: 10. Klasse)
[4] Technical School Degree - Fachhochschulreife
[5] Upper Secondary Degree - Abitur/Hochschulreife
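For users who attach value labels themselves, the corrected coding of educ_i can be kept as a simple lookup table. The sketch below copies the English labels from the list of new values above. The helper `label_educ_i` and the example codes are hypothetical, and codes outside 1 to 5 (for instance missing-value codes) are deliberately left unlabeled.

```python
# Corrected English value labels for educ_i, copied from the list above.
EDUC_I_LABELS = {
    1: "No School Degree",
    2: "Secondary School Degree (GDR: 8th grade)",
    3: "Intermediate School Degree (GDR: 10th grade)",
    4: "Technical School Degree",
    5: "Upper Secondary Degree",
}


def label_educ_i(code):
    """Return the corrected label, or None for codes not listed above."""
    return EDUC_I_LABELS.get(code)


codes = [1, 5, 3, -1]  # hypothetical example values
labels = [label_educ_i(c) for c in codes]
print(labels)
```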
The following datasets are still at the v36 level and have not been updated. We will update them as far as possible with the next release of the data:
Dataset | Description |
migspell | Migration History |
refugspell | Migration History for Refugees |
biojob | First and Last Job |
bioedu | Educational History |
bioparen | SES of Parents |
biosib | Sibling Information |
biotwin | Twin Information |
SOEP-CoV (Wave 1, Tranches 2-3) 2020: Var-de
SOEP-CoV (Wave 1, Tranche 4) 2020: Var-de
SOEP-CoV (Wave 1, Tranches 5-6) 2020: Var-de
SOEP-CoV (Wave 1, Tranches 7-8) 2020: Var-de
SOEP-CoV (Wave 1, Tranche 9) 2020: Var-de
Individual (PAPI) 2020: Field-de Field-en
Individual (CAPI) 2020: Var-de
Household (PAPI) 2020: Field-de Field-en
Household (CAPI) 2020: Var-de
Biography (PAPI) 2020: Field-de Field-en
Biography (CAPI) 2020: Var-de
Catch-up Individual (PAPI) 2020: Field-de
Corona 2020: Var-de
Catch-up Individual (CAPI) 2020: Var-de
Youth (16-17-year-olds, PAPI) 2020: Field-de
Youth (16-17-year-olds, CAPI) 2020: Var-de
Early Youth (13-14-year-olds, PAPI) 2020: Field-de
Early Youth (13-14-year-olds, CAPI) 2020: Var-de
Pre-teen (11-12-year-olds, PAPI) 2020: Field-de
Pre-teen (11-12-year-olds, CAPI) 2020: Var-de
Mother and Child (Newborns, PAPI) 2020: Field-de
Mother and Child (Newborns, CAPI) 2020: Var-de
Mother and Child (2-3-year-olds, PAPI) 2020: Field-de
Mother and Child (2-3-year-olds, CAPI) 2020: Var-de
Mother and Child (5-6-year-olds, PAPI) 2020: Field-de
Mother and Child (5-6-year-olds, CAPI) 2020: Var-de
Parents and Child (7-8-year-olds, PAPI) 2020: Field-de
Parents and Child (7-8-year-olds, CAPI) 2020: Var-de
Mother and Child (9-10-year-olds, PAPI) 2020: Field-de
Mother and Child (9-10-year-olds, CAPI) 2020: Var-de
Deceased Individual (PAPI) 2020: Field-de
Deceased Individual (CAPI) 2020: Var-de
Corona 2020 Round 2: Var-en
Corona 2020 Round 4: Var-en
Corona 2020 Round 5: Var-en
Corona 2020 Round 7: Var-en
Corona 2020 Round 9: Var-en
All sample-specific questionnaires for this year and all questionnaires from previous years can be found on this site.
1) Weighting the SOEP-CoV Study 2020
5) SOEP-Core – 2020: Sampling, Nonresponse, and Weighting in the IAB-SOEP Migration Studies M7 and M8
7) SOEP-Core v37 – PPATHL: Person-Related Meta-Dataset
8) SOEP-Core v37 – HPATHL: Household-Related Meta-Dataset
9) SOEP-Core v37 – PBRUTTO: Person-Related Gross File
10) SOEP-Core v37 – HBRUTTO: Household-Related Gross File
11) SOEP-Core v37 – PGEN: Person-Related Status and Generated Variables
12) SOEP-Core v37 – HGEN: Household-Related Status and Generated Variables
14) SOEP-Core v37 – BIOIMMIG: Generated Information on Immigration History
1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008
2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents
3) The Request for Record Linkage in the IAB-SOEP Migration Sample
5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK
6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014
7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32
9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel
10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten
11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version
12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing
13) SOEP Scales Manual (updated for SOEP-Core v32.1)
17) Multi-Itemskalen im SOEP Jugendfragebogen
20) SOEP-CoV: Project and Data Documentation
22) SOEP 2013 – Documentation of Generated Person-Level Long-Term Care Variables in PFLEGE
23) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables
26) SOEP-Core v36: Codebook for the EU-SILC-like panel for Germany based on the SOEP
All documentation for filtering can be found on this page