Title: Socio-Economic Panel, data from 1984-2021 (SOEP-Core, v38.1, Onsite Edition - Update)
DOI: 10.5684/soep.core.v38.1o
Collection period: 1984-2021
Publication date: 2023-11-13
Principal investigators: Jan Goebel, Markus M. Grabka, Carsten Schröder, Sabine Zinn, Charlotte Bartels, Matthis Beckmannshagen, Andreas Franken, Martin Gerike, Florian Griese, Christoph Halbmeier, Selin Kara, Peter Krause, Elisabeth Liebau, Jana Nebelin, Marvin Petrenz, Sarah Satilmis, Rainer Siegers, Hans Walter Steinhauer, Felix Süttmann, Knut Wenzig, Stefan Zimmermann
Contributor: infas Institut für angewandte Sozialwissenschaft GmbH (Data Collector)
Population: Persons living in private households in Germany
Special samples: Migration (since 1994/95, 2013, 2015, 2020), Refugees (since 2016). A complete description of all samples can be found under SOEP Samples in Detail.
Sampling: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk or register sample.
Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also includes some questions on children in the household up to 17 years of age, mainly concerning attendance at institutions (kindergarten, elementary school).
Citation of the data set: Socio-Economic Panel, data from 1984-2021 (SOEP-Core, v38.1, Onsite Edition - Update), 2023, doi:10.5684/soep.core.v38.1o
If you don't exclude observations from the Migration Samples in your analysis, please also cite as follows:
IAB-SOEP Migration Samples (M1, M2), data of the years 2013-2021, DOI: 10.5684/soep.iab-soep-mig.2021.1
If you don't exclude observations from the Refugee Samples in your analysis, please also cite as follows:
IAB-BAMF-SOEP Survey of Refugees (M3-M5), data of the years 2016-2021, DOI: 10.5684/soep.iab-bamf-soep-mig.2021.1
Summary: The Onsite Edition is the edition with the highest information level. Users can access additional information about the municipalities, postal codes and geocoded data of the SOEP households or data from microm GmbH on households’ neighborhoods. For more information, see the Editions chapter in SOEPcompanion.
Publications using this file should refer to the above DOI and cite the following references:
If you do not exclude the cases of the migration samples in your analysis, then please also cite the following reference:
If you do not exclude the cases of the refugee samples in your analysis, please also cite: IAB-BAMF-SOEP survey of refugees (M3-M5), data for the years 2016-2021,
If you use data from the SOEP-LEE2 surveys, please also cite:
If you would like to refer more specifically, please also cite:
For the update of the SOEP-Core data 1984-2021 (v38.1), waves A to BL, we provide the following editions:
soep.core.v38.1eu (Update, EU Edition, 100%)
soep.core.v38.1i (Update, International Scientific Use Version, 95%)
soep.core.v38.1t (Update, Teaching Edition, 50%)
soep.core.v38.1at (Update, Add-on: Area types)
soep.core.v38.1pr (Update, Add-on: Planning regions)
soep.core.v38.1r (Update, Remote Edition)
soep.core.v38.1o (Update, Onsite Edition)
For detailed information on the different data editions, see SOEPcompanion.
These datasets are included in SOEP v38.1, but are also available as individual data sets upon request:
soep.iab-soep-mig.2021.1 (Update, Migration Sample)
soep.iab-bamf-soep-mig.2021.1 (Update, Refugee Sample)
Version v38.1 includes the equivalent dataset blpequiv and correspondingly the 2021 data in the long format file pequiv.
Version v38.1 includes the first wave of establishment data from the SOEP-LEE2 project. Data were collected from enterprises whose employees include SOEP-Core respondents and can be linked to SOEP-Core. The collected data are available in the dataset lee2estab. The link between enterprises and the SOEP-Core respondents is included in lee2person. Additional fieldwork data is provided in lee2brutto. Interested researchers can use the EU edition of SOEP-Core, which includes most of the variables collected. More sensitive variables are only available onsite in the Research Data Center of the SOEP. Further information and data documentation can be found on the homepage of the RDC SOEP.
The Institute for Employment Research (IAB), the Research Center of the Federal Office for Migration and Refugees (BAMF-FZ) and the Socio-Economic Panel (SOEP) at the German Institute for Economic Research (DIW Berlin) jointly conduct a representative longitudinal survey of refugees in Germany. In addition to the survey institute, the cooperation partners regularly check the quality of the data. In the process, the IAB noticed that some interviewers had very conspicuous time stamps after a certain point in the survey. These seemed so unrealistic that the cooperation partners agreed to remove the conspicuous interviews, together with their entire households, from the survey data. In addition, individual variables in the sample and method data were adjusted for these individuals and households, and the weights were updated accordingly.
This affects 87 households with 160 person interviews. As the size of the exclusion is only 0.65 percent of the realized household interviews from 2021 onwards, no differences in the analysis results are expected.
The variables containing information on the current occupation have more missing values than necessary. Normally, the information on whether there has been a job change (captured by the variable JOBCH) is used to update the job-related information, especially in years when no questions are asked on this aspect. However, this mechanism did not function properly for these variables when the survey year is 2021. This issue is resolved in version 38.1.
The problem with the spelltyp variable from artkalen.dta (which, in turn, affects the stib variable from pgen.dta) is that several young people end up with spelltyp==6 (retired).
The average number of months in retirement from the "activity calendar" question is very high for samples M3-M6 in 2020, conditional on being younger than 60.
This problem is fixed in v38.1. The cause of the problem was the swapping of possible answers in some translated questionnaires.
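A plausibility check of the kind described above can be sketched as follows. This is a minimal illustration, not the procedure actually used by the SOEP team: the record layout, the field name `age` and the example values are assumptions, while the code `spelltyp == 6` (retired) and the age threshold of 60 are taken from the description above.

```python
RETIRED = 6         # spelltyp code for "retired", as stated in the text above
AGE_THRESHOLD = 60  # age threshold mentioned in the text above


def flag_implausible_retirement(spells):
    """Return all spells coded as 'retired' for persons younger than 60."""
    return [
        s for s in spells
        if s["spelltyp"] == RETIRED and s["age"] < AGE_THRESHOLD
    ]


# Hypothetical example records (not real SOEP data).
spells = [
    {"pid": 1, "age": 24, "spelltyp": 6},  # young and "retired": suspicious
    {"pid": 2, "age": 67, "spelltyp": 6},  # plausible retirement
    {"pid": 3, "age": 30, "spelltyp": 1},  # a different spell type
]

flagged = flag_implausible_retirement(spells)
print([s["pid"] for s in flagged])
```

Running the same check on an older and on the updated release would be one way to see whether the implausible cases have disappeared.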
Dataset pbrutto
Dataset pequiv
Dataset ppathl
Dataset pl
Dataset plueckel
Dataset blcorona
Dataset housing2021
Dataset gkal and lkal
Dataset instrumentation
Dataset vp
Dataset PBRUTTO
Dataset HBRUTTO/HBRUTT
Dataset PL
For large datasets like pl we recommend the use of Stata/MP or Stata/SE on a computer with an internal memory of 16GB.
Users can still work with the data in Stata/IC or on less powerful computers, but to let them work effectively, SOEP offers alternative data formats for pl.
If you wish to order an alternative format for pl (e.g. pl in separate year or decade data sets) because your system requirements are not sufficient, please submit your request via the
[order form](https://www.diw.de/de/diw_01.c.357906.de/soep_bestellformular_mod.html) or contact the SOEP hotline by phone or e-mail.
The time use variables could take the values -2 and 0 in the data, but both values meant "does not apply". All -2 values were therefore recoded to 0, since the questionnaire design expects a 0 for "does not apply".
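The recoding of the time use variables can be illustrated with a short sketch. It is only an assumption about how such a harmonization might look, not the code used to produce the release: the variable names `hours_housework` and `hours_childcare` and the example rows are hypothetical.

```python
def harmonize_not_applicable(rows, time_use_vars):
    """Recode -2 to 0 in the given time use variables; leave all else as is."""
    return [
        {
            name: (0 if name in time_use_vars and value == -2 else value)
            for name, value in row.items()
        }
        for row in rows
    ]


# Hypothetical example rows (not real SOEP data).
rows = [
    {"pid": 1, "hours_housework": -2, "hours_childcare": 0},
    {"pid": 2, "hours_housework": 3, "hours_childcare": -2},
    {"pid": 3, "hours_housework": -1, "hours_childcare": 2},
]

cleaned = harmonize_not_applicable(rows, {"hours_housework", "hours_childcare"})
print(cleaned)
```

Only the value -2 in the listed variables is touched; other values, including other negative codes, stay unchanged.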
Dataset HL
Dataset BIOL
Dataset JUGENDL
imonth, iday, ihour and iminute for 2021 were moved to the INSTRUMENTATION dataset, where they can be found from this version on.
Dataset BIOAGEL
Dataset BIOPUPIL
Dataset KIDLONG
Dataset VPL
The ZIP files now contain a folder soepdata, which in turn contains the folders eu-silc-like-panel and raw. This makes it easier to refer to the folders in the documentation. We sometimes called the soepdata folder the "toplevel folder" or "./", which was less informative for our users.
More than 40 datasets used to be stored both in the former toplevel folder (see above) and in the raw folder. You now find them exclusively in the new `soepdata` folder.
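The folder layout described above can be written down as a small path helper. This is a sketch under assumptions: the extraction directory `/data/soep-core-v38.1` is hypothetical, and only the folder names `soepdata`, `raw` and `eu-silc-like-panel` come from the text.

```python
from pathlib import PurePosixPath


def soep_paths(extract_root):
    """Build the folder paths of an unpacked ZIP file."""
    top = PurePosixPath(extract_root) / "soepdata"
    return {
        "soepdata": top,
        "raw": top / "raw",
        "eu-silc-like-panel": top / "eu-silc-like-panel",
    }


# Hypothetical extraction directory.
paths = soep_paths("/data/soep-core-v38.1")
for name, path in paths.items():
    print(name, path)
```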
Dataset PGEN
Dataset BLP
Dataset PBIOSPE
Dataset ARTKALEN
Dataset CAMCES
Dataset BIOIMMIG
Dataset BIOBIRTH
Dataset PPATHL/PPATH
Dataset BIORESIDREFING
The following data sets are still at the V37 level and have not been updated. We will update them as far as possible with the next release of the data:

| Dataset | Description |
| --- | --- |
| PEQUIV | CNEF Equivalent File |
The following data sets are still at the V36 level and have not been updated. We will update them as far as possible with the next release of the data:

| Dataset | Description |
| --- | --- |
| MIGSPELL | Migration History |
| REFUGSPELL | Migration History for Refugees |
| BIOJOB | First and last Job |
| BIOEDU | Educational History |
| BIOPAREN | SES of Parents |
| BIOSIB | Siblings Information |
| BIOTWIN | Twins Information |
Individual (PAPI) 2021: -de
Individual (techn) 2021: -de
Individual (PAPI) 2021: -en
Individual (techn) 2021: -en
Household (PAPI) 2021: -de
Household (techn) 2021: -de
Household (PAPI) 2021: -en
Household (techn) 2021: -en
Biography (PAPI) 2021: -de
Biography (techn) 2021: -de
Biography (PAPI) 2021: -en
Biography (techn) 2021: -en
Catch-up Individual (PAPI) 2021: -de
Catch-up Individual (techn) 2021: -de
Catch-up Individual (PAPI) 2021: -en
Catch-up Individual (techn) 2021: -en
Youth (16-17-year-olds, PAPI) 2021: -de
Youth (16-17-year-olds, techn) 2021: -de -de
Youth (16-17-year-olds, PAPI) 2021: -en
Early Youth (13-14-year-olds, PAPI) 2021: -de
Early Youth (13-14-year-olds, techn) 2021: -de
Early Youth (13-14-year-olds, PAPI) 2021: -en
Early Youth (13-14-year-olds, techn) 2021: -en
Pre-teen (11-12-year-olds, PAPI) 2021: -de
Pre-teen (11-12-year-olds, techn) 2021: -de
Pre-teen (11-12-year-olds, PAPI) 2021: -en
Pre-teen (11-12-year-olds, techn) 2021: -en
Mother and Child (Newborns, PAPI) 2021: -de
Mother and Child (Newborns, techn) 2021: -de
Mother and Child (Newborns, PAPI) 2021: -en
Mother and Child (Newborns, techn) 2021: -en
Mother and Child (2-3-year-olds, PAPI) 2021: -de
Mother and Child (2-3-year-olds, techn) 2021: -de
Mother and Child (2-3-year-olds, PAPI) 2021: -en
Mother and Child (2-3-year-olds, techn) 2021: -en
Mother and Child (5-6-year-olds, PAPI) 2021: -de
Mother and Child (5-6-year-olds, techn) 2021: -de
Mother and Child (5-6-year-olds, PAPI) 2021: -en
Mother and Child (5-6-year-olds, techn) 2021: -en
Parents and Child (7-8-year-olds, PAPI) 2021: -de
Parents and Child (7-8-year-olds, techn) 2021: -de
Parents and Child (7-8-year-olds, PAPI) 2021: -en
Parents and Child (7-8-year-olds, techn) 2021: -en
Mother and Child (9-10-year-olds, PAPI) 2021: -de
Mother and Child (9-10-year-olds, techn) 2021: -de
Mother and Child (9-10-year-olds, PAPI) 2021: -en
Mother and Child (9-10-year-olds, techn) 2021: -en
Deceased Individual (PAPI) 2021: -de
Deceased Individual (techn) 2021: -de
Please find all sample specific questionnaires of this year and all questionnaires of previous years on this site
3) SOEP-Core v38.1 – INSTRUMENTATION: Information on the Utilization of Questionnaires
4) SOEP-Core v38.1 – PPATHL: Person-Related Meta-Dataset
5) SOEP-Core v38.1 – HPATHL: Household-Related Meta-Dataset
6) SOEP-Core v38 – PBRUTTO: Person-Related Gross File
7) SOEP-Core v38.1 – PBRUTTO: Person-Related Gross File
8) SOEP-Core v38 – HBRUTTO: Household-Related Gross File
9) SOEP-Core v38.1 – HBRUTTO: Household-Related Gross File
10) SOEP-Core v38.1 – PGEN: Person-Related Status and Generated Variables
11) SOEP-Core v38.1 – HGEN: Household-Related Status and Generated Variables
12) SOEP-Core v38.1 – BIOIMMIG: Generated Information on Immigration History
1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008
2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents
3) The Request for Record Linkage in the IAB-SOEP Migration Sample
5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK
6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014
7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32
9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel
10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten
11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version
12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing
13) SOEP Scales Manual (updated for SOEP-Core v32.1)
17) Multi-Itemskalen im SOEP Jugendfragebogen
20) SOEP-CoV: Project and Data Documentation
22) SOEP 2013 – Documentation of Generated Person-Level Long-Term Care Variables in PFLEGE
23) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables
26) SOEP-Core v36: Codebook for the EU-SILC-like panel for Germany based on the SOEP
All documentation for filtering can be found on this page