
SOEP-Core v37eu (Data 1984-2020, EU Edition)

The Socio-Economic Panel (SOEP) is a representative longitudinal survey that has been running since 1984. On behalf of DIW Berlin, our survey institute interviews people from households across Germany every year. The data provide information on income, employment, education, and health. Because the same people are interviewed every year, long-term social and societal trends can be tracked particularly well. To capture societal change adequately, new samples are added from time to time and the survey program is adjusted.

Dataset Information

Title: Socio-Economic Panel (SOEP), data for years 1984-2020 (SOEP-Core, v37, EU Edition)

DOI: 10.5684/soep.core.v37eu
Collection period: 1984-2020
Publication date: April 8, 2022
Principal investigators: Stefan Liebig, Jan Goebel, Markus Grabka, Carsten Schröder, Sabine Zinn, Charlotte Bartels, Andreas Franken, Martin Gerike, Sascha-Christopher Geschke, Florian Griese, Selin Kara, Johannes König, Peter Krause, Hannes Kröger, Elisabeth Liebau, Jana Nebelin, Marvin Petrenz, David Richter, Jürgen Schupp, Rainer Siegers, Hans Walter Steinhauer, Knut Wenzig, Stefan Zimmermann

Data collection: Kantar Deutschland GmbH

Population: Persons in private households in the Federal Republic of Germany

Number of households: 20.1847 (source: SOEP Wave Report)

Number of persons: 32,022 + 3,041 children (source: SOEP Wave Report)

Special samples: GDR citizens (1990), immigration/migration (1994/95, 2013, 2015, 2020), refugees (since 2016). A detailed description of all samples is available in the SOEPcompanion under SOEP-Samples in Detail.

Sampling procedure: All SOEP samples are drawn using multi-stage, regionally clustered sampling. The respondents (households) are selected by random walk or from a register sample.

Survey method: SOEP data collection is based on a set of questionnaires for households and for individuals. In principle, the interviewer attempts to conduct face-to-face interviews with all household members who turn 12 in the survey year or are older. In addition, one person (the head of household) is asked to complete a household questionnaire. It covers the housing situation and housing costs, various sources of income, and children under 12 living in the household (e.g., attendance at kindergarten, elementary school, etc.).

Data citation: Socio-Economic Panel (SOEP), version 37, data for years 1984-2020 (SOEP-Core v37, EU Edition). 2022. DOI: 10.5684/soep.core.v37eu

If you do not exclude the cases from the migration samples in your analysis, please also cite:
IAB-SOEP Migration Samples (M1, M2), data for years 2013-2020, DOI: 10.5684/soep.iab-soep-mig.2020

If you do not exclude the cases from the refugee samples in your analysis, please also cite: IAB-BAMF-SOEP Survey of Refugees (M3-M5), data for years 2016-2020, DOI: 10.5684/soep.iab-bamf-soep-mig.2020

Publications using these data should refer to the DOI given above and cite one of the following references:

  • Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)
  • Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
  • Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)

The following datasets are available for the SOEP-Core data 1984-2020 (v37), waves A to BK:

soep.core.v37eu (EU Edition, 100%)

soep.core.v37i (International Scientific Use Version, 95%)

soep.core.v37t (Teaching Edition, 50%)

soep.core.v37at (Add-on: Area types)

soep.core.v37pr (Add-on: Planning regions)

soep.core.v37r (Remote Edition)

soep.core.v37o (Onsite Edition)

Detailed information on all editions can be found in the SOEPcompanion.

Fully included in the current data release, and also available as individual datasets on special request:

soep.iab-soep-mig.2020 (migration samples)

soep.iab-bamf-soep-mig.2020 (refugee samples)

New samples in the main SOEP study

New Sample M6

The 2020 boost sample M6 supplements the samples of the IAB-BAMF-SOEP Survey of Refugees by 1,141 households. To recruit these households, a random sample was drawn from the Central Register of Foreigners.
The sample consists of two main groups. The first group (refreshment) consists of persons who entered Germany between January 2013 and the end of December 2016, filed an asylum application, and whose last change of asylum status took place between 2013 and the end of 2016. The second group (enlargement) consists of persons who entered Germany between January 2013 and the end of June 2019, filed an asylum application, and whose last change of asylum status took place between 2017 and the end of June 2019.
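
The two date windows overlap on the entry date, so the date of the last status change is what separates the groups. The following is a minimal sketch of that rule, not the sampling code used for M6. The function name, constant names, and return labels are hypothetical, the windows are read as full calendar periods, and the condition that an asylum application was filed is assumed to hold for every input.

```python
from datetime import date
from typing import Optional

# Date windows as described in the text; all names here are hypothetical.
ENTRY_START = date(2013, 1, 1)
REFRESH_END = date(2016, 12, 31)           # end of December 2016
ENLARGE_STATUS_START = date(2017, 1, 1)    # status changes from 2017 onward
ENLARGE_END = date(2019, 6, 30)            # end of June 2019


def m6_group(entry: date, last_status_change: date) -> Optional[str]:
    """Return 'refreshment', 'enlargement', or None if neither window fits."""
    if (ENTRY_START <= entry <= REFRESH_END
            and ENTRY_START <= last_status_change <= REFRESH_END):
        return "refreshment"
    if (ENTRY_START <= entry <= ENLARGE_END
            and ENLARGE_STATUS_START <= last_status_change <= ENLARGE_END):
        return "enlargement"
    return None


# A person who entered in 2015 falls into either group,
# depending on when the asylum status last changed.
print(m6_group(date(2015, 5, 1), date(2016, 3, 1)))
print(m6_group(date(2015, 5, 1), date(2018, 3, 1)))
```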

New Sample M7

The 2020 boost sample M7 supplements the samples of the IAB-SOEP Migration Survey by 783 households. As with the M1 and M2 samples, register data of the Federal Employment Agency was used as a sampling frame. Information is collected on households with recent migrants from Poland, Romania, and Bulgaria who migrated between January 2016 and December 2018.

New Sample M8

The 2020 boost sample M8 supplements the samples of the IAB-SOEP Migration Survey by 1,096 households. Register data of the Federal Employment Agency was used to identify the population of third-country nationals who applied to work in Germany as professionals ("Fachkräfte") based on the Residence Act (Zuwanderungsgesetz) and were granted permission between January 2019 and January 2020. The sample also provides a basis for evaluating the "Fachkräfteeinwanderungsgesetz", which came into effect on March 1, 2020.

New datasets or variables

Dataset COV, COV_BRUTTO, COV_CONTACT - Datasets of SOEP-CoV study

All three datasets are associated with the 9 tranches of SOEP-CoV in 2020, the SOEP-CoV wave in 2021, and the COVID-19 special interviews conducted in 2020 as part of the IAB-BAMF-SOEP Survey of Refugees in Germany. More information about the project can be found online on the SOEP-CoV homepage or in the references below, which we also recommend citing if the data are used.

Kühne, S., Kroh, M., Liebig, S. & S. Zinn, (2020): The Need for Household Panel Surveys in Times of Crisis: The Case of SOEP-CoV. In: Survey Research Methods 14(2): 195-203.

Siegers, R., Steinhauer, H.-W. & S. Zinn, (2021): Weighting the SOEP-CoV study 2020. No. 989. SOEP Survey Papers.

  • COV contains the survey content
  • COV_BRUTTO contains the gross sample (brutto) information
  • COV_CONTACT contains the contact information

Dataset bkbiorki - Dataset of RKI-SOEP study

The dataset contains information about: "How many people have already been infected with the coronavirus, SARS-CoV-2? How many infections have gone undetected?" More information about the project can be found online at Nationwide Antibody Study “Living in Germany - Corona Monitoring” (RKI-SOEP).

bkbiorki contains results of PCR and DBS tests, as well as survey content. Data is available on request.

Old ID Variables no longer distributed

With v34, we introduced a new directory structure by merging our formerly independently delivered data formats, SOEP-wide and SOEP-long. In the top-level (or root) directory, you will find all "SOEPlong datasets" (pl, ppfadl, hl, pgen, hgen, etc.) as well as all of the biographical or spell datasets (bioparen, artkalen, etc.).

The raw directory provides the datasets in their original wide and cross-sectional SOEP format. What's new is that we offer identifiers identical to the names in the long data (PERSNR to PID or $HHNRAKT to HID) and an additional variable for the survey year (SYEAR), so that users can easily merge variables from both data formats.

To ensure data consistency and to avoid confusing new users, the traditional "old" ID variables (PERSNR, HHNR, $HHNR, HHNRAKT) will no longer be delivered starting this year. Please use the new identifiers (PID, HID, CID, SYEAR).
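
Because both formats now share PID and SYEAR, a variable from a raw dataset can be attached to long-format rows with an ordinary key join. The sketch below illustrates this with toy records in plain Python. Only the identifier names pid, hid, and syear come from the text. The variables long_var and raw_var, the values, and the helper name are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

# Toy stand-ins for rows of a long-format dataset and a raw dataset.
# Only pid, hid and syear are taken from the text; the rest is hypothetical.
long_rows: List[Dict[str, Any]] = [
    {"pid": 101, "hid": 11, "syear": 2019, "long_var": 7},
    {"pid": 101, "hid": 11, "syear": 2020, "long_var": 8},
    {"pid": 202, "hid": 22, "syear": 2020, "long_var": 5},
]
raw_rows: List[Dict[str, Any]] = [
    {"pid": 101, "hid": 11, "syear": 2020, "raw_var": 1},
    {"pid": 202, "hid": 22, "syear": 2020, "raw_var": 2},
]


def left_join_on_pid_syear(left, right, var):
    """Attach `var` from `right` to every row of `left`, matching on (pid, syear)."""
    lookup: Dict[Tuple[int, int], Any] = {
        (r["pid"], r["syear"]): r[var] for r in right
    }
    # Rows without a match keep the key but get None for the added variable.
    return [{**row, var: lookup.get((row["pid"], row["syear"]))} for row in left]


merged = left_join_on_pid_syear(long_rows, raw_rows, "raw_var")
for row in merged:
    print(row)
```

With pandas, the same idea would be a left merge of the two data frames on the columns pid and syear.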

Changes in our new main data format, SOEPlong

Dataset PBRUTTO

Dataset BIOL

  • CAMCES variables were removed from biol
  • Recognition of educational qualifications loops have been revised
  • Migration to Germany loops have been revised
  • In 1984-1995, the "Current Household Number (hid)" was used when generating the "Original Household Number, Case ID (cid)". Now the "Original Household Number, Case ID (cid)" is used as the information source for the long version.

Dataset PL

  • Variables for short-time work, further training/retraining newly added and versioned
  • Multiple integrations, versioning and harmonizations for variables on self-employment
  • Multiple integrations for Place of residence loops
  • Multiple integrations of M1-Sample cases for the survey year 2014
  • Multiple integrations for questions on balance sheet assets
  • New variables added (digitization labor market)

Dataset JUGENDL

  • Integration of Grades
  • Integration of School Belonging Scale PISA

Dataset HBRUTTO

  • New variables for sample selection

Dataset HBRUTT

  • New variables for sample selection
  • Versioning of some living environment variables and incentivization

Dataset KIDLONG

  • Versioning of variables on language support

Dataset BIOBIRTH

BIOBIRTH now contains one row for each person who has ever lived in a SOEP household and therefore represents the population of PPFAD. In v36, by contrast, BIOBIRTH provided fertility information only for women and men who had completed at least one successful SOEP biography interview.

Individuals, the quality of their information, and the respondent's status can be identified precisely via the variable biovalid and the new variable bioinfo. These variables indicate whether individual-level information is based merely on household composition and family relations or on biographical questionnaire data. This should help data users assess how trustworthy a piece of information is. Birth information for persons without a completed SOEP biography interview can only be inferred; it is estimated from household composition and family relations. The variables kidpnr[nn], kidgeb[nn], kidmon[nn], and kidsex[nn] were extended from a maximum of 15 possible children to 19 possible children. The information in kidmon[nn] is based on the newly generated, slightly changed variable kidmon[nn] from PPFAD.
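
Since the child variables are stored side by side in up to 19 numbered slots, analyses by child often start by reshaping one wide row into one record per child. The following is a minimal sketch of that step. It assumes that [nn] stands for a zero-padded two-digit suffix (kidpnr01 to kidpnr19), that the row carries a pid, and that unused slots are simply absent. The actual files may use missing-value codes instead, so the skip condition would need adjusting. The function name and the toy values are hypothetical.

```python
from typing import Any, Dict, List


def children_long(row: Dict[str, Any], max_children: int = 19) -> List[Dict[str, Any]]:
    """Turn one wide BIOBIRTH-style row into one record per child."""
    children = []
    for n in range(1, max_children + 1):
        suffix = f"{n:02d}"  # assumed zero-padded suffix: 01 ... 19
        kid_id = row.get(f"kidpnr{suffix}")
        if kid_id is None:   # toy convention: an unused slot is absent
            continue
        children.append({
            "pid": row["pid"],
            "child_no": n,
            "kidpnr": kid_id,
            "kidgeb": row.get(f"kidgeb{suffix}"),
            "kidmon": row.get(f"kidmon{suffix}"),
            "kidsex": row.get(f"kidsex{suffix}"),
        })
    return children


# Toy row with two children; the second has no birth month or sex recorded.
example = {
    "pid": 1,
    "kidpnr01": 5, "kidgeb01": 2001, "kidmon01": 3, "kidsex01": 1,
    "kidpnr02": 6, "kidgeb02": 2004,
}
for child in children_long(example):
    print(child)
```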

Dataset HGEN

New variable hgeqpfire introduced. The variable indicates whether a household has a fireplace or ceramic tiled stove.

Dataset PFLEGE

Two variables are no longer part of the file PFLEGE: MULTGRAD and WERPFLGT. Instead, 6 new dummy variables were included. These new variables describe by whom care is provided for a person in need of care in a household:

  • WERPFLGT1 “Public, church nurse, social worker”
  • WERPFLGT2 “Who cares: friends, neighbors”
  • WERPFLGT3 “Who cares: relatives not in the HH”
  • WERPFLGT4 “Who cares: relatives in the HH”
  • WERPFLGT5 “Who cares: private care service”
  • WERPFLGT6 “Who cares: other”
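
Splitting the former single variable into six dummies allows several care providers to be recorded for the same household. The sketch below shows one way to read the dummies back into a list of providers. It assumes that a value of 1 means "applies" and that any other value means "does not apply or missing", which should be checked against the codebook. The labels are copied from the list above with the "Who cares:" prefix dropped, and the function name is hypothetical.

```python
from typing import Any, Dict, List

# Labels copied from the variable list above ("Who cares:" prefix dropped).
CARE_LABELS = {
    "WERPFLGT1": "Public, church nurse, social worker",
    "WERPFLGT2": "friends, neighbors",
    "WERPFLGT3": "relatives not in the HH",
    "WERPFLGT4": "relatives in the HH",
    "WERPFLGT5": "private care service",
    "WERPFLGT6": "other",
}


def care_providers(row: Dict[str, Any]) -> List[str]:
    """List every provider whose dummy equals 1 (assumed to mean 'applies')."""
    return [label for var, label in CARE_LABELS.items() if row.get(var) == 1]


# Toy household: care by relatives in the household and a private service.
household = {"WERPFLGT1": 0, "WERPFLGT4": 1, "WERPFLGT5": 1, "WERPFLGT6": 0}
print(care_providers(household))
```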

Dataset PEQUIV

The dataset PEQUIV contains two new variables:

  • BAUK$$ “Building subsidy for new property owners”
  • FKAUK$$ “Imputation flag for variable BAUK$$”

Dataset INTERVIEWER

Variable educ_i (surveyed education of interviewer) was recoded and incorrect value labels were corrected.

old (wrong) values:
[1] Secondary School Degree - Sekundarschulabschluss
[2] Intermediate School Degree - Mittlerer Schulabschluss
[3] Upper Secondary School Degree - Abschluss der Sekundarstufe II
[4] Left university without degree - Hochschule ohne Abschluss verlassen
[5] Graduate degree - Hochschulabschluss

new (correct) values:
[1] No School Degree - Ohne Abschluss
[2] Secondary School Degree (GDR: 8th grade) - Hauptschulabschluss (DDR: 8. Klasse)
[3] Intermediate School Degree (GDR: 10th grade) - Mittlere Reife (DDR: 10. Klasse)
[4] Technical School Degree - Fachhochschulreife
[5] Upper Secondary Degree - Abitur/Hochschulreife
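
Analyses that relied on the earlier labels may need to be re-checked, because every code now carries a different meaning. The sketch below lists the old and new English labels side by side. The labels are copied from the two lists above, while the dictionary and function names are hypothetical. It compares labels only and makes no claim about how individual cases were recoded.

```python
from typing import Dict, List, Tuple

# English value labels of educ_i, copied from the two lists above.
OLD_LABELS: Dict[int, str] = {
    1: "Secondary School Degree",
    2: "Intermediate School Degree",
    3: "Upper Secondary School Degree",
    4: "Left university without degree",
    5: "Graduate degree",
}
NEW_LABELS: Dict[int, str] = {
    1: "No School Degree",
    2: "Secondary School Degree (GDR: 8th grade)",
    3: "Intermediate School Degree (GDR: 10th grade)",
    4: "Technical School Degree",
    5: "Upper Secondary Degree",
}


def label_changes() -> List[Tuple[int, str, str]]:
    """Return (code, old label, new label) for every code whose label changed."""
    return [
        (code, OLD_LABELS[code], NEW_LABELS[code])
        for code in sorted(NEW_LABELS)
        if OLD_LABELS[code] != NEW_LABELS[code]
    ]


for code, old, new in label_changes():
    print(f"[{code}] {old} -> {new}")
```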

Outdated versions of datasets

The following datasets are still at the v36 level and have not been updated. We will update them as far as possible with the next release of the data:

Dataset Description
migspell Migration History
refugspell Migration History for Refugees
biojob First and Last Job
bioedu Educational History
bioparen SES of Parents
biosib Sibling Information
biotwin Twins Information


Individual (PAPI) 2020: Field-de Field-en
Household (PAPI) 2020: Field-de Field-en
Household (CAPI) 2020: Var-de
Biography (PAPI) 2020: Field-de Field-en
Biography (CAPI) 2020: Var-de
Catch-up Individual (PAPI) 2020: Field-de
Corona 2020: Var-de
Corona 2020 Tranche 2: Var-de
Corona 2020 Tranche 4: Var-de
Corona 2020 Tranche 5: Var-de
Corona 2020 Tranche 7: Var-de
Corona 2020 Tranche 9: Var-de
Catch-up Individual (CAPI) 2020: Var-de
Youth (16-17-year-olds, PAPI) 2020: Field-de
Youth (16-17-year-olds, CAPI) 2020: Var-de
Early Youth (13-14-year-olds, PAPI) 2020: Field-de
Pre-teen (11-12-year-olds, PAPI) 2020: Field-de
Mother and Child (Newborns, PAPI) 2020: Field-de
Mother and Child (2-3-year-olds, PAPI) 2020: Field-de
Mother and Child (2-3-year-olds, CAPI) 2020: Var-de
Mother and Child (5-6-year-olds, PAPI) 2020: Field-de
Mother and Child (5-6-year-olds, CAPI) 2020: Var-de
Parents and Child (7-8-year-olds, PAPI) 2020: Field-de
Mother and Child (9-10-year-olds, PAPI) 2020: Field-de
Mother and Child (9-10-year-olds, CAPI) 2020: Var-de
Deceased Individual (PAPI) 2020: Field-de

All sample-specific questionnaires for this year and all questionnaires from previous survey years can be found on this page

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

3) The Request for Record Linkage in the IAB-SOEP Migration Sample

4) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

8) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

13) SOEP Scales Manual (updated for SOEP-Core v32.1)

14) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Multi-Itemskalen im SOEP Jugendfragebogen

18) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

19) Documentation of ISCED Generation Based on the CAMCES Tool in the IAB-SOEP Migration Samples M1/M2 and IAB-BAMF-SOEP Survey of Refugees M3/M4 until 2017

20) Dokumentation zum Entwicklungsprozess des Moduls „Einstellungen zu sozialer Ungleichheit“ im SOEP (v38)

21) SOEP-CoV: Project and Data Documentation

22) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

23) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

24) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

25) SOEP-Core v36: Codebook for the EU-SILC-like panel for Germany based on the SOEP

All documentation, with filter options, can be found on this page
