Data distribution 1984-2016 (soep.v33.1)

Title: Socio-Economic Panel (SOEP), data from 1984-2016

DOI: 10.5684/soep.v33.1
Collection period: 1984-2016
Publication date: 2018-01-30
Principal investigators: Jürgen Schupp, Jan Goebel, Martin Kroh, Carsten Schröder, Charlotte Bartels, Klaudia Erhardt, Alexandra Fedorets, Andreas Franken, Marco Giesselmann, Markus Grabka, Peter Krause, Hannes Kröger, Simon Kühne, Maria Metzing, Jana Nebelin, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig

You can see the complete information by clicking on the DOI of the original data set.

Data set information:

 Number of units 126,151
 Number of variables 72,709 in 439 data sets
 Data format STATA, SPSS, SAS, CSV

MD5 fingerprints

Distribution format zip file
all files
Stata bilingual dfe399ba3879874dbdd0096b58cbd90f   | TXT, 19.29 KB
Stata German 9cbe419645ee17bdb5265df5a5662802   | TXT, 19.29 KB
Stata English 3a195c128e21b732d8b1f0ff64316b35   | TXT, 19.29 KB
SPSS German 33763b1f68c54f790d9826b4923ac276   | TXT, 19.29 KB
SPSS English 0454017269b9f5601d3fe30ace13211f   | TXT, 19.29 KB
SAS German 84c5124b696a552340b1d7bca79c8c15   | TXT, 21.53 KB
SAS English e6cb205a9d2abec3a37872f1dbf2a6e8   | TXT, 21.53 KB
CSV df524ba26e46b42ff77dd6991046485d   | TXT, 19.29 KB
GGKBOU 1fd60d2f3f1a405d508cf472ff916cc9   | TXT, 140 Byte
GGKBOU English 67c43e2e72aab736e6c6dafb75da57f5   | TXT, 140 Byte
teaching versions
Stata German 3ecf547c653dfac561cb618c306972c8
Stata English 598ba143e4d7115fcc183dd1517af0d1
SPSS German 0f6ffcfcbdf0982afe48582603e20f97
SPSS English 96adf7fef897ddb346253598a9e93242
SAS German 16a66eacf4032b2ba8fe55f5e242bc3f
SAS English d921f61ee31459a4b54ea74d0dda9d10
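The fingerprints above can be compared against a downloaded archive before unpacking it. The following is a minimal sketch, assuming the listed values are plain MD5 hex digests of the zip files; the function names and the file name in the usage note are hypothetical and not part of the SOEP distribution.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path, expected):
    """Compare a file's MD5 digest with a published fingerprint."""
    # Published fingerprints may carry stray whitespace or upper-case letters.
    return md5_of_file(path) == expected.strip().lower()
```

For example, `matches_fingerprint("soep_stata_en.zip", "3a195c128e21b732d8b1f0ff64316b35")` would check a (hypothetically named) download of the English Stata distribution against the value listed above. A mismatch usually points to an incomplete or corrupted download.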


  • Gert G. Wagner, Joachim R. Frick, and Jürgen Schupp (2007): The German Socio-Economic Panel Study (SOEP) - Scope, Evolution and Enhancements, Schmollers Jahrbuch (Journal of Applied Social Science Studies) 127 (1), 139-169.
  • Jürgen Schupp (2009): 25 Jahre Sozio-oekonomisches Panel - Ein Infrastrukturprojekt der empirischen Sozial- und Wirtschaftsforschung in Deutschland, Zeitschrift für Soziologie 38 (5), 350-357.
  • Gert G. Wagner, Jan Göbel, Peter Krause, Rainer Pischner, and Ingo Sieber (2008): Das Sozio-oekonomische Panel (SOEP): Multidisziplinäres Haushaltspanel und Kohortenstudie für Deutschland - Eine Einführung (für neue Datennutzer) mit einem Ausblick (für erfahrene Anwender), AStA Wirtschafts- und Sozialstatistisches Archiv 2 (4), 301-328.

Update information

Please also note the known issues of this version and their fixes.

1 Deletion of incorrectly conducted interviews in the IAB-BAMF-SOEP Survey of Refugees

While preparing the next wave of the IAB-BAMF-SOEP Survey of Refugees, the survey institute determined that one interviewer had not conducted interviews correctly, affecting six percent of the household interviews in the sample. These households were removed from the dataset but remain available on request for survey-methodological analysis at a guest workstation at the SOEP Research Data Center. In addition to deleting the corresponding rows from all affected datasets, we made the following modifications:

  • Due to the deletion of household and individual interviews, the weights (datasets HHRF and PHRF) had to be updated to take the slightly reduced number of cases in the 2016 survey year into account.
  • The new weights were updated or included in the dataset BGPEQUIV.
  • Imputation of monthly household net income (I[1-5]HINC16) was redone for this sample in BGHGEN and in the dataset MIHINC.

2 Update INTID in BG files

Datasets from the current BG wave contained errors in the assignment of interviewer IDs. These were corrected.

3 Corrected number of entries in $$KIND (2014-2016)

Inconsistencies between key variables on population assignment in the PPFAD and $$KIND datasets were corrected. There was an error of one year in the definition of the target population in the $$KIND datasets from 2014 to 2016. In some cases, this led to a lack of information on the year of birth in files on children:

    • bekgjahr: 1998 for all samples
    • bfkgjahr: 1999 for all samples
    • bgkgjahr: 1999 only for samples M3 and M4 in 2016

These corrections also affect the number of cases in the file KIDLONG, which was corrected correspondingly.
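As an illustration of the naming scheme used in this section, the sketch below expands the `$$` placeholder into concrete dataset names for the affected years. It assumes that `$$` stands for the two-letter wave prefix and that 2014, 2015, and 2016 correspond to BE, BF, and BG, which is consistent with the variable names bekgjahr, bfkgjahr, and bgkgjahr listed above; the helper name is hypothetical.

```python
# Assumed mapping from survey year to wave prefix, inferred from the
# variable names bekgjahr, bfkgjahr, and bgkgjahr in this section.
WAVE_PREFIX = {2014: "BE", 2015: "BF", 2016: "BG"}


def expand_placeholder(pattern, years):
    """Replace the '$$' wave placeholder with each year's prefix."""
    return [pattern.replace("$$", WAVE_PREFIX[year]) for year in years]


kind_files = expand_placeholder("$$KIND", range(2014, 2017))
print(kind_files)  # ['BEKIND', 'BFKIND', 'BGKIND']
```

The same helper applies to other placeholder names in these notes, such as `$$NETTO`.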

3.1 Change in the $$NETTO codes in 96 cases (children) in the years 2014-2016

During data checks, the $$NETTO codes in PPFAD were also compared and corrected. In survey years 2014 to 2016, some children had been incorrectly assigned the code 20 instead of 30 on the variable $$NETTO in the PPFAD dataset. This error has been corrected in v33.1. The update also made it necessary to correct the person weights in the affected survey years (dataset PHRF), because the variable $$NETTO determines which individuals in interviewed households are assigned a valid weight. The updated weights are also contained in v33.1.


In BIOPAREN, a number of missing values in the flag variables for parental (professional) education and in the parents' years of death were filled in.


The algorithm for imputing missing dates in the spells was optimized. As a result, the imputed variables and the variables derived from them changed in v33.1, specifically all variables with the suffix _imp and the variable staytime. The changes affected a total of 349 of 15,640 spells.

6 Update AUSB16 in BGPGEN

The variable AUSB16 (“profession requires vocational training”) in BGPGEN was updated. The correction substantially decreased the number of missing values coded [-1].



The SOEP micro data which we make available for scientific research can only be interpreted using statistical software. Direct use of SOEP data is subject to the high standards for lawful data protection in the Federal Republic of Germany. Signing a contract on data distribution with the DIW Berlin is therefore a precondition for working with SOEP data. After signing the contract, the data of every new wave will be available on request. 

Online documentation and updates

Version specific changes in the dataset

Documentation of
Sample Sizes and Panel Attrition 1984-2016 | PDF, 1.43 MB

$HGEN | PDF, 199.96 KB (Household-related Status and Generated Variables)
$PGEN | PDF, 258.92 KB (Person-related Status and Generated Variables)

HPFAD | PDF, 107.66 KB (Household-related Meta-dataset)
PPFAD | PDF, 174.07 KB (Person-related Meta-dataset)

HEALTH | PDF, 122.32 KB

$HBRUTTO | PDF, 138.82 KB
$PBRUTTO | PDF, 152.47 KB

$PEQUIV | PDF, 1.23 MB

Overview SOEPlong

SOEPlong documentation | XLSX, 18.61 MB