
SOEP-Core v34 (data 1984-2017)

The German Socio-Economic Panel (SOEP) study is a wide-ranging representative longitudinal study of private households, located at the German Institute for Economic Research, DIW Berlin. Every year, nearly 15,000 households and more than 25,000 persons were sampled by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the Eastern and Western German States, foreigners, and immigrants to Germany. The Panel was started in 1984. Some of the many topics include household composition, occupational biographies, employment, earnings, health and satisfaction indicators. As early as June 1990, even before the Economic, Social and Monetary Union, SOEP expanded to include the states of the former German Democratic Republic (GDR), thus seizing the rare opportunity to observe the transformation of an entire society. Immigrant samples were also added in 1994/95 and 2013/2015 to account for the changes that took place in German society. Two samples of refugees were introduced in 2016. Further new samples were added in 1998, 2000, 2002, 2006, 2009, 2010, 2011, and 2012. The survey is constantly being adapted and developed in response to current social developments. The international version contains 95% of all cases surveyed (see 10.5684/soep.v34i).

Dataset Information

Title: Socio-Economic Panel (SOEP), data from 1984-2017

DOI: 10.5684/soep.v34
Collection period: 1984-2017
Publication date: 2019-03-05
Principal investigators: Stefan Liebig, Jan Goebel, Carsten Schröder, Jürgen Schupp, Charlotte Bartels, Alexandra Fedorets, Andreas Franken, Marco Giesselmann, Markus Grabka, Jannes Jacobsen, Selin Kara, Peter Krause, Hannes Kröger, Martin Kroh, Maria Metzing, Janine Napieraj, Jana Nebelin, David Richter, Diana Schacht, Paul Schmelzer, Christian Schmitt, Daniel Schnitzlein, Rainer Siegers, Knut Wenzig, Stefan Zimmermann

Data collector: Kantar Public Germany

Population: Persons living in private households in Germany.

Selection method: All samples of SOEP are multi-stage random samples which are regionally clustered. The respondents (households) are selected by random-walk or register sample. 

Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. Additionally, one person (head of household) is asked to answer a household-related questionnaire covering information on housing, housing costs, and different sources of income. This questionnaire also covers some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).

Citation of the data set: Socio-Economic Panel (SOEP), data for years 1984-2017, version 34, SOEP, 2019, doi:10.5684/soep.v34.

Publications using this file should refer to the above DOI and cite the following references:

  • Goebel, Jan, Markus M. Grabka, Stefan Liebig, Martin Kroh, David Richter, Carsten Schröder, and Jürgen Schupp. 2019. The German Socio-Economic Panel (SOEP). Jahrbücher für Nationalökonomie und Statistik (Journal of Economics and Statistics) 239 (2), 345-360. (https://doi.org/10.1515/jbnst-2018-0022)

If you do not exclude the cases of the migration samples in your analysis, then please also cite the following reference

  • Herbert Brücker, Martin Kroh, Simone Bartsch, Jan Goebel, Simon Kühne, Elisabeth Liebau, Parvati Trübswetter, Ingrid Tucci & Jürgen Schupp (2014): The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents. SOEP Survey Paper 216 (PDF, 444.25 KB), Series C. Berlin, Nürnberg: DIW Berlin.

If you do not exclude the cases of the refugee samples in your analysis, please also cite: IAB-BAMF-SOEP survey of refugees (M3-M5), data for the years 2016-2021,

  • Herbert Brücker, Nina Rother, Jürgen Schupp. 2017. IAB-BAMF-SOEP Befragung von Geflüchteten 2016. Studiendesign, Feldergebnisse sowie Analysen zu schulischer wie beruflicher Qualifikation, Sprachkenntnissen sowie kognitiven Potenzialen. IAB Forschungsbericht 13/2017.

If you use data from the SOEP-LEE2 surveys, please also cite:

  • Matiaske, W., Schmidt, T. D., Halbmeier, C., Maas, M., Holtmann, D., Schröder, C., Böhm, T., Liebig, S., and Kritikos, A. S. (2023). SOEP-LEE2 : Linking Surveys on Employees to Employers in Germany. Journal of Economics and Statistics Data Observer, 1–14. https://doi.org/10.1515/jbnst-2023-0031.

If you would like to refer more specifically, please also cite:

  • Schröder, Carsten, Johannes König, Alexandra Fedorets, Jan Goebel, Markus M. Grabka, Holger Lüthen, Maria Metzing, Felicitas Schikora, and Stefan Liebig. 2020. The economic research potentials of the German Socio-Economic Panel study. German Economic Review 21 (3), 335-371. (https://doi.org/10.1515/ger-2020-0033)
  • Giesselmann, Marco, Sandra Bohmann, Jan Goebel, Peter Krause, Elisabeth Liebau, David Richter, Diana Schacht, Carsten Schröder, Jürgen Schupp, and Stefan Liebig. 2019. The Individual in Context(s): Research Potentials of the Socio-Economic Panel Study (SOEP) in Sociology. European Sociological Review 35 (5), 738-755. (https://doi.org/10.1093/esr/jcz029)
  • Jacobsen, Jannes, Magdalena Krieger, Felicitas Schikora, and Jürgen Schupp. 2021. Growing Potentials for Migration Research using the German Socio-Economic Panel Study. Jahrbücher für Nationalökonomie und Statistik 241 (4), 527-549. (https://doi.org/10.1515/jbnst-2021-0001)
  • Fedorets, Alexandra, Stefan Kirchner, Jule Adriaans, and Oliver Giering. 2022. Data on Digital Transformation in the German Socio-Economic Panel. Jahrbücher für Nationalökonomie und Statistik 242 (5-6), 691-705. (https://doi.org/10.1515/jbnst-2021-0056)

For the SOEP-Data 1984-2017 (v34) - Wave A to BH - we provide the following versions:

soep.v34

soep.v34i (International Scientific Use Version, 95%)

These datasets are included in SOEP-Core v34 but are also available as individual datasets upon request:

soep.iab-soep-mig.2017 (Migration samples)

soep.iab-bamf-soep-mig.2018 (Refugee samples)

SOEP-Core soep.v34

1. New, user-friendly integrated data format

The new wave of the SOEP-Core study incorporates our “wide” and “long” data formats, which used to be provided to users separately. Our aim is to eliminate any confusion about what is available in which format and to make data use easier overall. After several years of testing SOEPlong as an additional service designed to facilitate analysis for both experienced and new users, we will now be providing all datasets in the “long” format as a standard part of our SOEP data release. This means that you will find the different SOEP data formats listed below in your data file, some of which will be contained in separate subdirectories.

Please make sure that you unpack the entire directory structure when unpacking your data.

1.1. SOEP in “long” format on the top level

In the top-level (or root) directory, you will find all of the datasets provided up to now with SOEPlong (pl, ppfadl, etc.) as well as all of the additional datasets formerly provided only in our classic “wide” format (biographical or spell data such as bioparen, artkalen, etc.). All of the data in the main SOEP-Core study are therefore contained in the datasets in the top-level directory.

Feedback from experienced and beginning users over the past several years shows that the “long” data offer significant advantages in ease of use, particularly for beginners. We have therefore decided to use this as our primary data format in future data releases.

All available individual year-specific datasets are pooled into a single dataset (e.g., all $P datasets are integrated into the PL dataset). In some cases, this means that we have to harmonize variables in order to be able to define them consistently over time. For instance, income information is given in euros up to 2001 and not in deutschmarks, and in cases where questionnaires have changed, the categories are modified over time. All changes are presented to users in a clear and understandable way, and if harmonization is necessary, all input variables are provided in their original form (see below _v*-variables). SOEPlong thus significantly reduces the number of datasets and the number of variables.
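As a rough illustration of this kind of harmonization, the following Stata sketch builds a harmonized variable from two versions on invented data. The variable names inc_v1, inc_v2, and inc_h are hypothetical and only mimic the _v*/_h scheme; the rate of 1.95583 deutschmarks per euro is the official fixed conversion rate. Whether a given SOEP variable is harmonized in exactly this way is an assumption.

```stata
* Toy data (invented): income reported in deutschmarks (inc_v1) until 2001
* and in euros (inc_v2) from 2002 on. All variable names are hypothetical.
clear
input int syear double inc_v1 double inc_v2
2000 3911.66 .
2001 1955.83 .
2002 . 1500
end

* Harmonized variable in euros: take the euro version where available,
* otherwise convert deutschmarks at the fixed rate 1 EUR = 1.95583 DM
generate double inc_h = inc_v2
replace inc_h = inc_v1 / 1.95583 if missing(inc_h) & !missing(inc_v1)

list syear inc_v1 inc_v2 inc_h
```

The input versions stay in the data next to the harmonized variable, so the original values remain available for checks.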

A more detailed description of the format of our SOEP-Core data release can be found in our new SOEPcompanion.

1.1.1. Most important changes to v33 in the long format

  • The following new files have been added:
    • HBRUTT: long file of the HBRUTT$$ files
    • PLUECKEL: long file of the $PLUECKE files
    • VPL: long files of the $VP files
  • Data sets PL and PL2 are being provided again in one combined file (PL).
  • The variable scheme with c-variables (cross-sectional) and l-variables (longitudinal) has been modified as follows:
    • If the variables on which a variable in the long format is based changed in the cross-section, then corresponding _v*-variables will be created for each version. A harmonized _h-variable is provided as well. Further information can be found in the SOEPcompanion (general description, examples)
  • All of the long datasets generated from the various cross-sectional datasets contain the new variable: INPUTDATASET.
  • Due to adjustments to the new joint data release format, some files with “long”-specific names are no longer included in the data release: CDESIGN, CSAMP, CSAMPFID, KIDL, PBREXIT.
  • The following datasets have been renamed to avoid conflicts with the data names in the raw directory:
    • PPATH replaces PPFAD
    • PPATHL replaces PPFADL
    • HPATH replaces HPFAD
    • HPATHL replaces HPFADL

1.2. Classic format in the subdirectory raw

Since we know that many users have existing scripts that are based on the original data format, and to enable users to understand the process of generating the “long” data, we provide all of the datasets in their original SOEP format in the directory raw.

Users who want to continue using the old format simply need to switch to the subdirectory raw and use the datasets there.

The only change is that there are now additional identifiers in all of the datasets in the raw directory with the name in the long format (PID and PERSNR or HID and $HHRNAKT) and a survey year variable (SYEAR) so that users can easily merge variables from the two data formats.
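A minimal sketch of such a merge in Stata, using invented data: the variables x_long and x_raw and all values are hypothetical, and only the identifiers pid and syear come from the description above. With real data, the two toy datasets would be replaced by a file from the top-level directory and a file from the raw directory.

```stata
* Toy "long" data keyed by pid and syear (values are invented)
clear
input long pid int syear byte x_long
101 2016 1
101 2017 2
102 2017 3
end
tempfile longdata
save `longdata'

* Toy "raw" wave file for 2017 that carries the added identifiers
clear
input long pid int syear byte x_raw
101 2017 7
102 2017 8
end

* Attach the long-format variable to the raw records;
* keep only the records that exist in the raw file
merge 1:1 pid syear using `longdata', keep(master match) nogenerate
list
```

Run this as a do-file in one go, since the temporary file only exists while the do-file is running.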

1.3. New EU-SILC clone in the subdirectory eu-silc-clone

Many users are undoubtedly aware that the SOEP supports cross-national analysis with CNEF through the dataset PEQUIV. We have now produced a data product that allows you to use the SOEP data in comparative analyses with the EU-SILC (European Union Statistics on Income and Living Conditions) data. EU-SILC, which is provided by Eurostat upon request, offers cross-sectional and longitudinal information for many European countries. Up to now, only cross-sectional information has been available for Germany. The EU-SILC clone offers longitudinal information on private households in Germany based on the SOEP data. All of the information contained in it can be directly compared with the EU-SILC longitudinal information on other European countries.

The EU-SILC clone is integrated into the standard SOEP data release (in subdirectory eu-silc-clone).

Documentation on the 2005-2016 EU-SILC clone can be found here (PDF, 3.01 MB).

2. New samples in the main SOEP study

The new SOEP data release (v34) will be the first to contain data from the IAB-BAMF-SOEP Survey of Refugees in Germany as Sample M5, as well as the continuation of the PIAAC-L Survey, as Sample N.

2.1. IAB-BAMF-SOEP Survey of Refugees (M5)

The SOEP, in cooperation with the Institute for Employment Research (IAB) and the Federal Office for Migration and Refugees (BAMF), has succeeded in integrating a third sample of refugee households (M5) into the SOEP study. The survey was launched in 2017. The population of M5 covers adult refugees who have applied for asylum in Germany since January 1, 2013, and are currently living in Germany. M5 added another 1,519 households of refugees who have migrated to Germany since 2013 to the SOEP framework.

2.2. Integration of respondents from PIAAC-L as Subsample N

Sample N integrated 2,314 households of former participants of the Program for the International Assessment of Adult Competencies (PIAAC and PIAAC-L) in 2017. This is the most recent addition to the SOEP-Core samples. Fieldwork in sample N was conducted between mid-March and mid-August and thus slightly later than the majority of samples A–L1. More information on the PIAAC-L project can be found on the project homepage.

3. Translation errors in some questionnaire languages

In the IAB-BAMF-SOEP Survey of Refugees (M3-M5), there were translation errors in some of the questions on income components in translated versions of the household questionnaire. Answers for these variables are therefore not comparable with other answers. The corresponding variables were set to -3.
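Values coded -3 should not enter means or sums as if they were amounts. One possible way to handle this in Stata is sketched below on invented data; the variable hh_income is hypothetical and stands in for any affected income component.

```stata
* Toy data (invented): household 2 carries the code -3
clear
input long hid double hh_income
1 1200
2 -3
3 800
end

* Turn the code -3 into Stata's system missing value before analysis
mvdecode hh_income, mv(-3)
summarize hh_income
```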

4. Deletion of interviews not conducted in line with the standards of the IAB-BAMF-SOEP group in the IAB-BAMF-SOEP Survey of Refugees (M3/M4)

In the process of data preparation, three interviewers were identified who had not conducted interviews in line with the standards of the IAB-BAMF-SOEP group (more information here). The interviewers in question were responsible for 88 households in 2016 and 112 households in 2017. The households affected in the first wave of the survey (2016) were completely removed from the dataset. The households affected in 2017, who were supposed to be interviewed for the second time, were deleted for 2017 but left in the dataset for 2016. There are no indications that the first interviews (by a different interviewer) were not conducted in line with IAB-BAMF-SOEP standards. The interviews and cases deleted from the data release may be accessed upon request from a guest work station at the SOEP-RDC for survey methodological analysis. After these lines were deleted from all datasets, the following adjustments were made:

  • The deletion of the household and individual interviews required an update of the weights (dataset HHRF and PHRF), which now take account of the slightly reduced case numbers in survey years 2016 and 2017.
  • Update / inclusion of the new weights in the datasets BGPEQUIV and BHPEQUIV.

5. Extended variable naming convention

The extended variable naming convention applies only to datasets from wave BH onwards, and only to the datasets $P, $H, and $KIND. We added underscores between unit of analysis, question identifier, and item identifier to separate them visually. In addition, a questionnaire identifier was introduced, which is also separated from the item by an underscore. This new naming scheme is used only if the survey instrument differs from the “original” SOEP-Core instrument.

Due to our different samples in the SOEP, there are some respondents that receive sample-specific questions, such as the refugee sample that started in 2016. For that specific group, we created an extended individual questionnaire with some specific questions along with the standard SOEP questions that are asked every year. For the specific questions, you can use the instrument variable to see the source of the variables.

Examples and more detailed descriptions can be found in the chapter on this subject in the SOEP Companion.

6. Changes in specific variables

  • New variables for interview year: HIYEAR in HGEN, HPATHL, and PIYEAR in PGEN, PPATHL. These new variables indicate, for all survey years, the household and individual interviews that were finalized after (or before) the survey year (variable syear), which is the reference year for the questionnaires and for data collection.
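The comparison between interview year and survey year can be sketched in Stata as follows, on invented data. The flag off_year is a hypothetical helper, and the sketch assumes that piyear holds a plain calendar year for every interviewed person.

```stata
* Toy data (invented): person 2 was interviewed after the survey year
clear
input long pid int syear int piyear
1 2017 2017
2 2017 2018
3 2016 2016
end

* Flag interviews finalized in a calendar year other than the survey year
generate byte off_year = piyear != syear if !missing(piyear)
tabulate syear off_year
```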


6.1. Dataset PPATH / PPATHL (in raw: PPFAD)

6.1.1. SEXOR

  • The previous data release was the first to include the variables SEXOR (sexual orientation) and SEXORINFO (source of information on sexual orientation). The value -1 “insufficient information” has been changed to 2 “insufficient information”.

6.1.2 PARINFO

  • The value -1 “unclear” has been changed to 5 “unclear”.

6.1.3 Migration information

  • The coding of GERMBORN, CORIGIN, IMMIYEAR, and MIGBACK was changed for inconsistent cases (for more information, see the PPATH/PPFAD documentation).

6.1.4. Asylum-Seekers and Refugees

  • The variables for asylum-seekers and refugees [AREBACK, AREFINFO] have been renamed (in v33: REFBACK, REFINFO) and revised. The variable AREFINFO now also allows identification of specific subgroups (more information is available in the documentation).

6.2. Dataset PGEN

6.2.1 Partner pointer

  • For the variable PGPARTZ (PARTZ$), the value -1 (“no answer”) has been replaced by the correct value 5 (“unclear”).
  • Starting with wave BH, the new quality control processes implemented in generating the partner indicator have improved the quality of data from previous waves:
    • Contradictory answers between partners regarding their relationship have been identified and corrected.
    • Partnerships with differing partner indicators (1 “spouse” or 2 “life partner”) within a relationship have been identified and corrected.
    • Errors in the assignment of PARTZ values (1 “spouse” and 2 “life partner”) due to different filter routing in the different survey instruments have been corrected. Marriages were asked differently in the individual biography questionnaire for Sample J+K and in the individual questionnaire for Samples A-I. Separating out samples J and K played a key role in this correction, since these two led to errors due to their different filter routing.
    • Partnerships with recently deceased individuals were identified and deleted.
    • Respondents’ data on divorce, separation, or the death of a life partner within the past year have been taken into account for the first time in the process of generating the data.
    • For the first time, family status, civil status, partner’s first name (permanent partner number) and place of residence of the partner in the case of refugees’ partnerships (Samples M3-M5) have been taken into account in addition to the interviewer-given relationships between the different household members.

6.2.2. Volunteer work and side jobs

  • The PGEN (raw: $$PGEN) files contain nine new variables. In 2017, the SOEP fundamentally revised how respondents were surveyed about side jobs. Now, for the first time, respondents can provide answers on three different side jobs. They can also now indicate the type of side job: whether it is volunteer work (variables HONOR1, HONOR2, HONOR3) and whether they are working for an employer or freelance (SNDTYP1, SNDTYP2, SNDTYP3). The amount of gross additional income from side jobs is provided as imputed information (SNDJOB1, SNDJOB2, SNDJOB3).
    • SNDTYP117 : First side job occupational status
    • SNDTYP217 : Second side job occupational status
    • SNDTYP317 : Third side job occupational status
    • SNDJOB117 : Current gross additional income from side job 1 (gen.) in euros
    • SNDJOB217 : Current gross additional income from side job 2 (gen.) in euros
    • SNDJOB317 : Current gross additional income from side job 3 (gen.) in euros
    • HONOR117 : Volunteer work 1
    • HONOR217 : Volunteer work 2
    • HONOR317 : Volunteer work 3
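With up to three side jobs per respondent, a total side income has to be summed across the three income variables. The Stata sketch below shows one way to do this on invented data. It assumes that negative values are missing codes; side_income is a hypothetical helper variable, and the lower-case variable names follow the list above.

```stata
* Toy data (invented): -2 stands in for a negative missing code
clear
input long pid double sndjob117 double sndjob217 double sndjob317
1 400 150 -2
2 -2 -2 -2
3 250 -2 -2
end

* Assumption: negative values are missing codes, not amounts
foreach v of varlist sndjob117 sndjob217 sndjob317 {
    replace `v' = . if `v' < 0
}

* Sum across the three side jobs; the total stays missing
* when none of the three variables holds a valid amount
egen double side_income = rowtotal(sndjob117 sndjob217 sndjob317), missing
list
```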


6.2.3. Educational degrees

  • In v34, CASMIN and ISCED are based on additional information on educational degrees obtained abroad. Hence, some individuals with degrees from abroad display higher ranks in v33 than in v34.
  • The error in the CASMIN variable in v33 is fixed: In v33, individuals with 2c_voc (vocational maturity certificate) were mistakenly categorized as 2c_gen (general maturity certificate).

6.2.4. AUTONO

  • The generation of autono was discontinued in 2017 due to the difficulty in comparing this variable with the usual models of autonomy. Work is currently underway to introduce comparable definitions of autonomy.


6.3 Dataset PEQUIV

  • The PEQUIV (raw: $$PEQUIV) files contain six new variables. These are:
    • IAUS117 : Pensions from another country
    • AUS217 : Widows / orphans pension from another country
    • ASYL17 : Asylum-seeker benefit
    • FASYL17 : Imputation flag: Asylum-seeker benefit
    • EDUPAC17 : Benefits from the educational package
    • FEDUPAC17 : Imputation flag: Benefits from the educational package
    For more details, see the SOEP Survey Paper: Codebook for the $PEQUIV File 1984-2017.

6.4. Dataset BIOAGEL and BIOPUPIL

  • Variables from questionnaires given to 12-year-olds and 14-year-olds are now provided in the BIOPUPIL dataset to reflect the differences in survey mode (parents being asked questions about their children vs. children being surveyed directly).
  • Variables from additional questions in refugee samples are integrated in BIOAGEL and BIOPUPIL datasets.


6.5. Dataset HGEN

A number of changes have taken place in recent years in questions on home rental. The first change took place in the household questionnaire of wave BF (2014). The question asked about the costs of utilities in such detail that respondents were not able to provide correct answers. This led to underestimation of both base rent and utilities.

It emerged that this led to a slight break in the time series. Rent has increased continuously over the years since 1984. In 2014 and 2015, however, rental costs fell and have been increasing again sharply since 2016. This break can be explained by the change in the questionnaire.

Starting with wave BH, respondents are being asked about rent in the same way as in wave BG (2016) and in wave BD (2013) in order to maintain long-term comparability. In addition, with wave BH, the new migration sample M5 and the new refresher sample N are part of the SOEP. Since Sample M5 was not surveyed on utility costs in a comparable way and since many of these respondents probably live in group housing or receive subsidies to cover living costs, no rent variable was generated for them.

Year    v33 - rent    v34 - rent
2010    486.25        486.21
2011    484.93        485.64
2012    491.01        490.75
2013    505.00        505.59
2014    470.95        473.74
2015    507.06        508.57
2016    545.53        541.90
2017    -             550.67

6.6. Dataset BIOIMMIG

  • The population of BIOIMMIG shrank due to a change in the coding of BIIMGRP (for more information, see the BIOIMMIG documentation).

6.7. Dataset HHRF/PHRF

  • New variables in HHRF: BHHHRF, BHHBLEIB, BHHHRFAM4, BHHHRFM5, BHHHRFN
  • New variables in PHRF (and ENUMHRF, available on request): BHPHRF, BHPBLEIB, BHPHRFAM4, BHPHRFM5, BHPHRFN
  • Please note that with our new integrated data format, you’ll find all weighting variables now directly in PPATHL or HPATHL.
  • On request, we provide stand-alone weighting variables (BHPHRFM35, BHHHRFM35) for the refugee samples M3, M4, and M5.
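As a rough illustration of how a cross-sectional weight such as BHPHRF enters an estimate, the following Stata sketch computes a weighted mean on invented data. The variable income and all values are hypothetical. Only the weight name comes from the list above, and the name under which it appears in PPATHL may differ.

```stata
* Toy data (invented): person 1 represents twice as many people
clear
input long pid double income double bhphrf
1 1000 2
2 2000 1
3 3000 1
end

* Weighted mean: (1000*2 + 2000*1 + 3000*1) / (2 + 1 + 1) = 1750
mean income [pweight = bhphrf]
```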

6.7.1. Revisions and Bugfixes

  • Due to confusion in the country codes for Iran and Russia in the sampling frame (Central Register of Foreign Nationals, AZR), design weights for Samples M3 and M4 as well as their cross-sectional weights for wave BG had to be updated.
    In Wave BG, we interpreted the population of samples M3 and M4 as refugees who immigrated to Germany between January 2013 and January 2016. In fact, only those refugees whose registration at the Central Register of Foreign Nationals (AZR) took place until April 2016 were included in those samples. In Sample M5, among others, those refugees were interviewed who, although they had immigrated in the same period, were registered later. For this reason, the total for the post-stratification of the second wave of M3 and M4 has been reduced by the number of refugees with a later registration date.

1984-2017 (Wave BH)

Overview (May 2019):

  1. Dataset: pl / variables: plb0186_v2, plb0186_h

    Values for the variables plb0186_v2 and plb0186_h for the East sample in 1990 are too small by a factor of 10.

  2. Dataset: bhh / variables: bhh_37_01, bhh_37_02

    The names assigned to the raw variables bhh_37_01 “electricity included in rent” and bhh_37_02 “assessed burden of housing expenses (rent and additional expenses)” do not correspond to the standard SOEP concept for naming variables. Both variables will be renamed in the new version.

  3. Dataset: migspell

    The previous version of the migspell dataset was delivered instead of the current one.

  4. Datasets: biobirth, bioimmig, biojob, bioparen, bioresid, biosib, biosoc, biotwin, pflege / variables: pid, hid, cid

    The new identifiers were not filled in and have to be filled in from the old identifiers.

Details:

1. Dataset: pl
Variables: plb0186_v2, plb0186_h

Values for the variables plb0186_v2 “Actual working time with overtime (1990-2017)” and plb0186_h “Actual working time with overtime (harmonized)” have the wrong values for the East sample in 1990.

The variable plb0186_h is made up of the variables plb0186_v1 (1984-1989) and plb0186_v2 (1990-2017). We included all of the values for plb0186_v1 as they were, and divided all of the valid values for plb0186_v2 by 10. The process of harmonization is necessary due to the fact that the two raw variables for 1990 were provided in different formats:

gpost: gp3601e (two-digit, no decimal separator)
gp: gp39 (three-digit, no decimal separator)

The raw variable gp3601e from gpost was assigned to the variable plb0186_v2 although it does not have to be divided by 10. As a result, all values for the East German population for the year 1990 were mistakenly divided by 10. The simplest way of solving this problem is to multiply the valid values for the East German population by 10. 

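* Replace "Datenpfad" with the path to your SOEP data directory
cd "Datenpfad"
use "pl.dta", clear
* Inspect all versions of the variable by survey year
tabstat plb0186_*, by(syear)
* Work on a copy so that the delivered variable stays untouched
clonevar rep_plb0186_h = plb0186_h
* Undo the erroneous division for valid values of the 1990 East sample
replace rep_plb0186_h = rep_plb0186_h*10 if inputdataset == "gpost" & rep_plb0186_h > 0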

Detailed information on the general process used to harmonize variables can be found here:
Versioning and harmonization of variables
Working with harmonized Variables

2. Dataset: bhh
Variables: bhh_37_01, bhh_37_02

The names assigned to the raw variables bhh_37_01 “electricity included in rent” and bhh_37_02 “assessed burden of housing expenses (rent and additional expenses)” do not correspond to the standard SOEP concept for naming variables. Both variables had to be renamed:

bhh_37_01 “Electricity included in rent” → bhh_33
bhh_37_02 “Assessed burden of housing expenses (rent and additional costs)” → bhh_37

To find out more about how raw variables are named in the SOEP, see the SOEPcompanion:
Naming conventions of Variables and Datasets

3. Dataset: migspell

Unfortunately, the previous version of the migspell dataset was delivered instead of the current one. For the current version, please contact the SOEPhotline or write an email to soepmail.

4. Dataset: biobirth, bioimmig, biojob, bioparen, bioresid, biosib, biosoc, biotwin, pflege
Variables: pid, cid, hid

In the process of “merging” SOEP-Long and SOEP-Core, all of the SOEP-Long ID variables (pid, hid, cid) were also included in the raw datasets to make merging easier for users. In some datasets, only the ID variables were created but not filled in with the corresponding IDs.

Empty pid: biobirth, bioimmig, biojob, bioparen, bioresid, biosib, biosoc, biotwin, pflege
Empty hid: bioimmig, bioresid, biosoc
Empty cid: biobirth, bioimmig, biojob, bioparen, bioresid, biosib, biosoc, biotwin, pflege

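With these datasets, please continue to use persnr, hhnrakt, and hhnr, or copy their content into the corresponding new ID variables. Because the empty ID variables already exist in the affected datasets, they have to be dropped before cloning. Apply each pair of commands only to the datasets listed above for that identifier.

capture drop pid
clonevar pid = persnr
capture drop hid
clonevar hid = hhnrakt
capture drop cid
clonevar cid = hhnr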

Further information on SOEP identifiers can be found here:
Dataset Identifier


Individual (PAPI) 2017: Field-de Field-en Var-de Var-en
Household (PAPI) 2017: Field-de Field-en Var-de Var-en
Biography (PAPI) 2017: Field-de Var-de Var-en
Catch-up Individual (PAPI) 2017: Field-de Var-de Var-en
Youth (16-17-year-olds, PAPI) 2017: Field-de Var-de Var-en
Early Youth (13-14-year-olds, PAPI) 2017: Field-de Var-de Var-en
Pre-teen (11-12-year-olds, PAPI) 2017: Field-de Var-de Var-en
Mother and Child (Newborns, PAPI) 2017: Field-de Var-de Var-en
Mother and Child (2-3-year-olds, PAPI) 2017: Field-de Var-de Var-en
Mother and Child (5-6-year-olds, PAPI) 2017: Field-de Var-de Var-en
Parents and Child (7-8-year-olds, PAPI) 2017: Field-de Var-de Var-en
Mother and Child (9-10-year-olds, PAPI) 2017: Field-de Var-de Var-en
Deceased Individual (PAPI) 2017: Field-de Var-de Var-en

Please find all sample specific questionnaires of this year and all questionnaires of previous years on this site

1) Supplementary of the IAB-BAMF-SOEP Survey of Refugees in Germany (M5) 2017

2) SOEP-Core v34 – Documentation of Sample Sizes and Panel Attrition in the German Socio-Economic Panel (SOEP) (1984 until 2017)

3) SOEP-Core v34 – Biographical Information in the Meta File PPATH (Month of Birth, Immigration Variables, Living in East or West Germany in 1989)

4) SOEP-Core v34 – PPATHL: Person-Related Meta-Dataset

5) SOEP-Core v34 – HPATHL: Household-Related Meta-Dataset

6) SOEP-Core v34 – PBRUTTO: Person-Related Gross File

7) SOEP-Core v34 – HBRUTTO: Household-Related Gross File

8) SOEP-Core v34 – PGEN: Person-Related Status and Generated Variables

9) SOEP-Core v34 – HGEN: Household-Related Status and Generated Variables

10) SOEP-Core v34 – Codebook for the $PEQUIV File 1984-2017 : CNEF Variables with Extended Income Information for the SOEP

11) SOEP-Core v34 – BIOIMMIG: Generated variables for foreign nationals, immigrants, and their descendants in the SOEP

12) SOEP-Core v34 – HEALTH

13) SOEP-Core v34 – BIOPAREN: Biography Information for the Parents of SOEP-Respondents

14) SOEP-Core v34 – BIOAGEL & BIOPUPIL: Generated Variables from the “Mother & Child”, “Parent”, “Pre-Teen”, and “Early Youth” Questionnaires

15) SOEP-Core v34 – BIOSIB: Information on siblings in the SOEP

16) SOEP-Core v34 – The couple history files BIOCOUPLM and BIOCOUPLY, and marital history files BIOMARSM and BIOMARSY

17) SOEP-Core v34 – BIOAGE17: The Youth Questionnaire

18) SOEP-Core v34 – BIOSOC: Retrospective Data on Youth and Socialization

19) SOEP-Core v34 – BIOJOB: Detailed Information on First and Last Job

20) SOEP-Core v34 – BIOEDU: Data on educational participation and transitions

21) SOEP-Core v34 – BIORESID: Variables on Occupancy and Second Residence

22) SOEP-Core v34 – BIOBIRTH: A Data Set on the Birth Biography of Male and Female Respondents

23) SOEP-Core v34 – BIOTWIN: TWINS in the SOEP

24) SOEP-Core v34 – LIFESPELL: Information on the Pre- and Post-Survey History of SOEP-Respondents

25) SOEP-Core v34 – MIGSPELL and REFUGSPELL: The Migration-Biographies

26) SOEP-Core v34 – Activity Biography in the Files PBIOSPE and ARTKALEN

27) SOEP-Core v34: Codebook for the EU-SILC-Like Panel for Germany Based on the SOEP

1) Handgreifkraftmessung im Sozio-oekonomischen Panel (SOEP) 2006 und 2008

2) The new IAB-SOEP Migration Sample: an introduction into the methodology and the contents

3) The Request for Record Linkage in the IAB-SOEP Migration Sample

4) Flowcharts for the Integrated Individual-Biography Questionnaire of the IAB-SOEP Migration Sample 2013

5) The Measurement of Labor Market Entries with SOEP Data: Introduction to the Variable EINSTIEG_ARTK

6) Job submission instructions for the SOEPremote System at DIW Berlin – Update 2014

7) SOEP 2015 – Informationen zu den SOEP-Geocodes in SOEP v32

8) Editing and Multiple Imputation of Item Non-response in the Wealth Module of the German Socio-Economic Panel

9) Die Vercodung der offenen Angaben zu den Ausbildungsberufen im Sozio-Oekonomischen Panel

10) Das Studiendesign der IAB-BAMF-SOEP Befragung von Geflüchteten

11) Scales Manual IAB-BAMF-SOEP Survey of Refugees in Germany – revised version

12) SOEP 2010 – Preparation of data from the new SOEP consumption module: Editing, imputation, and smoothing

13) SOEP Scales Manual (updated for SOEP-Core v32.1)

14) Kognitionspotenziale Jugendlicher - Ergänzung zum Jugendfragebogen der Längsschnittstudie Sozio-oekonomisches Panel (SOEP)

15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

16) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der Klassifikation der Berufe 2010 (KldB 2010): Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben

17) Multi-Itemskalen im SOEP Jugendfragebogen

18) Zur Erhebung des adaptiven Verhaltens von zwei- und dreijährigen Kindern im Sozio-oekonomischen Panel (SOEP)

19) Dokumentation zum Entwicklungsprozess des Moduls „Einstellungen zu sozialer Ungleichheit“ im SOEP (v38)

20) SOEP-CoV: Project and Data Documentation

21) Missing Income Data in the German SOEP: Incidence, Imputation and its Impact on the Income Distribution

22) SOEP-Core v34 – PFLEGE: Documentation of Generated Person-level Long-term Care Variables

23) SOEP 2006 – TIMEPREF: Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey

24) Assessing the distributional impact of "imputed rent" and "non-cash employee income" in microdata : Case studies based on EU-SILC (2004) and SOEP (2002)

25) SOEP-Core v36: Codebook for the EU-SILC-like panel for Germany based on the SOEP

All documentation for filtering can be found on this page
