The German Socio-Economic Panel Study (SOEP) is a wide-ranging, representative longitudinal study of private households, located at the German Institute for Economic Research (DIW Berlin). Every year, nearly 11,000 households and more than 20,000 persons are surveyed by the fieldwork organization TNS Infratest Sozialforschung. The data provide information on all household members, consisting of Germans living in the old and new German states, foreigners, and recent immigrants to Germany. The panel was started in 1984. Topics include household composition, occupational biographies, employment, earnings, health, and satisfaction indicators. As early as June 1990, even before the Economic, Social and Monetary Union, SOEP expanded to include the states of the former German Democratic Republic (GDR), thus seizing the rare opportunity to observe the transformation of an entire society. An immigrant sample was added in 1994/95 to account for the changes that had taken place in German society. Further new samples were added in 1998, 2000, 2002, 2006, 2009 and 2011. The survey is constantly being adapted and developed in response to current social developments. The international version contains 95% of all cases surveyed.
Title: German Socio-Economic Panel Study (SOEP), data of the years 1984-2011
Collection period: 1984-2011
Publication date: Nov. 12, 2012
Principal investigators: Jürgen Schupp, Martin Kroh, Jan Goebel, Silke Anger, Marco Giesselmann, Markus Grabka, Peter Krause, Elisabeth Liebau, Henning Lohmann, David Richter, Christian Schmitt, Daniel Schnitzlein, Juliana Werneburg, Frauke Peter, Ingrid Tucci
Data collector: TNS Infratest Sozialforschung GmbH
Population: Persons living in private households in Germany
Selection method: All SOEP samples are regionally clustered, multi-stage random samples. The respondents (households) are selected by random walk.
Collection mode: The interview methodology of the SOEP is based on a set of pre-tested questionnaires for households and individuals. In principle, an interviewer tries to obtain face-to-face interviews with all members of a given survey household aged 16 years and over. In addition, one person (the head of household) is asked to answer a household questionnaire covering housing, housing costs, and different sources of income. It also includes some questions on children in the household up to 16 years of age, mainly concerning attendance at institutions (kindergarten, elementary school, etc.).
Data set information:
|Number of units||74,137|
|Number of variables||47,933 in 360 data sets|
|Data format||STATA, SPSS, SAS, CSV|
|MD5 fingerprints of the data sets||
Stata German | TXT, 16.16 KB
Publications using this file should refer to the above DOI and cite one of the following references.
With the integration of sample J in 2011, the biographical questionnaire was moved from the second to the first wave and combined with the individual questionnaire in an integrated survey. This means that there are some slight differences in the survey instrument between the old samples A-H and the supplementary sample J.
The following additional missing codes have been introduced to the survey data to document these possible differences:
|-4||"Inadmissible multiple response"|
|-5||"Not included in this version of the questionnaire"|
|-6||"Version of questionnaire with modified filtering"|
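In an analysis, these codes usually need to be treated as missing rather than as valid answers. The following is a minimal sketch in Python, assuming the values of one variable are held in a plain list of integers. The helper names `recode_missing` and `count_missing` are hypothetical and not part of the SOEP distribution, and other negative codes (such as -1, -2, -3) are deliberately left untouched here.

```python
# Additional missing codes and their labels, as documented in the table above.
ADDITIONAL_MISSING_CODES = {
    -4: "Inadmissible multiple response",
    -5: "Not included in this version of the questionnaire",
    -6: "Version of questionnaire with modified filtering",
}


def recode_missing(values, codes=ADDITIONAL_MISSING_CODES):
    """Replace the documented missing codes with None; keep all other values."""
    return [None if v in codes else v for v in values]


def count_missing(values, codes=ADDITIONAL_MISSING_CODES):
    """Count how often each documented missing code occurs in `values`."""
    counts = {code: 0 for code in codes}
    for v in values:
        if v in counts:
            counts[v] += 1
    return counts


if __name__ == "__main__":
    sample = [3, -5, 7, -4, -6, -1]  # made-up illustration values
    print(recode_missing(sample))
    print(count_missing(sample))
```

Counting the codes before recoding them keeps a record of how many cases are affected by the questionnaire differences between the samples.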
The SOEP Innovation Sample (SOEP-IS) has now been launched and includes, inter alia, sample I. Sample I is therefore no longer part of the main survey as of 2011. See SOEP-IS on our website for further information about the Innovation Sample and the possibility of including your own questions.
BIOCOUPLM provides spell data on partnership histories from the first to last personal interview of a respondent. Spells are measured on a monthly basis.
BIOCOUPLY provides spell data on partnership histories. It contains annual information on partnership status since the respondent’s year of birth, including available retrospective data and annually updated information.
3.3 BIOSIB (beta version)
The new file BIOSIB provides information on siblings living in the SOEP households. The dataset contains the person numbers of all siblings in an observed family. It includes information on their gender, their year of birth, and on the relationship between the observed siblings.
BIOSIB is included as a beta version in the current data release. Please do not hesitate to send both positive and negative feedback or suggestions to Daniel Schnitzlein (email@example.com).
The BIOEDU dataset contains details on educational transitions beginning with entry into childcare up to tertiary education in a consistently structured form.
3.5 BIOAGE long
In the new integrated bioage long dataset (BIOAGEL), data are presented in “long” format, i.e. this dataset contains the information from BIOAGE01, BIOAGE03, BIOAGE06, BIOAGE08a and BIOAGE08b.
Dataset on the Economic Behavior Experiment on Trust and Trustworthiness in the 2003, 2004, & 2005 SOEP Survey
This experiment to measure trust is based on the investment game introduced by Berg et al. (1995), a one-shot game for two players or movers who anonymously interact with each other. The first mover receives an endowment of 10 points and can transfer zero to ten points to the second mover. Every point that is transferred is doubled by the experimenters. The second mover is also given an endowment of ten points. After receiving points from the first mover, he/she decides on how much of the endowment to transfer back to the first mover (zero to ten points). As with the first mover's transfer, the back-transfer by the second mover is doubled by the experimenters. After the second mover's decision, the game ends and the subjects are paid their income in euros (one point equals one euro) by check sent a few days later.
A fundamental component of the game is that the participants actually receive money in accordance with the fixed payout function, i.e., all the decisions always have monetary consequences. This version of the game was developed by Fehr, Fischbacher, Schupp, von Rosenbladt & Wagner (2002).
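The payout rule described above can be written down as a small function. This is a minimal sketch, assuming that each mover keeps the points he or she does not transfer and that transfers are whole points. The function name `trust_game_payoffs` is a hypothetical illustration and is not taken from the TRUST dataset or its documentation.

```python
ENDOWMENT = 10   # points given to each mover at the start
MULTIPLIER = 2   # every transferred point is doubled by the experimenters


def trust_game_payoffs(sent, returned):
    """Return (first_mover_points, second_mover_points) for one game.

    sent:     points the first mover transfers to the second mover (0..10)
    returned: points the second mover transfers back (0..10)
    """
    for name, value in (("sent", sent), ("returned", returned)):
        if not 0 <= value <= ENDOWMENT:
            raise ValueError(f"{name} must be between 0 and {ENDOWMENT}")
    # Assumption: untransferred points stay with the mover who held them.
    first = ENDOWMENT - sent + MULTIPLIER * returned
    second = ENDOWMENT - returned + MULTIPLIER * sent
    return first, second


if __name__ == "__main__":
    # One point equals one euro, so these are also the payouts in euros.
    print(trust_game_payoffs(10, 10))  # full trust, full reciprocity
    print(trust_game_payoffs(10, 0))   # full trust, no back-transfer
```

Under these assumptions, both movers end with 20 points if each transfers everything, compared with 10 points each if nothing is transferred.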
The combination of representative survey and behavioral experiment was used in the SOEP main surveys in 2003, 2004, and 2005, with only minor modifications. Of the 1,432 original participants in 2003, 1,202 also took part in the experiment in 2004 and 2005.
The data are available in long format in the "TRUST" dataset. Consequently, this dataset contains information from each of the three waves in which the behavioral experiment was conducted.
Dataset on the Economic Behavior Experiment on Time Preferences in the 2006 SOEP Survey
In this experiment on economic behavior, respondents were asked to decide how they would like to receive €200 in prize money: whether they would rather receive it immediately by check or would prefer to wait and receive a larger amount later, that is, with interest. By splitting the sample (N = 1,503 persons) into random subsamples (splits), it was possible to vary both the time horizon and the implied interest rate to test possible incentive effects on the choice between a low payoff in the short term and a high payoff in the long term. The scientific director of the project was Prof. Dr. Armin Falk, CENs, University of Bonn.
4.1 $HBRUTTO dataset
The $HBRUTTO dataset will include a new variable to distinguish between urban, suburban and rural regions. This is based on the spatial categories of counties (as of December 31, 2009) used by the Federal Institute for Research on Building, Urban Affairs and Spatial Development (BBSR). The following spatial structure characteristics are used to define the categories:
Thus, three categories can be defined:
4.2 $PGEN dataset
BILZTCH$$ / BILZTEV$$:
BILZTCH$$ indicates whether the respondent’s answers suggest either a downward shift in years of education or training ($BILZEIT) since the last observation, or an upward change since the last year that is inconsistent with additional information on recently completed education or training.
BILZTEV$$ is a flag variable which indicates whether the respondent showed some inconsistent change in $BILZEIT either upwards or downwards over the entire observation period.
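The logic of the two flags can be illustrated with a short sketch. It assumes one respondent's years of education per observed wave in a list, plus a parallel list of booleans stating whether the respondent reported recently completed education or training in that wave. The function names and the boolean coding are hypothetical; the actual value coding of BILZTCH$$ and BILZTEV$$ is not reproduced here.

```python
def change_flags(years, completed):
    """Per-wave flag for an inconsistent change in years of education.

    years:     years of education or training per observed wave
    completed: per wave, True if the respondent reported recently
               completed education or training (assumed input)
    The first wave has no earlier observation and is never flagged.
    """
    if len(years) != len(completed):
        raise ValueError("years and completed must have the same length")
    flags = [False] if years else []
    for prev, cur, done in zip(years, years[1:], completed[1:]):
        downward = cur < prev                        # years should not decrease
        unsupported_upward = cur > prev and not done  # increase without completion
        flags.append(downward or unsupported_upward)
    return flags


def ever_inconsistent(flags):
    """True if any wave in the observation period was flagged."""
    return any(flags)


if __name__ == "__main__":
    years = [10, 10, 12, 11, 13]                 # made-up illustration values
    completed = [False, False, True, False, False]
    flags = change_flags(years, completed)
    print(flags, ever_inconsistent(flags))
```

The per-wave list corresponds in spirit to the wave-specific flag, and `ever_inconsistent` to the flag covering the entire observation period.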
$VEBZEIT and $UEBSTD
To be consistent with the FID dataset, the missing values of the variables $VEBZEIT and $UEBSTD were slightly recoded: the missing value –2 is now assigned to self-employed individuals. In previous waves, self-employed persons had the missing value –3 (implausible answer).
For $UEBSTD, the value –3 (implausible answer) is assigned to all individuals who report more than ten hours of weekly overtime and who also have either an agreed working time of more than 80 hours a week ($VEBZEIT is implausible, value –3) or an actual working time of more than 80 hours a week ($TATZEIT is implausible, value –3).
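One reading of this rule is sketched below in Python. It assumes the rule is evaluated on the reported raw weekly hours and that the condition means "overtime above ten hours, combined with agreed or actual hours above 80". The function name `recode_overtime` is hypothetical, and the separate –2 code for self-employed individuals is not handled.

```python
IMPLAUSIBLE = -3  # missing code for an implausible answer


def recode_overtime(overtime_hours, agreed_hours, actual_hours):
    """Return -3 if the overtime value is implausible, else the value itself.

    Assumed reading of the rule: overtime is implausible when it exceeds
    ten hours a week AND either the agreed or the actual weekly working
    time exceeds 80 hours.
    """
    hours_too_long = agreed_hours > 80 or actual_hours > 80
    if overtime_hours > 10 and hours_too_long:
        return IMPLAUSIBLE
    return overtime_hours


if __name__ == "__main__":
    print(recode_overtime(12, 85, 40))  # agreed hours above 80
    print(recode_overtime(12, 40, 45))  # long overtime, plausible hours
```

Both conditions have to hold together, so long overtime on its own does not trigger the recode under this reading.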
4.3 BIOPAREN dataset
Seven new variables have been added to BIOPAREN:
VAORT11 and MAORT11 indicate the father’s and the mother’s current place of residence, respectively.
GESCHW, GESCHWUP, NUMS, NUMB and TWIN provide information on siblings. The variable GESCHW indicates whether, at the time of the interview, the respondent had ever had any siblings. GESCHWUP gives the year in which the sibling information was collected. NUMB and NUMS provide the number of brothers and sisters, respectively, that the respondent reports, and TWIN indicates whether any of these are twin siblings of the respondent (and of which type).
1984-2011 (Wave BB)
BIOCOUPLM, BIOCOUPLY, BIOMARSM, BIOMARSY
An update for all corrected files can be downloaded by means of a personalized link. Please contact firstname.lastname@example.org to obtain your link.
Please note: If you use one of the provided bugfixes in your analyses we recommend citing it as follows:
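15) Die Vercodung der offenen Angaben zur beruflichen Tätigkeit nach der International Standard Classification of Occupations 2008 (ISCO08) - Direktvercodung - Vorgehensweise und Entscheidungsregeln bei nicht eindeutigen Angaben (English: Coding of open-ended responses on occupational activity according to the International Standard Classification of Occupations 2008 (ISCO08) - direct coding - procedure and decision rules for ambiguous responses)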
All documentation for filtering can be found on this page