
SOEP-Core v20 - Changes in the Dataset


Dataset Information

The data of the German SOEP (100% version) are distributed on three CD-ROMs covering the years 1984-2003. New data sets for the survey year 2003 are the usual wave-specific data TPBRUTTO, TP, TPKAL, TPGEN, THBRUTTO, TH, THGEN, TKIND and SPLUECKE. There are also updates of data sets with a longitudinal component (biographical data and weights). The information collected for the first time in 2003 in the biographical questionnaire for sample G ("high-income sample") has been completely integrated into the user-friendly biographical data sets.

As of this year, the data on CD-ROM #2 also contains all SOEP data with variable labels and value labels in English (including the data from the 1988 financial statement in file EV).

In addition, we have made the following additions and changes:

Sample G "High Income Sample" (Start 2002)  

The revised sampling design, using a higher income threshold, results in a smaller number of observations in wave 2.
Contact: Jürgen Schupp

HHRF and PHRF 2003 

The standard weighting variables for waves S and T (SPHRF, TPHRF or SHHRF, THHRF) are based on sub-samples A-F, that is, they do not take high-income sample G into account. In addition, we now offer a new integrated weighting variable for all sub-samples A-G (variables $PHRFAG or $HHRFAG; see also the documentation on the integrated weights for A-G vs. A-F).
Contact: Martin Kroh
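
The leading "$" in names such as $PHRFAG stands for the wave letter. As a minimal sketch (not an official SOEP tool), the following Python helper expands such a template for a given survey year. It assumes that wave letters run consecutively from A (1984) to T (2003), which matches the examples SPHRF and TPHRF above; the function names are hypothetical.

```python
FIRST_SURVEY_YEAR = 1984  # wave A (assumed from the 1984-2003 coverage)
LAST_SURVEY_YEAR = 2003   # wave T


def wave_letter(year: int) -> str:
    """Return the wave letter for a survey year, assuming A=1984 ... T=2003."""
    if not FIRST_SURVEY_YEAR <= year <= LAST_SURVEY_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_SURVEY_YEAR}-{LAST_SURVEY_YEAR}")
    return chr(ord("A") + year - FIRST_SURVEY_YEAR)


def expand_wave_prefix(template: str, year: int) -> str:
    """Replace a leading '$' in a variable template with the wave letter."""
    if not template.startswith("$"):
        raise ValueError(f"template {template!r} has no leading '$'")
    return wave_letter(year) + template[1:]


if __name__ == "__main__":
    # Integrated A-G person weight for the 2002 and 2003 waves.
    for year in (2002, 2003):
        print(year, expand_wave_prefix("$PHRFAG", year))
```

Under these assumptions, the template $PHRF expands to SPHRF for 2002 and TPHRF for 2003, in line with the standard weights named above.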

Rectypes 2003

1. BIOCHILD: Information from the 'Mother and Child Questionnaire'
In this new file, information on newborns in the SOEP will be collected each year from now on (see further documentation in Biography Data).
Contact: Jürgen Schupp

2. BIORESID: Information on second residence in the first interview
The data set BIORESID includes information on length of residency and on second residences. The information comes from the biographical questionnaire, which has consistently contained questions on these topics since 1994 (see further documentation in Biography Data).
Contact: Thorsten Schneider

3. BIOBRTHM: Birth biography information for men - from 2001 on
This new data set includes information on the birth biographies of men, who have been interviewed with the modified questionnaire since 2001. BIOBRTHM is structured analogously to BIOBIRTH and is based on a question formerly answered only by women (see further documentation in Biography Data).
Contact: Christian Schmitt

4. BIOTWIN: data for identifying births of twins, triplets, etc.
BIOTWIN includes all identifiable births of twins, triplets, etc. in the SOEP. Identifiers (PERSNR) for the mother and siblings are included (see further documentation in Biography Data).
Contact: Jürgen Schupp and Christian Schmitt

5. HBRUTT98:
This new file contains the complete gross population of sample E in the year 1998. It is useful in attrition analysis of the first wave of this sample.
Contact: Peter Krause

BIOPAREN 2003

Variables on the nationalities of parents have been corrected (see further documentation in Biography Data).
Contact: Jürgen Schupp

PGEN 2003  

MODE$$ and MONTH$$
Two new variables have been generated for all previous waves to describe interview method and month (MODE$$ or MONTH$$). See also additional documentation.
Contact: Jürgen Schupp
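
The "$$" suffix is a placeholder for the survey year. The sketch below lists the variable names one would expect for all waves, assuming that "$$" is replaced by the two-digit survey year (for example, 03 for 2003) and that the waves cover 1984-2003. The helper name is hypothetical, and the naming rule should be checked against the additional documentation.

```python
def expand_year_suffix(template: str, year: int) -> str:
    """Replace a trailing '$$' with the two-digit survey year (assumed convention)."""
    if not template.endswith("$$"):
        raise ValueError(f"template {template!r} has no trailing '$$'")
    return template[:-2] + f"{year % 100:02d}"


# Expected names of the interview-method variable for survey years 1984-2003.
mode_names = [expand_year_suffix("MODE$$", year) for year in range(1984, 2004)]

if __name__ == "__main__":
    print(mode_names)
```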

$PSBIL
Update of $PSBIL: For foreigners, the category "leave without graduating" [code 6] had to be updated in 2000, which in turn made it necessary to update $BILZEIT, ISCED$$ and CASMIN$$.
Contact: Bettina Isengard and Peter Krause

$FAMSTD
The variable for marital status has been updated.
Contact: Christian Schmitt

HGEN 2003  

HMODE$$ and HMONTH$$
Two new variables were generated for all previous waves to describe interview method and month (HMODE$$ or HMONTH$$). See also additional documentation.
Contact: Jürgen Schupp

PPFAD 2003  

GEBMONAT
The central demographic information in PPFAD has been expanded to include the month of birth (variable GEBMONAT). This information is now collected for all adults and for children as well (see further documentation in Biography Data).
Contact: Christian Schmitt

Update of EINTRITT, ERSTBEFR, AUSTRITT, LETZTBEF (see further documentation).
Contact: Peter Krause

BIOBIRTH 2003  

The information on women's birth biographies was expanded to include information from the Youth Questionnaire, which is given to 16-17 year-olds being interviewed for the first time instead of the standard biographical questionnaire (see further documentation in Biography Data).
Contact: Christian Schmitt 

BIOIMMIG 2003  

This data was corrected to fix a case of miscoding in past years that occurred due to a reversal of the item sequence. This applies to the variables BIEXPRLV, BIEXPRAC and BIEXPRAN (see further documentation in Biography Data).
Contact: Jan Goebel

PFLEGE 2003

The new variable PNRCARE is now available for the years since 1999, that is, for waves P - T. PNRCARE is an invariable number identifying the primary caregiver in a household. In three cases, the person identified as caregiver was identical with the person being cared for. In these cases, PNRCARE was set to -3 (implausible value). For the waves prior to 1999, PNRCARE has been assigned the value -2.
Contact: Rainer Pischner 
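
When working with PNRCARE, the special codes need to be separated from real caregiver identifiers. The following is a minimal sketch of such a check, using only the two codes described above (-3 and -2). It assumes that valid person numbers are positive integers and treats any other value as unknown; the function name and category labels are hypothetical.

```python
IMPLAUSIBLE = -3    # caregiver identical with the person being cared for
NOT_AVAILABLE = -2  # waves prior to 1999


def classify_pnrcare(value: int) -> str:
    """Classify a PNRCARE value into a caregiver id or a special code."""
    if value == IMPLAUSIBLE:
        return "implausible"
    if value == NOT_AVAILABLE:
        return "not available"
    if value > 0:
        # Assumption: positive values are valid person numbers.
        return "caregiver id"
    return "unknown code"


def caregiver_ids(values):
    """Keep only values that look like valid caregiver identifiers."""
    return [v for v in values if classify_pnrcare(v) == "caregiver id"]


if __name__ == "__main__":
    sample = [-2, 101, -3, 202]  # made-up values for illustration
    print(caregiver_ids(sample))
```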

YPBRUTTO 2003  

Revision of HHNRAKT and HHNROLD for persons who were listed twice while living in a previous household.
Contact: Peter Krause 

$EQUIV 2003

All income data since 1984 are coded in euros.

As a supplement to the annual income aggregates offered thus far, we now add the individual income components (sum of all income earned by all household members, variables I111xx$$) with consistent variable names over time.

All information missing due to item-non-response was imputed and marked using flag variables.

All income variables are also included for sample G, but standard weights were used on the basis of sub-samples A-F (see also the additional documentation).
Contact: Markus Grabka

 
