The 2006 SOEP data distribution (1984-2005, Waves A-V) includes the usual wave-specific data VPBRUTTO, VP, VPKAL, VPGEN, VHBRUTTO, VH, VHGEN, VKIND and UPLUECKE, as well as updated versions of all datasets with a longitudinal component (spell data, biographical data, and weights).
The first CD-ROM contains, as usual, all SOEP data with variable labels and value labels in German, and the second contains all SOEP data with variable labels and value labels in English.
Please also note the following improvements and changes:
New and renamed datasets 2005
With the current data distribution, we renamed all SOEP datasets based on age-specific biographical questionnaires (e.g., "Mother and Child") in a more consistent manner. Since all these datasets are saved in long format, the names now start with "BIOAGE" and a two-digit suffix. This suffix gives the maximum age of the individuals in question during the survey year.
New name for the dataset previously known as BIOCHILD (based on the questionnaire for mothers with a newborn child below the age of 15 months).
New dataset based on mother-and-child questionnaire for mothers with a child between the ages of 2 and 3 years. For further information, please see the biographical data documentation.
New name for the dataset previously known as BIOYOUTH (based on a survey of adolescents between 16 and 17 years old).
The 2005 cross-sectional weights are provisional; an update of VPHRF and VHHRF will be released in fall 2006.
The wave-specific projection and weighting variables will be adjusted annually to external official data to ensure the accuracy of marginal distributions on age, sex, household size and nationality. The source of the data is the German Federal Statistical Office's official microcensus. From 2005 on, the data on Berlin will no longer be reported separately for the areas comprising former West Berlin / East Berlin; rather, Berlin will be considered part of East Germany. As a consequence, the data required to adjust our weights to the official marginal distributions will not be available before fall 2006.
To prevent this from causing a delay in the distribution of the SOEP data up to Wave V (2005), the weights (VPHRF* and VHHRF*) have been adjusted to the data used for Wave U (2004).
In our experience, the benchmark data change very little from year to year (the new definition for West Berlin / East Berlin being one exception). Please keep in mind the provisional nature of the weighting scheme, and indicate this explicitly in any publications using the weights for Wave V. We will inform you via the SOEP NEWSLETTER and listserver as soon as the final version, based on the 2005 microcensus data, becomes available.
The adjusted screener (AHINC$$) is now available for all waves (Exception: Sample C in 1990/1991).
Raw categories for the size of the company. A consistent variable over all waves for the size of the company ("least common denominator" of the variable BETR$$).
The variable BETR$$ now has eleven instead of nine categories. This reflects the more detailed questions asked from Wave V onwards. The old category "5 to 20 employees" is now split into two categories ("5 to 10 employees" and "11 to 20 employees").
The new categories are:
TIP: The variable ALLBET$$ in the dataset $PGEN offers consistent data on company size throughout all waves of the SOEP, although with fewer categories in a less detailed classification.
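If you work with BETR$$ directly rather than with ALLBET$$, the split category can be merged back into its pre-Wave V form for comparisons across waves. The following Python sketch illustrates the idea. It operates on the category labels quoted above, because the numeric codes are not listed here; the function name and the use of label strings are assumptions made for illustration only.

```python
# Minimal sketch: collapse the two company-size categories introduced with
# Wave V back into the single category used in earlier waves.
# Assumption: values are handled as label strings; the numeric BETR$$ codes
# are not given in this text and must be taken from the documentation.
SPLIT_TO_OLD = {
    "5 to 10 employees": "5 to 20 employees",
    "11 to 20 employees": "5 to 20 employees",
}


def harmonize_company_size(label):
    """Return the pre-split category for a company-size label.

    Labels that were not affected by the split are returned unchanged.
    """
    return SPLIT_TO_OLD.get(label, label)


# Example: Wave V labels mapped to the older, coarser category.
wave_v_labels = ["5 to 10 employees", "11 to 20 employees"]
harmonized = [harmonize_company_size(label) for label in wave_v_labels]
```

Collapsing to the coarser category loses the extra detail available from Wave V onwards, so it is only useful when earlier waves are part of the analysis.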
Employment Status. A consistent variable over all waves to differentiate employment status (in addition to the variable LFS$$, which differentiates non-employed persons).
Working experience full-time employment. Coverage of complete working experience in full-time employment (in years, one digit after the decimal point).
Working experience part-time employment. Coverage of complete working experience in part-time employment (in years, one digit after the decimal point).
Unemployment experience. Coverage of unemployment experience throughout the entire period of working life (in years, one digit after the decimal point).
Contact: Silke Anger
Social assistance to the elderly ("Grundsicherung im Alter").
Imputation flag: Social assistance to the elderly.
Losses from renting and leasing.
Imputation flag: losses from renting and leasing.
Losses from capital investment.
Imputation flag: losses from capital investment.
Race of individual
data already included in the variables M11124$$.
data already included in the variables M11125$$.
Contact: Markus Grabka
Correction of [T-U]HPOP in HPFAD.
Correction of some individual and household weights for the years 2003 and 2004 (THHRF, UPHRF, and UHHRF).
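The wave letters that prefix the names above (T and U for 2003 and 2004, V for 2005) follow the alphabetical scheme of this distribution, in which Waves A-V cover 1984-2005. The following Python sketch converts between wave letters and survey years. The function names are hypothetical, and the mapping is assumed to hold only for the waves contained in this distribution.

```python
import string

# Waves A-V correspond to the survey years 1984-2005 in this distribution.
FIRST_WAVE_YEAR = 1984
WAVES = string.ascii_uppercase[:22]  # "A" through "V"


def wave_year(letter):
    """Return the survey year for a wave letter, e.g. 'V' -> 2005."""
    letter = letter.upper()
    if len(letter) != 1 or letter not in WAVES:
        raise ValueError(f"not a wave of this distribution: {letter!r}")
    return FIRST_WAVE_YEAR + WAVES.index(letter)


def year_wave(year):
    """Return the wave letter for a survey year, e.g. 2004 -> 'U'."""
    offset = year - FIRST_WAVE_YEAR
    if not 0 <= offset < len(WAVES):
        raise ValueError(f"year outside this distribution: {year}")
    return WAVES[offset]


# Example: the weights corrected in this release belong to these years.
corrected_years = [wave_year(name[0]) for name in ("THHRF", "UPHRF", "UHHRF")]
```

Here `corrected_years` evaluates to `[2003, 2004, 2004]`, matching the years named in the correction note.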