Like status variables, generated variables are intended to simplify work with SOEP data. Their generation rests on specific assumptions, which are described in the documentation. Please see the files $PGEN (PDF, 0.66 MB) and $HGEN (PDF, 0.64 MB) in the documentation (Contact: Joachim Frick).
Use the Stata command label language EN to switch to the English labels.
Depending on your research focus you have different possibilities:
As a matter of principle, a person's sample classification does not change, either through a change in citizenship or through a move to another sample region (from west to east Germany or vice versa). The person remains in the foreigner, west or east sample. Current citizenship and region of residence can nevertheless be read directly from the variables NATION$$ and $SAMPREG.
Since the beginning of the SOEP, numerous respondents have moved from east Germany to west Germany and, to a lesser extent, in the opposite direction. Analyses with a regional focus can therefore be severely distorted if the variable PSAMPLE (which indicates sample classification) is used to define the region.
(PSAMPLE can be found in PPFAD: 1 = subsample A, 2 = subsample B, 3 = subsample C, 4 = subsample D (Immigrants), 5 = subsample E (supplementary sample from 1998 onwards), 6 = subsample F (innovation sample from 2000 onwards)).
A correct regional classification of persons within the survey can only be achieved with the use of the time-dependent variables $SAMPREG in PPFAD and HPFAD (1 = West Germany, 2 = East Germany).
Since 1990 the west German and east German populations have been determined in $SAMPREG irrespective of the sample classification. We therefore recommend always using this value for regional analysis!
The following table (PDF, 5.69 KB), produced by cross-tabulating $SAMPREG and PSAMPLE, gives an insight into the extent of regional mobility since 1990 (basis: all persons with $NETTO=1 (person interviews) or $NETTO=2 (children up to 16 years) in surveyed households).
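A cross-tabulation of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are invented toy data (not SOEP data), the keys persnr, psample and sampreg are hypothetical lower-case stand-ins for the wave-specific variables, and the helper crosstab is not part of any SOEP tool. Only the codes (1 = West Germany, 2 = East Germany; 1 to 6 = subsamples A to F) are taken from the text above.

```python
from collections import Counter

# Hypothetical toy records, NOT real SOEP data. Each dict stands for one
# person in one wave; "psample" and "sampreg" use the codes quoted above.
persons = [
    {"persnr": 101, "psample": 1, "sampreg": 1},
    {"persnr": 102, "psample": 1, "sampreg": 2},  # sample A, living in East Germany
    {"persnr": 201, "psample": 3, "sampreg": 2},
    {"persnr": 202, "psample": 3, "sampreg": 1},  # sample C, living in West Germany
    {"persnr": 203, "psample": 3, "sampreg": 1},
    {"persnr": 301, "psample": 2, "sampreg": 1},
]

SAMPLE_LABELS = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F"}
REGION_LABELS = {1: "West Germany", 2: "East Germany"}


def crosstab(records, row_key, col_key):
    """Count records for every (row value, column value) combination."""
    return Counter((r[row_key], r[col_key]) for r in records)


# Rows: current region of residence; columns: original sample membership.
table = crosstab(persons, "sampreg", "psample")

for (region, sample), n in sorted(table.items()):
    print(f"{REGION_LABELS[region]:<13} sample {SAMPLE_LABELS[sample]}: {n}")
```

Cells in which the region of residence does not match the region a sample was originally drawn in correspond to persons who have moved. These are the cases that a regional analysis based on PSAMPLE alone would misclassify.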
Analogous to the phenomenon above ($SAMPREG vs. PSAMPLE), sample B is often equated with the "foreigners" surveyed by the SOEP, and sample A with the "Germans". For the most part this is correct, but it is imprecise and becomes less accurate over time.
At the beginning of the SOEP in 1984, the nationality of the head of household determined whether a household was assigned to sample A or sample B. Other household members may nevertheless have a different nationality from that of the head of household. In addition, sample A contains foreigners whose nationality was not represented in sample B. Samples C and D differ enormously in this respect: up to the year 2000, sample C consists almost exclusively of persons with German nationality, whereas sample D, owing to its high share of ethnic German immigrants, contains a relatively large number of Germans.
An ex-ante classification of the respective persons in "German" and "non-German" is impossible in the latest samples E and F due to the sample designs.
The following table (PDF, 6.3 KB), created by cross-tabulating the recoded information in NATION$$ (1 = German, 2 = non-German, including item non-response) with PSAMPLE, gives an insight into the heterogeneity of the SOEP samples with regard to nationality since 1984 (basis: all persons with $NETTO=1 (person interview)).
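The recoding step behind such a table might look as follows in Python. This is a minimal sketch, not the procedure actually used for the PDF: the raw nationality codes are assumptions made for illustration (1 for German, other positive values for other nationalities, negative values for item non-response), and the records, the key names and the function recode_nation are hypothetical.

```python
from collections import Counter

GERMAN_CODE = 1  # assumption: raw code that stands for German nationality


def recode_nation(raw_code):
    """Collapse a raw nationality code into 1 = German, 2 = non-German.

    Everything that is not the German code, including missing values and
    item non-response, ends up in category 2, as described in the text.
    """
    return 1 if raw_code == GERMAN_CODE else 2


# Hypothetical toy records, NOT real SOEP data.
persons = [
    {"psample": 1, "nation": 1},
    {"psample": 1, "nation": 2},   # non-German person in sample A
    {"psample": 2, "nation": 2},
    {"psample": 2, "nation": 1},   # German person in sample B
    {"psample": 4, "nation": 1},
    {"psample": 4, "nation": -1},  # assumed code for item non-response
]

# Rows: recoded nationality; columns: sample membership.
nationality_by_sample = Counter(
    (recode_nation(p["nation"]), p["psample"]) for p in persons
)

for (nation, sample), n in sorted(nationality_by_sample.items()):
    label = "German" if nation == 1 else "non-German"
    print(f"{label:<10} PSAMPLE={sample}: {n}")
```

Because item non-response is folded into the non-German category, the non-German counts in such a table are an upper bound rather than an exact figure.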
As part of a random split of the sample, the new survey method CAPI (Computer-Assisted Personal Interview) was used in about half of the cases in Sample E. These interviews can be identified using the variables $PFORM* in $PBRUTTO or $HFORM* in $HBRUTTO.
Early analyses show no signs of significant method effects, i.e. the substantive results do not appear to have been influenced by the method of data collection. Further analysis of possible method effects by users would nevertheless be advisable.
Since 2001, this survey method has been increasingly adopted for the old subsamples A to D, as well as F.