1. Dataset $PGEN: Variable casmin$$
A missing parenthesis in the generating code led to individuals in CASMIN category 6 ("(2c_gen) general maturity certificate") being mistakenly placed in CASMIN category 7 ("(2c_voc) vocational maturity certificate").
For wave BG, this means that of the 4,553 observations in category 7, 1,976 actually belong in category 6 and 2,577 in category 7.
This can be corrected with the existing variables in the $PGEN data. For wave BG, it can be done as follows:
    replace casmin16 = 6 if inlist(bgpsbil,3,4) | bgpsbila==3 | bgpsbilo==3
    replace casmin16 = 7 if (inlist(bgpsbil,3,4) | bgpsbila==3 | bgpsbilo==3) & (inlist(bgpbbila,2,3,5,6,8) | (bgpbbil01>=1 & bgpbbil01<.) | (bgpbbilo>=1 & bgpbbilo<.))
    replace casmin16 = 8 if inlist(bgpbbil02,1,4)
    replace casmin16 = 9 if inlist(bgpbbil02,2,3,5,6,7,8) | inlist(bgpbbila,4,7,9)
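One way to check the effect of the correction is to keep a copy of the uncorrected variable and cross-tabulate it against the corrected one. The following is a minimal sketch, not part of the official correction. It assumes the wave BG file is available as bgpgen.dta in the working directory, and the name casmin16_orig is hypothetical.

```stata
* Sketch only: the file name bgpgen.dta and its location are assumptions.
use bgpgen, clear

* Keep the uncorrected values for comparison (casmin16_orig is a hypothetical name).
clonevar casmin16_orig = casmin16

* ... run the four replace statements for wave BG here ...

* Cross-tabulate uncorrected against corrected categories.
tabulate casmin16_orig casmin16 if inlist(casmin16_orig, 6, 7), missing
```

The row for the uncorrected category 7 can then be compared with the figures given above for wave BG.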
2. Dataset [BE-BG]PGEN: Variable [be-bg]pbbila ("Vocational Degree Outside Germany")
The variable $$pbbila (foreign degrees – vocational education) was expanded retrospectively in SOEP v33 to include information on whether the degree had been completed. This revision, however, did not take into account some of the information collected in certain modules. A correction can be made with the existing variables in the $PGEN data, as shown in the statement file (TXT, 2.72 KB). The affected variables are:
| Dataset | Variable | Variable Label |
| bepgen | bepbbila | Vocational Degree Outside Germany |
| bfpgen | bfpbbila | Vocational Degree Outside Germany |
| bgpgen | bgpbbila | Vocational Degree Outside Germany |
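To inspect the three affected variables before and after applying a correction, a short loop over the wave prefixes can help. This is only a sketch: it assumes the files bepgen.dta, bfpgen.dta and bgpgen.dta are in the working directory, and it does not reproduce the correction itself.

```stata
* Sketch only: file names and their location are assumptions.
foreach w in be bf bg {
    use `w'pgen, clear
    * Distribution of the vocational degree variable, including missing codes.
    tabulate `w'pbbila, missing
}
```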
3. Dataset BIOAGEL: Variable bioage
In the dataset BIOAGEL, the data type of the variable bioage was not adjusted. The variable indicates which questionnaire each row of data was taken from. Because bioage has contained values above 99 since v33, those values were cut off in Stata. The affected values are:
| Variable | Value | Label |
| bioage | 101 | “bioage10a” |
| bioage | 102 | “bioage10b(only FID)” |
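Whether a given copy of the data is affected can be checked by looking at the storage type and the observed values of bioage. The sketch below assumes the file is available as bioagel.dta in the working directory. The source does not name the storage type, but if bioage was stored as byte, which in Stata holds integers from -127 to 100, the values 101 and 102 could not be stored.

```stata
* Sketch only: the file name bioagel.dta and its location are assumptions.
use bioagel, clear

* Show the storage type of bioage (byte cannot hold 101 or 102).
describe bioage

* Check whether the values 101 and 102 are present.
tabulate bioage, missing
count if bioage > 99 & !missing(bioage)
```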
4. Dataset CIRDEF: Variable rgroup
The variable rgroup divides the SOEP sample into 20 equally sized groups. It is used to select the 50% sample. Since the new samples M3 and M4 were incorrectly assigned, there are no cases from these samples in the teaching version of the SOEP data.