SOEP Data

Linking Possibilities


Regional Data

SOEP offers diverse possibilities for regional and spatial analysis. Using the anonymized regional information on the residences of SOEP respondents (households and individuals), you can link numerous regional indicators at the level of the federal states (Bundesländer), spatial planning regions, districts, and postal codes with the SOEP data on these households. However, specific security provisions must be observed because of the sensitivity of the data under data protection law (see overview). Accordingly, you are not allowed to make statements about, e.g., a specific place of residence or administrative district in your analyses, but the regional data do provide valuable background information.
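As a rough illustration of the linking step described above, the following Python sketch attaches a state-level indicator to household records through a shared state code. Everything in it is an assumption made for illustration: the field names `hid` and `bula`, the indicator `unemployment_rate`, and all values are made up and are not taken from actual SOEP files.

```python
# Hypothetical household records; "hid" and "bula" are illustrative field names.
households = [
    {"hid": 1, "bula": 9},
    {"hid": 2, "bula": 11},
    {"hid": 3, "bula": 9},
    {"hid": 4, "bula": 99},  # state code without a matching indicator entry
]

# Hypothetical state-level indicator table keyed by state code (made-up values).
state_indicators = {
    9: {"unemployment_rate": 3.5},
    11: {"unemployment_rate": 9.0},
}


def link_indicators(records, indicators, key="bula"):
    """Return copies of the records with regional indicators attached.

    Records whose region code has no indicator entry are kept, with the
    indicator fields set to None (the equivalent of a left join).
    """
    indicator_names = sorted({name for row in indicators.values() for name in row})
    linked = []
    for record in records:
        region = indicators.get(record[key], {})
        merged = dict(record)  # copy, so the input records stay unchanged
        for name in indicator_names:
            merged[name] = region.get(name)
        linked.append(merged)
    return linked


linked_households = link_indicators(households, state_indicators)
for row in linked_households:
    print(row)
```

Keeping unmatched households with empty indicator fields, rather than dropping them, makes it easy to see how many records could not be linked before any analysis is run.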


Are more detailed regional data offered?

Regional Analyses with SOEP: Overview

[Figure: overview of regional analyses with SOEP]

The variable $BULA (= Bundesland) is contained in the standard data set. If you need more detailed regional information for your research work, e.g., the municipal size classes, you need an expanded data distribution contract, which consists mainly of a data protection concept to be developed by you.

To use the spatial planning regions (geocodes) you need both an expanded data distribution contract and an expanded data protection concept. After signing your contract, you will receive this data.

During research stays at DIW Berlin or via our SOEPremote remote access system, you can conduct analyses at the level of small-scale official county codes (KKZ), which are considered highly sensitive data under data protection regulations.

The precondition for using SOEPremote is, again, an expanded data distribution contract and an application for the use of SOEPremote (PDF, 67.61 KB), which also constitutes an addendum to your contract. After your access has been activated, you can send your analysis syntax (currently only in Stata format) by e-mail to our server through the remote access system. The server processes the job automatically, after verifying data protection requirements, and sends you the results by e-mail in a log file.

Postal code data can only be used on site at DIW Berlin in order to prevent misuse.

Can data from Federal States be evaluated as representative?

The highly populous states, e.g., Baden-Württemberg, Bavaria, and North Rhine-Westphalia, can be analyzed individually given their large sample sizes. For more detailed structural analyses, however, there is a general risk that the case numbers for individual states are too low to support statistically reliable conclusions. The data can nevertheless be evaluated for "pools" of smaller federal states (e.g., state types).
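The pooling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the respondent counts are invented, the assignment of states to "state types" is hypothetical, and the threshold `MIN_CASES` is an arbitrary example value, not an official SOEP rule.

```python
from collections import Counter

# Hypothetical state codes of respondents (made-up case numbers).
respondent_states = (
    ["BY"] * 120 + ["NW"] * 150 + ["HB"] * 8 + ["HH"] * 15 + ["SL"] * 10
)

# Hypothetical assignment of smaller states to a pooled "state type".
STATE_TYPE = {
    "HB": "city states",
    "HH": "city states",
    "SL": "small territorial states",
}

MIN_CASES = 100  # illustrative threshold, not an official SOEP rule


def pool_small_states(states, state_type, min_cases):
    """Map each state to itself if it has enough cases, else to its state type."""
    counts = Counter(states)
    mapping = {}
    for state, n in counts.items():
        if n >= min_cases:
            mapping[state] = state
        else:
            # States without an assigned type fall into a generic "other" pool.
            mapping[state] = state_type.get(state, "other")
    return mapping


analysis_unit = pool_small_states(respondent_states, STATE_TYPE, MIN_CASES)
pooled_counts = Counter(analysis_unit[s] for s in respondent_states)
print(pooled_counts)
```

Note that a pooled group can still fall below the threshold, so the pooled case numbers should be checked again before drawing conclusions.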

Data protection concept

In your data protection concept for use of the municipal size classes, the following points should be taken into account:

  • Who will work with the data?
  • Procedures for changing passwords on a regular basis
  • Description of the location and type of computer where the data are stored
  • Procedure to prevent use of the data on other computers (including home PCs)

This must be signed by the data protection officer of your institution.

For the use of spatial planning regions (geocodes) you must also take the following points into account:

  • isolated computer (not linked to a network)
  • at least two-stage access control for files
  • authorized users must be able to determine whether unauthorized access by others has occurred or been attempted (through an access log)
  • procedure to prevent use of the data on other computers (including home PCs)
  • limitation of access to central IT facilities
  • regular checks by data protection officer

This data protection concept must be signed by the data protection officer of your institution. An example of such a data protection concept, which would of course have to be adapted to your institution, is available for download (PDF, 18.83 KB).

You are welcome to send us a draft of your data protection concept by

Documentation

You can find a good overview of the possibilities for regional analyses with SOEP data in the following text, which was published in a DIW Berlin series:


Gundi Knies and C. Katharina Spieß:
Regional Data in the German Socio-Economic Panel Study (SOEP) | PDF, 1.66 MB

Peter Hintze, Tobia Lakes:
Data Documentation 46: Geographically Referenced Data in Social Science: A Service Paper for SOEP Data Users | PDF, 3.56 MB

Jan Goebel:
Job submission instructions for the SOEPremote System at DIW Berlin

Contact person: Jan Goebel

SOEP-LEE

There is increasing consensus in the economic and social sciences that the workplace plays a crucial role in individual life outcomes. This holds for economic and sociological labor market research, network and social capital research, health research, research on educational and competency acquisition processes, wage information, and the work-life interface, as well as for inequality research as a whole. For this reason, there has been increasing interest in what are known as "linked employer-employee" (LEE) datasets, in which employees' individual data are linked with information on their employers.

The workplace data collected in the framework of the SOEP-LEE project will substantially expand the information on the work contexts and working conditions of respondents to the Socio-Economic Panel (SOEP) survey. The project was implemented by asking all dependent employees in all of the SOEP samples in 2011 to provide local contact information for their employer. The employer contact data then formed the basis for a standardized employer survey conducted separately from the rest of the SOEP survey. This employer information can be linked with the individual and household data from the SOEP study. The new linked employer-employee dataset opens up new opportunities for wide-ranging forms of secondary analysis with innovative questions from a wide range of disciplines in the social and economic sciences. An additional unique feature of SOEP-LEE is the analysis of employer survey data quality, carried out by measuring meta- and paradata over the course of data collection. As a result, this project also contributes to the ongoing development and refinement of survey methodology in the field of organizational studies.

The perspective of inequality theory provides the thematic focus for the employer survey. The central question is how inequalities in access to key resources and life opportunities come into being at the workplace level. The central dimensions of inequality under consideration here are: income; education and opportunities to realize educational investments in terms of chances of status gain or loss; returns to working life; and opportunities to balance work and family. We assume that different groups of employees within a given company face different restrictions and opportunities; in other words, employee groups differ in their access to further education or in the opportunities available to them to balance career and family. Furthermore, the data make it possible to analyze how different forms of workforce heterogeneity in companies (e.g., employment forms such as limited-term contracts, the use of temporary employment and similar "atypical" employment forms, and the gender composition, age structure, and educational structure of the workforce) contribute to creating inequalities.

Documentation:

Michael Weinhardt, Alexia Meyermann, Stefan Liebig, Jürgen Schupp
The Linked Employer-Employee Study of the Socio-Economic Panel (SOEP-LEE): Project Report

Sebastian Bechmann, Kerstin Sleik
Methodenbericht der Betriebsbefragung des Sozio-oekonomischen Panels

Michael Weinhardt
Datenhandbuch der Betriebsbefragung des Sozio-oekonomischen Panels

TNS Infratest Sozialforschung, Michael Weinhardt
Erhebungsinstrumente und Datenkodierung der Betriebsbefragung des Sozio-oekonomischen Panels

Alexia Meyermann, Jennifer Elsner, Jürgen Schupp, Stefan Liebig
Pilotstudie einer surveybasierten Verknüpfung von Personen- und Betriebsdaten: Durchführung sowie Generierung einer Betriebsstudie als nachgelagerte Organisationserhebung zur SOEP-Innovationsstichprobe 2007 | PDF, 0.72 MB

Contact person: Jürgen Schupp

K2ID-SOEP

Title: K²ID-SOEP extension study

DOI: 10.5684/k2id-soep-2013-15/v1
Collection Period: 2013-2015
Publication date: 2017-09-08
Principal investigators: Pia S. Schober, C. Katharina Spieß
Further Researchers: Juliane F. Stahl, Georg F. Camehl

K2ID is short for “Kinder und Kitas in Deutschland” (“Children and Daycare Centers in Germany”), the German name of the surveys carried out as part of a project entitled “Early childhood education and care quality in the Socio-Economic Panel” (K²ID-SOEP).

The project investigates the effects of the quality of early childhood education and care (ECEC) institutions on children’s development and on parents’ employment and wellbeing. It also examines socio-economic differences in parental choices of ECEC quality and whether these are linked to information asymmetries between mothers and ECEC providers.

The K2ID data collection is based on participants of the Socio-Economic Panel (SOEP). In addition, participants of the “Families in Germany” (FID) study (which is in the process of being integrated with the SOEP) were included in the sampling frame. From this group, respondents with one or more children below school age at the date of the survey were given an additional questionnaire about their child care arrangements, with a focus on quality. If they used an ECEC institution, they were also asked to provide the address of this institution and, if applicable, to identify the specific group that their children attend. In a second step, the directors and group educators of these ECEC institutions were given a questionnaire to collect additional information on quality in the respective setting.

More information on the study and data collection is available on the project homepage, k2id.de.

Since October 2017, the data from the K2ID-SOEP survey have been distributed through the Research Data Center SOEP.

Documentation and Questionnaires

Questionnaires are only available in German.

Questionnaire for parents:

Questionnaire for educators:

Questionnaire for daycare center directors:

SOEP-Record-Linkage

Longitudinal survey of migrants from the social insurance statistics

There is currently a lack of reliable empirical data in two areas of growing importance for future migration and integration research: a) data on the integration of German-born children and grandchildren of immigrants, and b) data on the integration of immigrants from countries that have joined the EU since 2004. In cooperation with the Institute for Employment Research (IAB) in Nuremberg, the SOEP created two samples of immigrants to Germany based on administrative data from the Federal Employment Agency, to be continued as a longitudinal household survey in the SOEP study framework. The initial linkage of survey data on migrants with administrative data, including an experiment on agreement to the linkage of register data, opens up new analytical potential for research and policy advice and is of major importance for the research infrastructure in Germany.

Please note that data from both samples can be linked with administrative employment and income data; survey respondents are asked to give explicit consent to the record linkage. Because this linked dataset contains social data, these weakly anonymized data are only accessible via the Research Data Center of the German Federal Employment Agency at the IAB (FDZ IAB). Researchers can access FDZ IAB data through a guest visit to the IAB or through remote data processing, also arranged with the IAB. The linked data will soon be available to external researchers. Requests for data access should be directed to FDZ IAB, since a contract with IAB for data use is required.

Documentation

Philipp Simon Eisnecker, Klaudia Erhardt, Martin Kroh, Parvati Trübswetter
The Request for Record Linkage in the IAB-SOEP Migration Sample | PDF, 433.04 KB

Philipp Eisnecker, Martin Kroh
The Informed Consent to Record Linkage in Panel Studies: Optimal Starting Wave, Consent Refusals, and Subsequent Panel Attrition

Contact person: Philipp Eisnecker

microm

The Microm-SOEP dataset enables users to link SOEP data with small-scale indicators from the micro-marketing provider microm. The Microm indicators have been matched with SOEP data at the housing block level. To protect the confidentiality of respondents’ data in accordance with data protection law, the data linkage was carried out on site at Kantar Public, the survey institute responsible for the SOEP fieldwork and the only party that knows respondents’ addresses.

All survey households remain completely anonymous. For security reasons, given the small-scale nature of the data, analysis is only possible on specially protected SOEP computers on site at DIW Berlin.

Documentation (German only):

Jan Goebel, C. Katharina Spieß, Nils R. J. Witte, Susanne Gerstenberg
Die Verknüpfung des SOEP mit MICROM-Indikatoren: Der MICROM-SOEP-Datensatz | PDF, 0.75 MB

Contact person: Jan Goebel

SOEP-RV

In cooperation with the Research Data Center of the German Pension Insurance (FDZ-RV), we are implementing a record linkage of SOEP household survey data with administrative individual employment and retirement biographies, which are available on a monthly basis for employees from age 14 onward.

By combining the two data sources, we complete SOEP’s biographical data and, at the same time, enrich the individual insurance biographies with numerous SOEP household- and individual-level context variables. The record-linked data, called SOEP-RV, will be provided to researchers for local use on their own computers. We expect the data to become a leading dataset for empirical research on households’ life cycles, employment and retirement decisions, and earnings and pension distributions.


Contact person: Holger Lüthen

Neighbourhood Effects