
Linking Possibilities

Short Description

In addition to data from our main studies, SOEP-Core and SOEP-IS, the SOEP Research Data Center (SOEP-RDC) offers a number of other datasets. These provide diverse possibilities for data linkage, for example in spatial or regional analysis.


The Microm-SOEP dataset enables users to link SOEP data with small-scale indicators from the micro-marketing provider microm. The Microm indicators have been matched with SOEP data at the housing block level. To protect the confidentiality of respondents’ data in accordance with data protection law, the data linkage was carried out on site at Kantar Public, the survey institute responsible for the SOEP fieldwork and the only party that knows respondents’ addresses.

All survey households remain completely anonymous. For security reasons—due to the small-scale nature of the data—analysis is only possible on specially protected SOEP computers on site at DIW Berlin.

Jan Goebel, C. Katharina Spieß, Nils R. J. Witte, Susanne Gerstenberg
Die Verknüpfung des SOEP mit MICROM-Indikatoren: Der MICROM-SOEP-Datensatz (PDF, 0.75 MB)


Neighbourhood Effects

The project "neighbourhood effects" aims to combine existing individual-level datasets at a small-scale regional level with information on the respective neighbourhood, which is generated from different data sources. In a second step, the role of neighbourhood effects on various outcome variables is analyzed in a social context. Possible approaches include, among others, investigating the importance of neighbourhood effects for individual labor market success or for the individual likelihood of receiving welfare benefits.

Cooperation partners at the FDZ der Bundesagentur für Arbeit im Institut für Arbeitsmarkt- und Berufsforschung:

Stefan Bender (Project Head)
Theresa Scholz (Project Liaison)

Cooperation partners at the Rheinisch-Westfälisches Institut für Wirtschaftsforschung e. V.:

Matthias Vorell (Project Liaison)
Thomas K. Bauer (Project Liaison)

Data available at the RDC Ruhr as RWI-GEO-LAB data. Metadata: DOI 10.7807/DIWIABRWI:V1



SOEP survey data linked to administrative data of the IAB (SOEP-CMI-ADIAB)

The record linkage data product SOEP-CMI-ADIAB is provided cooperatively by the Socio-Economic Panel (SOEP) at the German Institute for Economic Research (DIW Berlin) and the Institute for Employment Research (IAB), Nuremberg. SOEP-CMI-ADIAB consists of the survey data of SOEP respondents and administrative data of the IAB, provided that the respondents consented to their records being linked. The survey data of SOEP-CMI-ADIAB comprise data from SOEP-Core, the IAB-SOEP Migration Sample, the IAB-BAMF-SOEP Survey of Refugees, and the SOEP Innovation Sample.

SOEP-CMI-ADIAB can now be used to investigate research questions that draw on both the rich socio-demographic information of the SOEP and the precise income information from the administrative data sources, which include earnings biographies starting in 1975.

Since the linked dataset contains weakly anonymized social data, the datasets are only accessible on site at the Research Data Center of the German Federal Employment Agency at the IAB (FDZ IAB). Researchers can use FDZ IAB data on a guest visit to the IAB or through remote data processing, which can also be arranged with the IAB.


Data access requires both a data distribution contract with the SOEP and a contract with the FDZ IAB. For further information, visit the website of the FDZ IAB.

DOI: 10.5164/IAB.FDZD.2303.en.v1

Antoni, Manfred; Beckmannshagen, Mattis; Grabka, Markus M.; Keita, Sekou; Trübswetter, Parvati (2023): Survey data of SOEP Core, IAB-SOEP Migration Sample, IAB-BAMF-SOEP Survey of Refugees and SOEP Innovation Sample linked to administrative data of the IAB (SOEP-CMI-ADIAB) 1975-2020. FDZ-Datenreport, 03/2023(en), Nuremberg.


SOEP survey data linked to administrative data of the German Pension Insurance

For SOEP-RV, the Research Data Center of the German Pension Insurance (FDZ-RV) cooperates with the Socio-Economic Panel (SOEP) at the DIW Berlin (German Institute for Economic Research). The project was initially funded by the Forschungsnetzwerk Alterssicherung (FNA). SOEP-RV links SOEP survey data with anonymized, high-quality social security data from administrative pension records. The long time frame of the social security data provides unique possibilities for research combining administrative and survey information, such as studies addressing new questions of long-term inequality or policy reform effects. In particular, SOEP-RV offers significant potential for research on pensions and old age, and for research on methodological questions such as the consistency of self-reported versus administrative information. The project has been established on a permanent basis: annual (RTBN) and biennial (VSKT) updates of the SOEP-RV data will be provided, and efforts to enlarge the sample with every new wave of the SOEP will continue.

A crucial condition for inclusion of SOEP data in SOEP-RV is that record linkage is only carried out with the express written consent of the SOEP respondents. A majority of respondents give the German Pension Insurance explicit consent to provide information from their pension records. Up to now, about 14,500 SOEP-Core and SOEP-IS respondents have consented to the record linkage.

The Research Data Center of the German Pension Insurance provides complementary longitudinal data in the format of the VSKT, which is generally used for statistical and research purposes, and the RTBN cross-sectional data set for pensioners:

  • The data in the format of the VSKT include demographic variables and some pension-related variables, plus a module on the social income situation. This includes monthly information for socially insured employees and statutorily insured pensioners. However, it does not include state employees or the self-employed.
  • The RTBN provides information on pensions, including for example when the pension started, the key factors of the pension calculation, and the pension paid.

The data can only be used for research purposes. They are provided as scientific use files in German and English for the software packages Stata and SPSS. Please note that the administrative records of the German Pension Insurance and the SOEP data must be ordered separately, but the data can easily be linked by the users. For both datasets, it is mandatory to fill out an application form and sign a data usage contract with the respective research data center. After this, the data will be made available via download.

An application for the administrative data of the German Pension Insurance can be submitted through FDZ-RV.
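
Once both deliveries are available, linking them is a keyed merge. The following Python sketch only illustrates that idea and is not the official procedure. This page does not name the linking key, so `link_id` is a hypothetical identifier, and `pension_reported` and `pension_paid` are hypothetical variable names. All values are invented. It also shows one of the methodological uses mentioned above, comparing self-reported with administrative information.

```python
# Invented toy records. "link_id", "pension_reported" and "pension_paid"
# are hypothetical names, not actual SOEP-RV variables.
soep_survey = [
    {"link_id": "A1", "pension_reported": 1200.0},
    {"link_id": "A2", "pension_reported": 950.0},
    {"link_id": "A3", "pension_reported": 1500.0},  # no consent, so no admin record
]

# Administrative pension records exist only for consenting respondents.
rtbn_records = [
    {"link_id": "A1", "pension_paid": 1185.5},
    {"link_id": "A2", "pension_paid": 950.0},
]


def compare_reported_to_admin(survey, admin):
    """Return reported-minus-administrative pension for every linked person."""
    admin_by_id = {row["link_id"]: row for row in admin}
    result = []
    for person in survey:
        record = admin_by_id.get(person["link_id"])
        if record is None:
            continue  # respondent without a linked administrative record
        diff = person["pension_reported"] - record["pension_paid"]
        result.append({"link_id": person["link_id"], "difference": round(diff, 2)})
    return result


differences = compare_reported_to_admin(soep_survey, rtbn_records)
print(differences)
```

Respondents without a matching administrative record simply drop out of the comparison, which mirrors the consent requirement described above.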

DOI: 10.1515/jbnst-2021-0020

Lüthen, Holger; Schröder, Carsten; Grabka, Markus M.; Goebel, Jan; Mika, Tatjana; Brüggmann, Daniel; Ellert, Sebastian; Penz, Hannah (2022): SOEP-RV: Linking German Socio-Economic Panel Data to Pension Records. Jahrbücher für Nationalökonomie und Statistik 242 (2022), 2, S. 291-307.

Codebooks of the SOEP-RV datasets can be accessed through the FDZ-RV.


The SOEP provides various linked employer-employee datasets. Some of them stem from the two SOEP-LEE studies, both of which included data collection from establishments that employ SOEP-Core participants. The first SOEP-LEE study collected one wave of data in 2012, while SOEP-LEE2 added two more waves in 2022 and 2024 (scheduled). SOEP-LEE2 also comprises a business-related survey of self-employed SOEP-Core participants, which was fielded in 2022 and 2024 (scheduled), extending the 2020 wave contributed by the INNOMSME study. Within SOEP-LEE2, additional data were collected from establishments that are not linkable to SOEP-Core, but that received a similar questionnaire, resulting in a larger dataset for company-level analyses.

The two SOEP-LEE studies collected data on different topics. The first SOEP-LEE study focused on organization and management, human resources policies, wages and inequality, and the financial situation of the establishments. SOEP-LEE2 kept some of these topics but focused mainly on workplace digitalization, the organization of work, personnel management and development, and, in its 2022 wave, the COVID-19 pandemic. Starting with the 2024 wave, further emphasis was placed on the shortage of labor and on questions concerning cybersecurity. In the 2020 wave, the self-employed survey asked about innovation and productivity, R&D, (intangible) capital, and perceptions of one's own entrepreneurial activity. The 2022 wave continued with these themes but adopted some questions from the SOEP-LEE2 establishment questionnaire for greater coherence.

Data access
Researchers who wish to access the data can do so by ordering the SOEP-Core EU edition (see SOEP data access) or by visiting the Research Data Center of the SOEP (see SOEP-in-Residence program). In the RDC SOEP, researchers have access to the onsite edition, which provides some variables of the SOEP-LEE2 data with more detailed scales and categories. Data from the first SOEP-LEE study are only available in the RDC SOEP.

Data Structure and Linkage
We provide the data of the two SOEP-LEE studies in different datasets. Data of the first SOEP-LEE study are contained in the datasets slee_estab and slee_sample. slee_estab includes the data collected in the establishment survey, while slee_sample is the linkage file that contains SOEP-Core person identifiers (pid) and establishment identifiers (eid), allowing for linkage with SOEP-Core.

Data from the SOEP-LEE2 employer survey are distributed in the datasets lee2estab, lee2brutto, and lee2person. lee2estab contains the survey data themselves, while lee2brutto provides additional fieldwork information. lee2person is the linkage file that contains SOEP-Core person identifiers (pid) and establishment identifiers (eid), allowing for linkage with SOEP-Core. Note that it is not possible to combine the 2012 wave of SOEP-LEE with the subsequent waves of SOEP-LEE2 into a single panel dataset: the 2012 wave uses different establishment identifiers, even if, by chance, the same establishment was surveyed.

Data for the self-employed is provided in the selfempl dataset. Each individual's business is identified by the SOEP-Core person identifier (pid) so that no further linkage file is required. 
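
The linkage logic described above can be sketched in a few lines of Python. This is a minimal illustration, not SOEP tooling. Only the identifiers pid and eid and the dataset names come from this page. The records are invented, and `n_employees` and `job_satisfaction` are hypothetical variable names. The sketch also assumes a single wave, so that each pid maps to at most one eid.

```python
# Toy stand-ins for the real files; all values are invented.
# lee2person: linkage file, one row per linked employee (pid -> eid).
lee2person = [
    {"pid": 101, "eid": 9001},
    {"pid": 102, "eid": 9001},
    {"pid": 103, "eid": 9002},
]

# lee2estab: establishment survey data, one row per establishment (eid).
# "n_employees" is a hypothetical variable name.
lee2estab = [
    {"eid": 9001, "n_employees": 250},
    {"eid": 9002, "n_employees": 12},
]

# SOEP-Core person-level data keyed by pid.
# "job_satisfaction" is a hypothetical variable name.
soep_core = [
    {"pid": 101, "job_satisfaction": 7},
    {"pid": 102, "job_satisfaction": 5},
    {"pid": 103, "job_satisfaction": 9},
    {"pid": 104, "job_satisfaction": 6},  # no linked establishment
]


def link_employees_to_establishments(persons, linkage, establishments):
    """Attach establishment variables to person records via the linkage file.

    Persons without an entry in the linkage file are dropped (inner join).
    """
    eid_by_pid = {row["pid"]: row["eid"] for row in linkage}
    estab_by_eid = {row["eid"]: row for row in establishments}

    linked = []
    for person in persons:
        eid = eid_by_pid.get(person["pid"])
        if eid is None or eid not in estab_by_eid:
            continue  # respondent has no surveyed establishment
        linked.append({**person, **estab_by_eid[eid]})
    return linked


linked = link_employees_to_establishments(soep_core, lee2person, lee2estab)
for row in linked:
    print(row)
```

Because the 2012 wave and SOEP-LEE2 use different establishment identifiers, the same steps would be applied separately to slee_sample with slee_estab and to lee2person with lee2estab, rather than pooling eid values across the two studies. For selfempl, the linkage step is unnecessary because its rows are already keyed by pid.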

For the first SOEP-LEE study, please cite: Weinhardt, M.; Meyermann, A.; Liebig, S.; Schupp, J. (2017). The Linked Employer-Employee Study of the Socio-Economic Panel (SOEP-LEE): Content, Design and Research Potential. Jahrbücher für Nationalökonomie und Statistik 237(5), 457–467.

For SOEP-LEE2, please cite: Matiaske, W., Schmidt, T. D., Halbmeier, C., Maas, M., Holtmann, D., Schröder, C., Böhm, T., Liebig, S., and Kritikos, A. S. (2023). SOEP-LEE2: Linking Surveys on Employees to Employers in Germany. Jahrbücher für Nationalökonomie und Statistik Data Observer, 1–14.

Questions and variables are documented as part of SOEP-Core. Moreover, the following documentation is currently available:

slee_estab, 2012:

  • Questionnaire, pseudo-PAPI, PDF (de (PDF, 2.76 MB))
  • Codebook (de (PDF, 335.4 KB))
  • Methodological report (de (PDF, 3.1 MB))
  • Project Report (en (PDF, 1.45 MB))

lee2estab, 2022:

  • Questionnaire, pseudo-PAPI, PDF (de (PDF, 459.86 KB))
  • Questionnaire with variable names, PDF (de (PDF, 212.49 KB))

selfempl, 2020:

  • Questionnaire with variable names, PDF (de)


Title: K²ID-SOEP extension study

DOI: 10.5684/k2id-soep-2013-15/v1
Collection Period: 2013-2015
Publication date: 2017-09-08
Principal investigators: Pia S. Schober, C. Katharina Spieß
Further Researchers: Juliane F. Stahl, Georg F. Camehl

K2ID is short for “Kinder und Kitas in Deutschland” (children and daycare centers in Germany), the German name of the surveys carried out as part of a project entitled “Early childhood education and care quality in the Socio-Economic Panel” (K²ID-SOEP).

The project aims to investigate the effects of the quality of early childhood education and care (ECEC) institutions on children’s development and on parents’ employment and wellbeing. It also examines socio-economic differences in parental choices of ECEC quality and whether they are linked to information asymmetries between mothers and ECEC providers.

The data collection of K2ID is based on participants of the Socio-Economic Panel (SOEP). In addition, participants of the “Families in Germany” (FID) study (which is in the process of being integrated with the SOEP) were included in the sampling frame. From this group, those with one or more children below school age at the date of the survey were given an additional questionnaire concerning their child care arrangements, with a focus on quality. If they used an ECEC institution, they were also asked to provide the address of this institution and, if applicable, to identify the specific group that their children attend. In a second step, the ECEC institution directors and group educators were also given a questionnaire to collect additional information on quality in the respective setting.

More information on the study and data collection is available on the project homepage.

Since October 2017, the data from the K2ID-SOEP survey have been distributed through the Research Data Center of the SOEP.

Documentation and Questionnaires

Questionnaires are currently available only in German; English versions will follow.

Questionnaire for parents:

Questionnaire for child-minders:

Questionnaire for daycare managers:



Philipp Kaminsky and Antonia Meier
User support and contract management for the Research Data Center of the SOEP in the German Socio-Economic Panel study Department