The SOEP supports efforts in the scientific community to make data easily available for replication and reanalysis. At the same time, the SOEP is obligated to ensure that respondents’ data are used solely for scientific purposes. This means that data users have to sign a data distribution contract with DIW Berlin and are forbidden from disseminating any part of the data to third parties.
We urge all researchers to archive the syntax they used to prepare and analyze the SOEP data. To enable replication, the syntax should state the specific version of the SOEP data used, which can be clearly identified by its DOI.
Syntax can be archived in a free data repository such as the Open Science Framework (OSF: https://osf.io) or on the website of the journal publishing the paper. The SOEP Research Data Center also offers to archive the syntax files that researchers used to prepare and analyze the SOEP data, and to make these files available for download from its website.
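One way to follow this recommendation is to put a short provenance header at the top of every syntax file. The following is only a minimal sketch, assuming the syntax is written in Python: the class name `DataProvenance`, its fields, and the `#` comment prefix are illustrative assumptions, and the dataset name and DOI shown are placeholders, not a real SOEP release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProvenance:
    """Identifies the exact data release an analysis script was run against.

    Hypothetical helper for illustration; not part of any SOEP tooling.
    """
    dataset: str  # human-readable name of the data release
    doi: str      # DOI of that release, as given in its documentation

    def header(self) -> str:
        """Return a comment block to place at the top of a syntax file."""
        # Every DOI begins with the directory indicator "10."
        if not self.doi.startswith("10."):
            raise ValueError(f"not a DOI: {self.doi!r}")
        return "\n".join([
            f"# Data: {self.dataset}",
            f"# DOI: https://doi.org/{self.doi}",
        ])


# Placeholder values only; substitute the release and DOI you actually used.
provenance = DataProvenance(dataset="SOEP (example release)", doi="10.0000/placeholder")
print(provenance.header())
```

If your syntax files are written in another language, adapt the comment prefix accordingly; the point is only that the data version and its DOI travel with the archived syntax.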
Some journals also require that researchers provide access to the dataset they prepared for analysis from the original raw data. In such cases, the SOEP offers data users the option of archiving their research datasets at the SOEP Research Data Center (SOEP-RDC) as well. The SOEP-RDC will provide the dataset upon request to researchers who have signed a data distribution contract with DIW Berlin.
Dataset not completely anonymized. Please contact SOEPmail@diw.de to gain access.
Data protection issues are of utmost importance to both SOEP and CNEF. First, data protection is a crucial part of the (implicit) contract between the surveys and their respondents. Second, researchers who want to access the survey data must adhere to strict data protection regulations. The precautions taken by the surveys and data users to guarantee data protection ultimately help to ensure future participation by respondents. Because of the exceptionally high standards of data protection that apply to SOEP and CNEF data, making them available for reanalysis can present a major challenge. The SOEP data are subject to limited access: they are provided solely for research purposes (wissenschaftliche Zweckbindung) and therefore only to members of the scientific community. To obtain the data, researchers must sign a data distribution contract with DIW Berlin. Users of SOEP data are not permitted to transfer the data to third parties or to any other users not covered by the data distribution contract.
A growing number of scholarly journals that publish empirical papers using microdata stipulate that the data be submitted for archiving along with the paper itself. Two such journals are the Journal of Applied Econometrics and the American Economic Review. The latter recently adopted the following policy: “For published articles, the authors must provide both the data and the programs sufficient for the articles’ findings to be replicated. These data and programs are then posted on the journal's Web site. If the use of the data is restricted, the authors must provide instructions on how to obtain permission to use the data. If some of the data are proprietary, the editors try to work out ways for other researchers to use the data. In addition, the journal is encouraging studies to reanalyze data and replicate results.” (Kleppner et al., 2009, pp. 96-97). The SOEP is keen to support such policies.
In the interests of improving the statistical infrastructure for reanalysis and replication studies using SOEP data, the SOEP group now offers users a variety of options for making their SOEP working dataset available to other researchers. These options apply to all data formats associated with SOEP, including CNEF, EU-SILC-Clone, LIS, and LWS. If your working dataset includes any SOEP microdata (or data derived from SOEP), you as a SOEP user may not transfer the data to the journal’s editorial office, but may instead take advantage of the following alternatives:
Data Availability: Because of third-party restrictions, the data are available only from the German Socio-economic Panel Study (SOEP) (for requests, please contact firstname.lastname@example.org). The scientific use file of the SOEP with anonymous microdata is made available free of charge to universities and research institutes for research and teaching purposes. The direct use of SOEP data is subject to the strict provisions of German data protection law. Therefore, signing a data distribution contract is a precondition for working with SOEP data. The data distribution contract can be requested using a form available at http://www.diw.de/soepforms. For further information, contact the SOEPhotline at either email@example.com or +49-30-89789-292.
Whenever a journal editor asks for your working dataset, please contact us at firstname.lastname@example.org. We would be happy to deposit the data in a special archive and notify the journal editor about the access procedure.
If you want to send us data for our replication service, please use the Cryptshare server that we host at https://cs-soep.diw.de/. After a brief registration, you can send us the data via this secure, encrypted server.
To improve the infrastructure for the reanalysis of published findings based on SOEP data, we also provide information of the following types:
Kleppner, Daniel and Phillip A. Sharp (2009): Research Data in the Digital Age. Science, Vol. 325: 368, 24 July 2009.
Kleppner, Daniel et al. [Committee on Ensuring the Utility and Integrity of Research Data in a Digital Age] (2009): Ensuring the Integrity, Accessibility, and Stewardship of Research Data in the Digital Age. The National Academies Press, Washington, D.C.