KonsortSWD

Current Project

Project Management

Christof Wolf, GESIS (spokesperson)
Representatives of SOEP:
Stefan Liebig (until September 2022)
Jürgen Schupp (until 2020)
Jan Goebel (since October 2022)

Project Period

October 1, 2020 - September 30, 2025

Funded by

Deutsche Forschungsgemeinschaft (DFG)

In Cooperation With

GESIS - Leibniz Institute for the Social Sciences
DIPF - Leibniz Institute for Research and Information in Education
DZHW - German Centre for Higher Education and Science Research
Freie Universität Berlin
ifo Institute
KIT - Karlsruhe Institute of Technology
LIfBi - Leibniz Institute for Educational Trajectories
RWI - Leibniz Institute for Economic Research
University of Bremen – Qualiservice Research Data Center
University of Duisburg-Essen – Chair of Public and Regional Policy
WZB - Berlin Social Science Center
ZBW - Leibniz Information Centre for Economics
ZPID - Leibniz Institute for Psychology

KonsortSWD - Consortium for the Social, Behavioural, Educational and Economic Sciences in the National Research Data Infrastructure (NFDI)

Researchers in the social, behavioral, educational, and economic sciences work with different types of data, including data that are considered particularly sensitive due to legal or ethical restrictions and data that were not originally collected for research purposes.

KonsortSWD aims to assist researchers working together on multi- and interdisciplinary projects to implement research data management (RDM) plans. The 14 institutions in KonsortSWD are contributing their experience in the operation of user-oriented research data infrastructures to the National Research Data Infrastructure (NFDI) in order to strengthen, expand, and deepen a research data infrastructure for the study of human society. The project is primarily user-driven and addresses the needs of the research communities involved.

The core of KonsortSWD’s RDM strategy is to provide researchers and research data centers (RDCs) with the tools and services they need for managing and sharing (new) sensitive and non-sensitive data in compliance with the FAIR principles for scientific data management and stewardship. This will include supporting sustainable RDM in all phases of the research data lifecycle and ensuring data accessibility, while taking ethical and legal considerations into account. Further information can be found on the consortium's website, and initial publications from the measures are available in the multidisciplinary repository Zenodo.

KonsortSWD is divided into five task areas: Community Participation (mainly the responsibility of the RatSWD office), Data Access, Data Production, Technical Solutions, and the Secretariat. Within the task areas, various services for researchers and research data centers are developed and offered in individual subprojects called measures. SOEP coordinates Task Area 3 “Data Production” and is responsible for the individual measures TA2.M2 (RDCnet) and TA3.M5 (Open Data Format).

DIW Team

Management: Jan Goebel
Coordination: Janina Britzke

TA3.M01 Harmonized Variables - Combining Survey Data more Easily through Standardised and Harmonised Variables 2020-2023
TA3.M02 Supporting research data management in qualitative social research 2020-2025
TA3.M03 Linking Textual Data 2020-2025
TA3.M04 CODI – A service for coding open responses in surveys 2020-2023
TA3.M05 Open Data Format 2020-2025
TA3.M09 Data infrastructures for the research of societal crisis phenomena 2024-2025
TA3.M10 ForSynData 2024-2025

The seven subprojects listed above focus on the following areas of research and service provision:

  • interoperability and reusability of survey data through ex-ante/ex-post harmonization
  • standards for research data management (RDM) for qualitative data
  • unstructured text data use and linkage with standardized survey data
  • efficiency and quality of coding for (semi-)open response formats
  • non-proprietary data formats and long-term archiving
  • finding and using crisis-relevant data and survey instruments more easily
  • standardized documentation of research syntheses and the heterogeneous data resulting from them

In the years to come, we aim to provide additional benefits to the community of data producers by making the standards and tools of research data management sustainable and by improving long-term archiving. For data users, the quantity of available data and the range of different data types will be expanded by enabling linkage of data types and opening up new possibilities for the use of existing data.

Measure TA2.M2 (Creating Single Points of Access for Sensitive Data in RDCs)

Project lead: Jan Goebel
Collaborators: Neil Murray, Kenny Pedrique

To create optimal conditions for empirical research, data access must be both easy and secure. Researchers are normally able to access anonymized microdata after signing a contract with the data provider, but detailed, weakly anonymized data can only be used on-site at guest researcher workstations, which often means spending large amounts of time and money. Improving access to sensitive data is therefore important for maximizing research potential.

KonsortSWD Measure TA2.M2 aims to close this gap. It will establish a research data infrastructure network (RDCnet) connecting guest researcher workstations at the participating research data centers in a network of secure data access points. This will enable researchers to access sensitive data from any of the participating guest researcher workstations. By improving ease of access, the measure will increase the number of data users, while leaving control over the ultimate distribution of the datasets with the data providers to ensure adherence to their individual standards of data security.

For more information, please visit the consortium website. Previous publications of the measure are available on the multidisciplinary repository Zenodo.

Measure TA3.M5 (Open, Metadata-Enriched, Non-Proprietary Data Format for Data Dissemination)

Project lead: Knut Wenzig
Collaborators: Xiaoyao Han, Tom Hartl

The principles of good scientific practice require that the steps of the research process as well as the materials used or produced are clearly documented and made accessible for subsequent use. During the research data lifecycle, numerous documents are produced that document the research process (e.g., study design descriptions, questionnaires, codebooks, descriptive summaries, replication code for data analyses). Ideally, each of these documents should be findable, accessible, interoperable, and reusable. One way of meeting these criteria is through the use of metadata to organize the research process. Social scientists use different and sometimes proprietary data analysis software packages that process metadata in different ways. In some cases, the metadata cannot be accessed through the data file itself but only through PDF files or webpages. The data formats used by statistical software packages are only partially compatible with one another, which presents an obstacle for replication studies. Proprietary data formats in particular jeopardize the FAIR principle of interoperability.

The goal of this project is (A) to develop an open, non-proprietary, multilingual, metadata-enriched data format that (B) can be used with common statistical programs and that also enables access to the metadata. The data products will be described directly by the metadata; they will be more accessible and interoperable and will reuse upstream metadata. The project will also approach other communities that use metadata in order to take their software or metadata schema requirements into consideration and thus to expand the user base for the new data format. Specifications and software, including source code, will be provided as FLOSS under open licenses (e.g., CC, MIT, LGPL), making the products easily usable in different contexts.

Project outcomes will include the following:

  1. Specification and documentation of a uniform metadata schema (KonsortSWD Metadata Schema) in consultation with the KonsortSWD research data centers.
  2. Technical integration of the metadata schema:

     a) Development of a conversion filter that can be used to convert individual metadata structures into the KonsortSWD metadata schema.

     b) Development of import filters for common statistical programs so that the KonsortSWD metadata schema can be used in dataset labeling and data management.
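
To make the idea of a conversion filter more concrete, the following Python sketch maps a source-specific metadata structure onto a unified, multilingual structure. It is an illustration only: the `Variable` class, the `convert` function, and the source keys `variable_labels` and `value_labels` are hypothetical and do not represent the actual KonsortSWD metadata schema or the project's software.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """One variable in a hypothetical unified metadata structure."""
    name: str
    # language code -> variable label
    labels: dict[str, str] = field(default_factory=dict)
    # numeric code -> (language code -> value label)
    value_labels: dict[int, dict[str, str]] = field(default_factory=dict)


def convert(source: dict, language: str) -> list[Variable]:
    """Map a source-specific metadata dict onto the unified structure.

    The source layout (keys "variable_labels" and "value_labels") is an
    assumption made for this illustration.
    """
    variables = []
    for name, label in source.get("variable_labels", {}).items():
        values = source.get("value_labels", {}).get(name, {})
        variables.append(
            Variable(
                name=name,
                labels={language: label},
                value_labels={
                    int(code): {language: text} for code, text in values.items()
                },
            )
        )
    return variables


# Hypothetical input, as it might be exported from a statistical package.
example = {
    "variable_labels": {"sex": "Sex of respondent"},
    "value_labels": {"sex": {"1": "male", "2": "female"}},
}
converted = convert(example, "en")
```

Keeping labels keyed by language code is one simple way to represent multilingual metadata in such a structure; a real schema would likely carry further elements such as question texts or provenance information.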

For more information, see the consortium website. Initial publications of the measure are available on the multidisciplinary repository Zenodo.

Contact

Janina Britzke

Staff Member of the Division Knowledge Transfer in the German Socio-Economic Panel study Department
