To provide the best conditions for working with SOEP data, we regularly conduct user surveys on various topics.
After a one-year break, all SOEP users had the opportunity to participate in the SOEP User Survey 2021 from November 2021 to the beginning of January 2022. They could submit opinions, wishes, and ideas, as well as criticism and reports of emerging problems, to the SOEP team. In addition to the classic questions regarding our services and infrastructure work, this year we asked about the use of specific data sets. We would like to thank the 1011 participants for their rich feedback and many suggestions.
This year our survey included a larger block of questions about the usage of net data sets, including the reasons why these data sets are not used. We asked about four data sets from the following questionnaires: parent-child questionnaires (bioagel), "schoolchildren" or "early youth" questionnaires (biopupil), youth questionnaires (jugendl), and "deceased person" questionnaires (vpl). We found that in each case 3/4 of the participants had never used the data from the respective questionnaire (Figure 1). Slightly less than one third of the users regularly work with the data set bioagel.
When asked why they had never used the data in any of their analyses, more than half of the users indicated that they had no research interest in the populations (Figure 2). About 18% of users were not aware of the data from the questionnaires. Others feel that the data are not suitable for analyzing their research question.
This year, we asked users for the largest time span they had ever evaluated in a relevant analysis. Almost 1/5 of users use all 36 waves, covering the years 1984 to 2019 (Figure 3). 9% of all users have worked with 35 waves, and 5% with 34 waves. Each shorter time span was named by less than 5% of respondents. Grouped into ranges, half of the users work with a time span of between 26 and 36 years, while only 6% use a time span of between 1 and 5 years. This indicates that long time spans are relevant for a large share of users working with SOEP data.
Another topic this year was the knowledge or use of special samples. The most frequently used samples by our users are the migration sample M1/2 and the sample for refugees M3-5 (Figure 4). Both samples attract a lot of interest. The samples Social City and LGB* are presumably more relevant for specific questions and are thus used less on their own.
About 71% of respondents had never participated in a SOEP user survey before and participated for the first time in November. Overall, about 58% of all users were male in 2021. 70% of users are between 21 and 40 years old. As before, most users are in economics and social sciences, with more than 74% of all respondents locating themselves in these two disciplines.
Once again, we would like to thank all participants for their active participation!
The 2019 SOEP User Survey, conducted from mid-December 2019 to early January 2020, marked both the start of a new decade and the ten-year anniversary of the SOEP User Survey. Every year, the SOEP User Survey gives data users the opportunity to tell us about their experiences working with the SOEP data. Our 2019 survey focused on three topics: the technical preconditions for data analysis, the quality of the SOEP data, and our Getting Started services. We are grateful to the 812 respondents, whose valuable input will help us to continue developing and improving the SOEP.
In this section of the survey, we asked data users what technical problems they had experienced, if any, and what hardware and software they needed to be able to work well with the SOEP data. The majority of users had no problems opening the SOEP datasets, but some had difficulties, for instance, processing the numerous variables in our individual long-format dataset in Stata/IC (Figure 1).
Based on this feedback, we developed several recommendations for data users. We recommend the use of Stata/MP or Stata/SE on a computer with 16 GB of main memory (RAM). Users can still work with the data in Stata/IC or on less powerful computers: with some modifications to their workflow, such as the commands “describe using pl.dta” and “use pid syear plVARS using pl.dta”, they can work effectively with even our largest datasets while placing low demands on their hardware and software.
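The two commands quoted above can be sketched as a short Stata do-file. This is only an illustration, assuming that pl.dta is located in the current working directory; plVAR1 and plVAR2 are hypothetical placeholder names standing in for “plVARS” and need to be replaced with the variables an analysis actually requires.

```stata
* List the variables stored in pl.dta without loading the data into memory
describe using pl.dta

* Placeholder names (hypothetical): replace with the pl variables you need
local plvars "plVAR1 plVAR2"

* Load only the identifiers and the selected variables
use pid syear `plvars' using pl.dta, clear
```

Because only the named variables are read from the file, the dataset held in memory stays small even when the file on disk contains a very large number of variables.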
© DIW Berlin
In the second section of the survey, we asked users what they thought about various aspects of SOEP data quality. The results show strengths in the areas of reliability and punctuality. Users saw the greatest potential for improvement in the areas of documentation and user-friendliness. Based on this feedback, we have introduced improvements in these areas—for instance, in our Getting Started toolbox of services for new and returning data users.
On a Likert scale, the means are displayed as a blue line and the medians by status of researchers as red dots.
The results of our 2018 User Survey showed a need for improvements in user-friendliness of the data, so over the course of 2019, the SOEP continued developing Getting Started, a toolbox of services designed to facilitate data use for new and returning users. These services include Paneldata.org, SOEPcompanion, SOEPtutorials, and SOEPhelp. In our 2019 User Survey, we invited users to rate these services. We asked whether they knew of each service, whether they had ever used it, and if so, whether they used it regularly or just occasionally. We then asked whether they would recommend each service to others. We only included answers from respondents who had used a service at least once. The survey results show that a large majority of SOEP data users would recommend our Getting Started services to others.
We are grateful to all of the users who took part in our User Survey and look forward to the next ten years of working together with the SOEP research community.
In 2018, the SOEP User Survey entered its ninth year as an online survey. The annual survey invites users to give us their opinions, ideas, wishes, and critique, and to alert us to potential problems. This year, we shortened the questionnaire slightly to focus more on how researchers use and assess the different datasets and to give users more opportunities for feedback. We are grateful to the 797 respondents to our 2018 User Survey for their suggestions, which will help us to continue improving our data and services.
To stay abreast of the changing needs of SOEP data users and to create the best possible conditions for analyzing the SOEP data, we regularly ask what statistical software users work with. While Stata use had increased in popularity in previous years, it declined in 2018 (Figure 1, multiple answers allowed). Stata remains the most popular statistical software among SOEP users at 76 percent, followed by the programming language R at 31 percent. To respond to this trend, we will be providing a version of future data releases formatted for analysis in R.
Use of the Studies
We are not only interested in finding out what software SOEP users work with, but also in what they are analyzing. The question of which SOEP datasets users work with regularly has been part of our user survey for many years. The results for 2018 present a stable picture relative to previous years (Figure 2). In 2018, 36% of users reported using SOEP-Core on a regular basis, and over two thirds of SOEP users have worked with SOEP-Core at least once. Just under one third of users reported regular use of SOEPlong, a dataset that we provide in easier-to-use “long” form. Survey results showed that users are less familiar with SOEPlong than with SOEP-Core. Regular use of SOEP-IS, a sample designed for innovative research questions, is relatively low (3%). Around one third of all users were not aware of SOEP-IS.
Important Aspects of the SOEP
As in 2017, our 2018 User Survey asked users to rate the SOEP on specific quality criteria (Figure 3). Again using a seven-point Likert scale, users first rated how important each of the quality criteria was to them, and then evaluated the SOEP’s current performance in each area. The results show that users place the highest importance on understanding the process of data generation and on getting a quick idea of whether SOEP data would fit their research project. Both categories show a negative difference between expectations and realities. The SOEP is exceeding users’ expectations in the punctuality of data releases and in the e-mail information sent out to users about new SOEP studies and projects.
Strengths and Weaknesses
Sometimes we see ourselves differently from how others see us. To be able to build on our strengths and improve on our weaknesses, we asked users to tell us the SOEP’s three greatest strengths and weaknesses in an open-answer question. We sorted the diverse answers into 16 thematic categories. We compared the number of respondents who considered each category a strength or weakness and compiled the overview in Figure 4. The SOEP’s three greatest strengths were in the diversity and number of themes and variables, the long duration of the study, and the data format. Data access is a less pronounced strength: users regard it as positive but see some potential for improvement. Significant weaknesses lie in the user-friendliness of the data and documentation. We addressed both points in the data release following the 2018 user survey: we now provide an integrated version of the data, and our new SOEPcompanion is online at http://companion.soep.de/index.html.
We are grateful to all of the respondents to our 2018 User Survey for their useful feedback and suggestions!
From November 20, 2017, to January 3, 2018, SOEP users had the opportunity to take part in our annual SOEP User Survey and contribute their opinions, requests, and ideas for the further development of the services provided by the SOEP. In addition to standard questions about our various services and infrastructural work, this year’s user survey also included questions focusing on optimizing user friendliness. We are grateful to the 757 users who participated in the survey for their valuable feedback and many suggestions.
Working with the SOEP data for the first time often poses a major challenge to our users. In order to increase user friendliness and address the specific problems users face, we wanted to start by clearly defining our SOEP user community. Around 63 percent of respondents worked with the SOEP for the first time in 2016 or 2017 and can therefore be considered “new users.” We refer to the approximately 37 percent of respondents who had used the SOEP data for the first time before 2016 as “old users.” Our new users are, on average, 31 years old, female (53 percent), and work in economics (62 percent) as doctoral students (35 percent) or research assistants (29 percent). Old users are approximately seven years older, and the majority are male (65 percent) and professors (33 percent) or research assistants (39 percent) in economics (43 percent) and sociology (42 percent).
This year, SOEP User Survey respondents were asked to rate the SOEP on a series of quality criteria (Figure 1) using a seven-point Likert scale. We then compared users’ expectations for each criterion with their ratings of the SOEP’s performance. The SOEP exceeded users’ expectations in areas such as punctuality of data release, information on new studies and projects, and the possibility to submit questions to SOEP-IS, and performed below users’ expectations in the area “understandable data generation process”.
To identify potential problems faced by first-time SOEP data users, this year’s survey asked respondents what subjects they would like to have covered in the SOEP’s instructional materials (Figure 2). Users were asked to think back to the first time they worked with the SOEP data and rate (on a seven-point Likert scale) how useful an instructional manual on specific topics would have been at that time, and whether the currently available instructional materials are sufficient. Respondents rated instructions on how to find the history of questions and variables (mean: 5.6) and understand the meaning of terms for datasets and variables (mean: 5.6) as extremely important. They reported having problems especially in “finding the history of questions and variables,” and felt that the available information (mean: 4.6) should be expanded. In the area “survey instruments and their contents,” the information provided by the SOEP met users’ expectations. Users rated detailed written descriptions with screenshots as their preferred form of instructional materials (Figure 3).
The SOEP offers an array of SOEPcampus events to make it easier to get started working with the SOEP. To tailor our workshops to old and new users’ needs, we asked User Survey respondents how important the various types of workshops are (Figure 4). When comparing the percentages of old and new users who rated a particular workshop as “very important,” we found differences in demand: introductory workshops on “techniques for preparing datasets for analysis” (difference: 18 percentage points) and “methods for analyzing longitudinal data (e.g., panel regression)” were much more important to new users. Among old users, there was a higher demand for an “introduction to the use of paneldata.org” (difference: -14 percentage points).
The extensive feedback from our User Survey is a valuable source of information that helps us to continually improve our work and our services. We are very grateful to all of the SOEP users who participated in our survey in 2017!
As we reported in the last SOEPnewsletter, a number of SOEP users were kind enough to participate in the user survey during the last two months of 2016. In addition to the standard questions on our services and infrastructure work, we were highly interested in finding out the level of user awareness about our various studies and how SOEP data are used. Approximately 30 percent of the 713 total respondents were new users who worked with the SOEP data for the first time in 2016. For the sixth year in a row, our survey was aimed at all SOEP data users and contracting parties.
Awareness level of the studies
This year, our questionnaire contained a larger group of topics on the awareness of and interest in our SOEP-based studies and their strengths and weaknesses. As a result, we were able to determine that around half of the respondents were not aware of the options that SOEP-IS offers, but found them interesting and would probably use them in the future (see Figures 1 and 2). This included the option of proposing questions and experiments for SOEP-IS, or evaluating other researchers’ questions and experiments after a suitable hiatus period. The survey also showed that many respondents are interested in the IAB-BAMF-SOEP Survey of Refugees in Germany and in using the dataset, which will be available in November 2017 (see Figure 3).
Use of the data
With regard to the use of our data, we also wanted to find out which statistics programs researchers use and how frequently (see Figure 4). Stata is now used by 75 percent of users and has established itself as the most frequently used program. R is also gaining in frequency of use.
Thanks to the very detailed feedback from our respondents, we have also received valuable suggestions on how to make our work and the services we offer even better. Their suggestions for improving documentation have provided us with the impetus to be clearer and more user-friendly concerning this. We are aware that the “landscape of SOEP studies” is filling up, and providing suitable aids is essential for beginning to work with SOEP or continuing to use it. For this reason, in addition to restructuring the paneldata.org metadata portal, we plan to completely revise the SOEPlong documentation. When integrating the IAB-BAMF-SOEP Survey of Refugees, we will update the existing SOEP data structure and provide additional separate variables generated purely for the migration. They will be documented in an understandable manner. Using our familiar channels (the SOEPnewsletter and SOEP website), we will keep you informed on the status of these activities.
We would again like to express our gratitude to all 2016 user survey respondents!
In winter 2015, 771 SOEP users again took part in the SOEP user survey. This year’s survey covered classic questions about SOEP service and infrastructure as well as the new topics of data sharing in academia and re-analysis of data. To ensure the highest possible participation in our survey, we sent the invitation to an integrated mailing list consisting of longtime SOEP users with a data distribution contract, new users who signed a subcontract for data use within the last year, users who download the SOEP data, and members of the SOEP mailing list. We are proud to report that we achieved the highest number of responses of any year since the start of our user survey. Participation increased 13% over 2014 (see Figure 1). We are very grateful to everyone who participated in the survey.
We do not know the characteristics of our entire user community. In the following we use the term “user community” to refer to those users who participated in our user survey.
The results show that in 2015, our user community was 41% women and 59% men: an 8 percentage point increase in female users and the highest number of female users since the beginning of the survey in 2004.
Research staff and post-doctoral students made up one third of all respondents to the user survey, while the percentage of professors has declined since last year. This has been accompanied by a decline in the use of SOEP data in teaching (from 69% in 2014 to 61% in 2015).
The research fields represented by SOEP users have not changed in a significant manner since the last user survey. The proportion of respondents from the field of economics has declined to 45% since the last survey. Around 41% of our users are from the social sciences or sociology.
In this year’s user survey, we wanted to find out our users’ preferences for data distribution. The increasing complexity of the SOEP sample means an increasing amount of effort to generate the data. To meet this challenge, the SOEP staff is constantly working to improve the process of data preparation and generation. We want to give you—our users—the opportunity to tell us your preferences so that we can meet your needs as well as possible. In the survey, respondents were asked to drag and drop the aspects of “advance data access”, “quality of data checking and testing” and “completeness of the data” into their own order of importance (see Figure 2). The results show that for our users, “advance data access” is less important than data quality checking or data completeness.
Based on this, we have concluded that we should put more weight on completeness and data checking procedures, even if this means delays in data provision.
We use our annual survey to evaluate the various services we provide. Respondents were asked to rate on a scale from 0 to 10 how satisfied they were with SOEP contract management, data, data downloads, and documentation. In all of these areas, the overwhelming majority of responding users were very satisfied with our services. And in some areas, respondents rated us even higher this year than in 2014.
The importance of data documentation is also evident from the critiques and suggestions provided by respondents, which confirm the need to continue improving our work in this area. An important step in this direction has been taken with the introduction of our new metadata portal, paneldata.org. The difficulties entailed by learning a new way of working are evident in Figure 3. Many of our users continue to use our old metadata portal, SOEPinfo, which continues to run parallel to paneldata.org. Almost half of all respondents were not yet aware of paneldata.org, at least not under this new name that we introduced instead of SOEPinfo v.2. Thanks to our respondents’ extensive feedback, we have valuable ideas for facilitating the transition to paneldata.org.
We are working hard to optimize paneldata.org and to make it as user-friendly as possible. We encourage all users to take the leap and switch over, since paneldata.org contains documentation not only on SOEP-Core but also on the practical new SOEPlong, as well as the SOEP Innovation Sample (SOEP-IS) and other related longitudinal studies.
Thank you again to everyone who participated in our 2015 SOEP User Survey!
In fall 2014, a notable 662 SOEP users—to whom we are very grateful—took part again in our SOEP User Survey and gave us feedback on the range of services we currently provide. We are very pleased to be able to rely on a stable user community for our annual online survey. The number of users who completed the entire survey has risen continuously over the last four years (see Figure 1). Of the 662 users who expressed interest, 581 completed the survey. This corresponds to a dropout rate of around 12%, down slightly from the previous year (16%).
- A rising number of professors act as multipliers of the SOEP -
The results of the survey show a change in the composition of our user community with regard to occupational status and location of the institution. While the percentage of students dropped from 14.5% in 2013 to 8.7% in 2014, the percentage of professors has increased substantially to 45.7% (previous year: 24.7%).
At the same time, the share of participants who neither work nor have worked personally with the SOEP data has increased. Figure 2 illustrates the increase in professors who use the data passively and take more of a role in disseminating the use of the SOEP. Sixty-four percent of professors responded “yes” to the additional question of whether they supervise junior researchers writing a thesis using SOEP data. We are pleased with this channel for enlarging our user community, although many of the questions in our user survey can only be answered with first-hand experience working with the SOEP data.
- Data download well received, documentation in need of improvement -
The survey always asks respondents to evaluate each of our individual service areas. As seen in Figure 3, users gave high marks for the data download service introduced in 2013. Users reported equally high levels of satisfaction with the quality of the data and contract management. In the area of documentation, however, respondents saw room for improvement. We are well aware of the importance of our data documentation and are working constantly to improve in this area. One example is the recently updated Desktop Companion. Since the SOEP is still expanding with its various Related Studies (SOEP-RS), increasing effort is required to produce detailed documentation. Integrating the FiD study, which ran through 2014, into SOEP Core poses one such challenge. In 2015, we will be focusing on adapting these data to guarantee consistent and user-friendly documentation of the SOEP data.
- SOEPinfo v.2 to provide better access to data documentation -
In addition, we plan to further establish the use of our new metadata portal, SOEPinfo v.2, in our user community. It was developed as part of the open-source project “DDI on Rails” and includes not only thorough documentation of SOEP Core from the previous online resource, SOEPinfo, but also a complete picture of the SOEPlong data. The user survey showed that just a few months after SOEPinfo v.2 was introduced, around one-third of all respondents had already worked with it (see Figure 4). This group of respondents gave the version they used an average of 7 out of 10 possible points. In four out of six subcategories (visual design, information content, quality of the generated syntax, and response speed), the average rating was 7 or above. Overall, users’ evaluations of the new SOEPinfo were about as high as those in 2011 in the same categories.
Survey respondents who had not used the new SOEPinfo v.2 reported that they were continuing to use the old SOEPinfo mainly out of habit or because they did not see a need to switch. We are very curious to see how users will respond in the next user survey to the question of which data documentation sources they use. Until then, we would like to encourage all our users to take advantage of this new service, SOEPinfo v.2, and especially of the opportunity to provide us with feedback.