User Survey

In 2018, the SOEP User Survey entered its ninth year as an online survey. The annual survey invites users to give us their opinions, ideas, wishes, and critique, and to alert us to potential problems. This year, we shortened the questionnaire slightly to focus more on how researchers use and assess the different datasets and to give users more opportunities for feedback. We are grateful to the 797 respondents to our 2018 User Survey for their suggestions, which will help us to continue improving our data and services.

Data Analysis

To stay abreast of the changing needs of SOEP data users and to create the best possible conditions for analyzing the SOEP data, we regularly ask what statistical software users work with. While Stata had gained in popularity in previous years, its use declined in 2018 (Figure 1, multiple answers allowed). Even so, Stata remains the most popular statistical software among SOEP users at 76 percent, followed by the programming language R at 31 percent. To respond to this trend, we will be providing a version of future data releases formatted for analysis in R.

Figure 1


Use of the Studies

We are not only interested in finding out what software SOEP users work with, but also in what they are analyzing. The question of which SOEP datasets users work with regularly has been part of our user survey for many years. The results for 2018 present a stable picture relative to previous years (Figure 2). In 2018, 36% of users reported using SOEP-Core on a regular basis, and over two thirds of SOEP users have worked with SOEP-Core at least once. Just under one third of users reported regular use of SOEPlong, a dataset that we provide in easier-to-use “long” form. Survey results showed that users are less familiar with SOEPlong than with SOEP-Core. Regular use of SOEP-IS, a sample designed for innovative research questions, is relatively low (3%). Around one third of all users were not aware of SOEP-IS.

Figure 2


Important Aspects of the SOEP

As in 2017, our 2018 User Survey asked users to rate the SOEP on specific quality criteria (Figure 3). Again using a seven-point Likert scale, users first rated how important each of the quality criteria was to them, and then evaluated the SOEP’s current performance in each area. The results show that users place the highest importance on understanding the process of data generation and on getting a quick idea of whether SOEP data would fit their research project. Both categories show a negative difference between expectations and realities. The SOEP is exceeding users’ expectations in the punctuality of data releases and in the e-mail information sent out to users about new SOEP studies and projects.

Figure 3


Strengths and Weaknesses

Sometimes we see ourselves differently from how others see us. To be able to build on our strengths and improve on our weaknesses, we asked users to tell us the SOEP’s three greatest strengths and weaknesses in an open-answer question. We sorted the diverse answers into 16 thematic categories. We compared the number of respondents who considered each category a strength or weakness and compiled the overview in Figure 4. The SOEP’s three greatest strengths were in the diversity and number of themes and variables, the long duration of the study, and the data format. Data access is a less pronounced strength: users regard it as positive but see some potential for improvement. Significant weaknesses lie in the user-friendliness of the data and documentation. We addressed both points in the data release following the 2018 user survey: we now provide an integrated version of the data, and our new SOEPcompanion is online.

Figure 4


We are grateful to all of the respondents to our 2018 User Survey for their useful feedback and suggestions!


From November 20, 2017, to January 3, 2018, SOEP users had the opportunity to take part in our annual SOEP User Survey and contribute their opinions, requests, and ideas for the further development of the services provided by the SOEP. In addition to standard questions about our various services and infrastructural work, this year’s user survey also included questions focusing on optimizing user friendliness. We are grateful to the 757 users who participated in the survey for their valuable feedback and many suggestions.


Working with the SOEP data for the first time often poses a major challenge to our users. In order to increase user friendliness and address the specific problems users face, we wanted to start by clearly defining our SOEP user community. Around 63 percent of respondents worked with the SOEP for the first time in 2016 or 2017 and can therefore be considered “new users.” We refer to the approximately 37 percent of respondents who had used the SOEP data for the first time before 2016 as “old users.” Our new users are, on average, 31 years old, female (53 percent), and work in economics (62 percent) as doctoral students (35 percent) or research assistants (29 percent). Old users are approximately seven years older, and the majority are male (65 percent) and professors (33 percent) or research assistants (39 percent) in economics (43 percent) and sociology (42 percent).

SOEP Service

Figure 1


This year, SOEP User Survey respondents were asked to rate the SOEP on a series of quality criteria (Figure 1) using a seven-point Likert scale. We then compared users’ expectations for each criterion with their ratings of the SOEP’s performance. The SOEP exceeded users’ expectations in areas such as punctuality of data release, information on new studies and projects, and the possibility to submit questions to SOEP-IS, and performed below users’ expectations in the area “understandable data generation process”.

User Friendliness

Figure 2


To identify potential problems faced by first-time SOEP data users, this year’s survey asked respondents what subjects they would like to have covered in the SOEP’s instructional materials (Figure 2). Users were asked to think back to the first time they worked with the SOEP data and rate (on a seven-point Likert scale) how useful an instructional manual on specific topics would have been at that time, and whether the currently available instructional materials are sufficient. Respondents rated instructions on how to find the history of questions and variables (mean: 5.6) and understand the meaning of terms for datasets and variables (mean: 5.6) as extremely important. They reported having problems especially in “finding the history of questions and variables,” and felt that the available information (mean: 4.6) should be expanded. In the area “survey instruments and their contents,” the information provided by the SOEP met users’ expectations. Users rated detailed written descriptions with screenshots as their preferred form of instructional materials (Figure 3).
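The comparison between how important a topic is to users and how sufficient they find the existing material can be expressed as a simple difference of means. The following is a minimal sketch, not the SOEP team's analysis code: the function name `rating_gap` and the dictionary layout are assumptions made for illustration, and the only figures used are the two means reported above for “finding the history of questions and variables” (5.6 and 4.6).

```python
# Illustrative sketch: gap between how important users rate a topic and how
# sufficient they find the existing material (both on a seven-point scale).
# The function name and data layout are hypothetical, not SOEP code.

def rating_gap(importance: float, sufficiency: float) -> float:
    """Return sufficiency minus importance, rounded to one decimal.

    A negative value means the available material falls short of
    what users consider important.
    """
    return round(sufficiency - importance, 1)


# Means reported in the text for "finding the history of questions and variables".
ratings = {
    "history of questions and variables": {"importance": 5.6, "sufficiency": 4.6},
}

gaps = {
    topic: rating_gap(r["importance"], r["sufficiency"])
    for topic, r in ratings.items()
}

for topic, gap in gaps.items():
    print(f"{topic}: gap = {gap:+.1f}")
```

For this topic the gap is one full scale point in the negative direction, which is consistent with respondents' view that the available information should be expanded.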


Figure 3


Figure 4


The SOEP offers an array of SOEPcampus events to make it easier to get started working with the SOEP. To tailor our workshops to old and new users’ needs, we asked User Survey respondents how important the various types of workshops are (Figure 4). When comparing the percentages of old and new users who rated a particular workshop as “very important,” we found differences in demands: introductory workshops on “techniques for preparing datasets for analysis” (difference: 18 percent) and “methods for analyzing longitudinal data (e.g., panel regression)” were much more important to new users. Among old users, there was a higher demand for an “introduction into the use of” (difference: -14%).

The extensive feedback from our User Survey is a valuable source of information that helps us to continually improve our work and our services. We are very grateful to all of the SOEP users who participated in our survey in 2017!



As we reported in the last SOEPnewsletter, a number of SOEP users were kind enough to participate in the user survey during the last two months of 2016. In addition to the classical questions on our services and infrastructure work, we were highly interested in finding out the level of user awareness about our various studies and how SOEP data are used. Approximately 30 percent of the 713 total respondents were new users who worked with the SOEP data for the first time in 2016. For the sixth year in a row, our survey was targeted to all SOEP data users and contracting parties.

Awareness level of the studies

This year, our questionnaire contained a larger group of topics on the awareness of and interest in our SOEP-based studies and their strengths and weaknesses. As a result, we were able to determine that around half of the respondents were not aware of the options that SOEP-IS offers, but found them interesting and would probably use them in the future (see Figures 1 and 2). This included the option of proposing questions and experiments for SOEP-IS, or evaluating other researchers’ questions and experiments after a suitable hiatus period. The survey also showed that many respondents are interested in the IAB-BAMF-SOEP Survey of Refugees in Germany and in using the dataset that will be available in November 2017 (see Figure 3).




Use of the data

With regard to the use of our data, we also wanted to find out which statistics programs researchers use and how frequently (see Figure 4). Stata is now used by 75 percent of users and has established itself as the most frequently used program. R is also gaining in frequency of use.


Thanks to the very detailed feedback from our respondents, we have also received valuable suggestions on how to make our work and the services we offer even better. Their suggestions for improving documentation have given us the impetus to make it clearer and more user-friendly. We are aware that the “landscape of SOEP studies” is filling up, and that providing suitable aids is essential for users who are beginning to work with the SOEP or continuing to use it. For this reason, in addition to restructuring the metadata portal, we plan to completely revise the SOEPlong documentation. When integrating the IAB-BAMF-SOEP Survey of Refugees, we will update the existing SOEP data structure and provide additional separate variables generated purely for the migration. They will be documented in an understandable manner. Using our familiar channels (the SOEPnewsletter and SOEP website), we will keep you informed on the status of these activities.

We would again like to express our gratitude to all 2016 user survey respondents!


In winter 2015, 771 SOEP users again took part in the SOEP user survey. This year’s survey covered classic questions about SOEP service and infrastructure as well as the new topics of data sharing in academia and re-analysis of data. To ensure the highest possible participation in our survey, we sent the invitation to an integrated mailing list consisting of longtime SOEP users with a data distribution contract, new users who signed a subcontract for data use within the last year, users who download the SOEP data, and members of the SOEP mailing list. We are proud to report that we achieved the highest number of responses of any year since the start of our user survey. Participation increased 13% over 2014 (see Figure 1). We are very grateful to everyone who participated in the survey.

We do not know the characteristics of our entire user community. In the following we use the term “user community” to refer to those users who participated in our user survey.

Figure 1

The results show that in 2015, our user community was 41% women and 59% men: an 8 percentage point increase in female users and the highest number of female users since the beginning of the survey in 2004.

Research staff and post-doctoral students made up one third of all respondents to the user survey, while the percentage of professors has declined since last year. This has been accompanied by a decline in the use of SOEP data in teaching (from 69% in 2014 to 61% in 2015).

The research fields represented by SOEP users have not changed in a significant manner since the last user survey. The proportion of respondents from the field of economics has declined to 45% since the last survey. Around 41% of our users are from the social sciences or sociology.

Data distribution

In this year’s user survey, we wanted to find out our users’ preferences for data distribution. The increasing complexity of the SOEP sample means an increasing amount of effort to generate the data. To meet this challenge, the SOEP staff is constantly working to improve the process of data preparation and generation. We want to give you—our users—the opportunity to tell us your preferences so that we can meet your needs as well as possible. In the survey, respondents were asked to drag and drop the aspects of “advance data access”, “quality of data checking and testing” and “completeness of the data” into their own order of importance (see Figure 2). The results show that for our users, “advance data access” is less important than data quality checking or data completeness.

Based on this, we have concluded that we should put more weight on completeness and data checking procedures, even if this means delays in data provision.

Figure 2

Data documentation

We use our annual survey to evaluate the various services we provide. Respondents were asked to rate on a scale from 0 to 10 how satisfied they were with SOEP contract management, data, data downloads, and documentation. In all of these areas, the overwhelming majority of responding users were very satisfied with our services. And in some areas, respondents rated us even higher this year than in 2014.

The importance of data documentation is also evident from the critiques and suggestions provided by respondents, which confirm the need to continue improving our work in this area. An important step in this direction has been taken with the introduction of our new metadata portal. The difficulties entailed by learning a new way of working are evident in Figure 3. Many of our users continue to use our old metadata portal, SOEPinfo, which continues to run in parallel to the new portal. Almost half of all respondents were not yet aware of the new portal, at least not under the new name that we introduced in place of SOEPinfo v.2. Thanks to our respondents’ extensive feedback, we have valuable ideas for facilitating the transition to the new portal.

We are working hard to optimize the new portal and to make it as user-friendly as possible. We encourage all users to take the leap and switch over, since it contains documentation not only on SOEP-Core but also on the practical new SOEPlong, as well as the SOEP Innovation Sample (SOEP-IS) and other related longitudinal studies.

Figure 3

Thank you again to everyone who participated in our 2015 SOEP User Survey!

In fall 2014, a notable 662 SOEP users, to whom we are very grateful, again took part in our SOEP User Survey and gave us feedback on the range of services we currently provide. We are very pleased to be able to rely on a stable user community for our annual online survey. The number of users who completed the entire survey has risen continuously over the last four years (see Figure 1). Of the 662 users who expressed interest, 581 completed the survey. This corresponds to a dropout rate of around 12%, down slightly from the previous year (16%).
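The dropout rate quoted above follows directly from the two counts. As a minimal sketch (the function name `dropout_rate` and its parameter names are assumptions for illustration, and the 662 users who expressed interest are taken as the base), the calculation looks like this:

```python
# Illustrative sketch: dropout rate for the 2014 user survey.
# The function name is hypothetical; the counts are those given in the text.

def dropout_rate(participants: int, completed: int) -> float:
    """Share of participants who did not complete the survey (0.0 to 1.0)."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    if not 0 <= completed <= participants:
        raise ValueError("completed must lie between 0 and participants")
    return (participants - completed) / participants


rate_2014 = dropout_rate(participants=662, completed=581)
print(f"Dropout rate 2014: {rate_2014:.1%}")
```

With these inputs, 81 of 662 users did not finish the questionnaire, which rounds to the roughly 12% reported.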

Figure 1

- A rising number of professors act as multipliers of the SOEP -

The results of the survey show a change in the composition of our user community with regard to occupational status and location of the institution. While the percentage of students dropped from 14.5% in 2013 to 8.7% in 2014, the percentage of professors has increased substantially to 45.7% (previous year: 24.7%).

At the same time, the share of participants who neither work nor have worked personally with the SOEP data has increased. Figure 2 illustrates the increase in professors who use the data passively and take more of a role in disseminating the use of the SOEP. Sixty-four percent of professors answered “yes” to the additional question of whether they supervise junior researchers writing a thesis using SOEP data. We are pleased with this channel for enlarging our user community, although many of the questions in our user survey can only be answered with first-hand experience of working with the SOEP data.

Figure 2

- Data download well received, documentation in need of improvement -

The survey always asks respondents to evaluate each of our individual service areas. As seen in Figure 3, users gave high marks for the data download service introduced in 2013. Users reported equally high levels of satisfaction with the quality of the data and contract management. In the area of documentation, however, respondents saw room for improvement. We are well aware of the importance of our data documentation and are working constantly to improve in this area. One example is the recently updated Desktop Companion. Since the SOEP is still expanding with its various Related Studies (SOEP-RS), increasing effort is required to produce detailed documentation. Integrating the FiD study, which ran through 2014, into SOEP Core poses one such challenge. In 2015, we will be focusing on adapting these data to guarantee consistent and user-friendly documentation of the SOEP data.

Figure 3

- SOEPinfo v.2 to provide better access to data documentation -

In addition, we plan to further establish the use of our new metadata portal, SOEPinfo v.2, in our user community. It was developed as part of the open-source project “DDI on Rails” and includes not only thorough documentation of SOEP Core from the previous online resource, SOEPinfo, but also a complete picture of the SOEPlong data. The user survey showed that just a few months after SOEPinfo v.2 was introduced, around one-third of all respondents had already worked with it (see Figure 4). This group of respondents gave the version they used an average of 7 out of 10 possible points. In four out of six subcategories (visual design, information content, quality of the generated syntax, and response speed), the average rating was 7 or above. Overall, users’ evaluations of the new SOEPinfo were roughly as high as those given in 2011 in the same categories.

Figure 4

Survey respondents who had not used the new SOEPinfo v.2 reported that they were continuing to use the old SOEPinfo mainly out of habit or because they did not see a need to switch. We are very curious to see how users will respond to the question of which data documentation sources they use in the next user survey. Until then, we would like to encourage all our users to take advantage of this new service SOEPinfo v.2 and especially of the opportunity to provide us with your feedback.

In 2013, the project was again conducted as an online survey using the web-based tool LimeSurvey©. The survey took place in November 2013, and care was taken to comply strictly with data protection regulations. In addition to the signatories of a SOEP data distribution contract, who had been invited to participate, other users could also take part by registering via the DIW Berlin website. In total, 585 anonymized responses were available for analysis.

Among those under 30, there are more female than male data users. Among users over 30, the statistics show the reverse.
Overall, approximately 61% of all users were male in 2013.

As before, most users of the SOEP data are found in economics and the social sciences.
In 2013, more than 80% of all respondents placed themselves in these two disciplines.

¹ Includes public health, other social sciences, and in 2004 also psychology.
² Also includes information science.

If you are interested in further results: we have summarized them on an English-language poster (PDF, 0.75 MB).

A total of 574 users took part in the survey, which was conducted in spring 2012, corresponding to a response rate of 21%. Most of our users still come from economics (49%) and sociology (36%).

Because the SOEP was designed as a longitudinal study, it was especially important to us to learn more about whether this property of the data is actually used in analyses. The results show that only 19% of the users who responded use the data exclusively cross-sectionally, while 22% use them exclusively longitudinally. The majority (59%) used the data both cross-sectionally and longitudinally.

A marked change can be seen in the use of statistical software. In 2004, SPSS was the most widely used software (66%), followed by Stata (28%) and SAS (5%). In 2012, Stata took first place (55%) and SPSS second (26%), while in third place SAS was displaced by the open-source software R at 9%.

To obtain a comprehensive overview of SOEP users’ satisfaction with areas such as data quality, data access, and documentation, the SOEP regularly surveys its national and international data users. Accordingly, the central aim of the “User Survey 2011,” conducted last summer, was to obtain important pointers and suggestions for improving our services.

We sent 1,996 e-mails to our contract and subcontract holders. We received a response from 443 users (22.2%). This figure corresponds fairly closely to the number of “active” SOEP users who requested and received a data DVD in 2010 (N = 420). We sincerely thank everyone who accepted our invitation and completed the questionnaire.
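The response rate for the 2011 survey can be reproduced from the number of e-mails sent and the number of responses received. This is a minimal sketch; the function name `response_rate` is an assumption for illustration, and the counts are those given above.

```python
# Illustrative sketch: response rate of the 2011 user survey.
# The function name is hypothetical; the counts are those given in the text.

def response_rate(invited: int, responded: int) -> float:
    """Return responses as a percentage of invitations, to one decimal."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return round(100 * responded / invited, 1)


rate_2011 = response_rate(invited=1996, responded=443)
print(f"Response rate 2011: {rate_2011}%")
```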

Majority of SOEP users from economics and sociology

As in the past, the largest share of respondents again comes from the disciplines of economics (50%) and sociology (33%). They are followed by users from psychology (6%), statistics (4%), and political science (2%). The remaining 6% come from medicine, education, and geography. Most respondents live in Germany (70%) and the European Union (20%). We received 6% and 4% of the responses from North America and other parts of the world, respectively.

Overall, user satisfaction is high: mean satisfaction was 8.3 with the data, 8.6 with data access, and 7.9 with the documentation (possible range from 0 to 10). Only five respondents reported being dissatisfied (values between 0 and 4).

Four out of five use the longitudinal aspect of the SOEP data; one fifth already use SOEPlong

The results on data use show that more than 80% also use the longitudinal aspect of the data. We are very pleased about this, because it indicates that providing our new data format SOEPlong is the right approach and will make working with the SOEP easier for many users. SOEPlong drastically reduces the number of datasets, since all similar datasets are combined. SOEPlong also solves the problem of variable names that differ from wave to wave. Despite being in beta, the new data format is already used by 20% of respondents. In this year’s new data release, we are already providing the second, improved beta version of this format.
We continue to welcome feedback and suggestions.

Teaching version of the SOEP data

The results on the use of the SOEP data in teaching are also very interesting. Although 68% of respondents teach courses, the special teaching version of the SOEP data is used by only 17% of the respondents who teach. As many as 42% of users were not even aware that a special teaching version exists. We will therefore provide more information about the teaching version of the SOEP data in the future.

SOEPinfo known to almost everyone

The user survey also yielded insights regarding SOEPinfo. It showed that only 13% of the responding SOEP users are not familiar with SOEPinfo. Nevertheless, we want to make SOEPinfo even more visible on our homepage in the future. We also want to further improve SOEPinfo’s capabilities. One goal, for example, is to integrate metadata on the SOEPlong format into SOEPinfo in the future.

Around two thirds of SOEP users use Stata

Compared with the 2004 user survey, there has been a change in the software used to work with the SOEP data. Unlike in 2004, most respondents now use Stata; SPSS has lost its leading position. The open-source software R is used by 8% of respondents. Mplus (3%), SAS (3%), and TDA (2%) are rarely used for work with the SOEP data.

Stata users were additionally asked which Stata version they use. The survey showed that three quarters of Stata users have the Stata SE or MP versions. The limited Intercooled version of Stata, which allows only a limited number of usable variables within a dataset and therefore cannot fully open the person dataset of SOEPlong, is used by a considerably smaller group. (Note: it is nevertheless possible to load the required variables in Stata IC by selecting variables directly with the `use' command: `use HID PID varlist using pl.dta'.)

Thank you for your participation!

The importance of regular “communication” with you, the SOEP users, is shown, among other things, by the fact that the survey brought two minor errors in the data to our attention, which we have already been able to correct in the new data release SOEP.v27 (an error in the variable correspondence and an error in an English label). We expressly thank all participants and will do our best to implement the suggestions given to us.

We will survey you again in the future. We plan to do so annually, during the semester break, in the hope that this suits your needs and that even more users take part in the survey. We also want to make the survey more user-friendly and reduce its length. So the next SOEP User Survey is sure to come. We hope that you will take part (again)!