March 23, 2015

SOEP Brown Bag Seminar

'What else are you concerned about?' – Exploiting free text for quantitative social sciences

Date

March 23, 2015
11:00 - 12:00

Location

Gustav-Schmoller-Raum
DIW Berlin im Quartier 110
Room 3.3.002A
Anton-Wilhelm-Amo-Straße 58
10117 Berlin

Speakers

Martin Brümmer and Julia Rohrer (University of Leipzig)

We analyzed 35,000 statements from 14,000 SOEP participants who chose to share worries beyond the preset items by using the unstructured response format. Making free text accessible for quantitative analysis poses specific challenges, some of which are particular to the language and to the peculiarities of the text format. In our talk, we will outline the processing steps required for the worry statements reported in the SOEP. We will then demonstrate different ways to make sense of this kind of text data.

Because using the unstructured response format was not mandatory in the survey, whether an individual chose to express his or her worries is itself informative. As a preliminary step, we will therefore link personal characteristics such as gender, age, and personality traits to the likelihood of reporting worries. We then proceed with analyses at the level of single words: How is the use of frequent words related to age? Which worries indicate that an individual lives in the eastern states of Germany? Are there any worries that are positively associated with life satisfaction? Finally, we suggest Latent Dirichlet Allocation (LDA) as one way to handle the overwhelming diversity of text data by identifying shared topics across statements. Our results comprise 25 areas of concern in the lives of SOEP participants. While some of these topics seem to worry people consistently over time, the rising occurrence of others is linked to contemporary events.
