The German Socio-Economic Panel Study (SOEP) is a representative panel study of the German population, collecting data on a broad variety of topics of everyday life, including general wellbeing, household composition, educational aspirations and educational status, income and occupational biographies, leisure time activities, housing, health, political orientation and more. With its long-running panel structure, the breadth of topics and the representative nature of the data, the SOEP has become a central resource for quantitative research in the social sciences in Germany.

Introduction to Webscraping http://www.diw.de/sixcms/detail.php?id=diw_01.c.844944.de

This short course gives an overview of the most common web scraping techniques. The idea is to have an interactive course in which participants get their hands on actual code and work with it. Therefore, please bring your own computer if possible. The main aim is to cover several approaches needed to scrape different types of data from different websites. By the end, participants should have an idea of how to approach scraping any website they are interested in. As an exercise spanning the entire course, participants are encouraged to choose a website that interests them and try to build a scraper using the code and knowledge they gain during the course.

The code for the course is written in Python. The course also includes a very short introduction to Python, but due to the limited time we will not be able to cover all the Python concepts needed. It would therefore be helpful to look at some preparatory material beforehand to become somewhat familiar with the language. Most importantly, please take the time to install a Python distribution on your computer, along with some of the packages that we will need for the course.

The course is split into four 90-minute sessions over two days. On day 1, we will cover the bulk of the material over three sessions. First, we will give a short introduction to Python and some basic web scraping concepts. Then, we will look at how to gather data if an Application Programming Interface (API) is available. Finally, we will cover techniques for retrieving information from HTML code, such as HTML parsing and text pattern matching. On day 2, we will look into browser automation, a technique used to scrape websites that load their content dynamically. We will then leave some time to discuss issues with your own scraper.
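
To give a flavour of the HTML parsing and text pattern matching mentioned above, here is a minimal sketch. It is not taken from the course material: it uses only the Python standard library (the course may well rely on other packages), and the HTML snippet, the class name LinkCollector and the helper functions are hypothetical and serve only as illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical HTML snippet standing in for a page that has already been downloaded.
SAMPLE_HTML = """
<ul>
  <li><a href="/events/1">Workshop A</a> on 2022-09-12</li>
  <li><a href="/events/2">Workshop B</a> on 2022-10-05</li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from <a> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """HTML parsing: walk the tag structure and return all links."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def extract_dates(html):
    """Text pattern matching: find ISO-style dates with a regular expression."""
    return re.findall(r"\b\d{4}-\d{2}-\d{2}\b", html)


if __name__ == "__main__":
    print(extract_links(SAMPLE_HTML))
    print(extract_dates(SAMPLE_HTML))
```

The parser follows the structure of the markup, which tends to be more robust for nested elements, while the regular expression is convenient for small, regular patterns such as dates inside otherwise unstructured text.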

The time gap between the last session of day 1 and the session on day 2 is meant to give you some time to work on your own scraping projects if you so wish.

To join the minicourse, please register with Daniela Centemero (dcentemero@diw.de).

Annual Conference of the Verein für Socialpolitik 2022 http://www.diw.de/sixcms/detail.php?id=diw_01.c.849087.de

The availability of extensive datasets from administrative and corporate sources on the economic decisions of private households and firms is fundamentally changing empirical research in the social sciences.
Economists and statisticians are developing new methods that provide answers to long-standing empirical questions and enable considerably faster and more precise research. At the same time, new data and methods are also changing how public administrations and companies work, which raises new and challenging questions for policymakers and society.

Text Mining and Machine Learning with Application to Macro http://www.diw.de/sixcms/detail.php?id=diw_01.c.839766.de

The DIW Graduate Center is pleased to offer a masterclass on text mining by Stephen Hansen, Imperial College London. Details on the dates of the events will be announced in due time.

The 13th Berlin IO Day http://www.diw.de/sixcms/detail.php?id=diw_01.c.703405.de

The Berlin IO Day is a one-day workshop sponsored by the Berlin Centre for Consumer Policies (BCCP) and the Society of Friends at DIW Berlin (VdF) and supported by Berlin's leading academic institutions, including DIW Berlin, ESMT Berlin, Freie Universität Berlin, Humboldt-Universität zu Berlin, and Technische Universität Berlin. The aim is to create an international forum for high-quality research in Industrial Organization in the heart of Berlin, one of Europe's most vibrant and intellectually lively cities.

Workshop on the Integration of Refugee Families in Host Countries: Research Advances, Policy Improvements, and Data Challenges http://www.diw.de/sixcms/detail.php?id=diw_01.c.840961.de

This one-day workshop aims to bring together researchers working on various aspects of the integration of refugee families into host societies and to discuss the most recent research developments in this field. It also aims to discuss empirical research, data collection, and policy challenges in view of the new waves of refugees expected to arrive in Europe in the near future. To this end, the workshop will also include selected presentations on the experience and results of the IAB-BAMF-SOEP Refugee Survey, a representative survey of refugees in Germany that has been conducted as a longitudinal survey since 2016.

The workshop is financed by the project “Representative Sample of Refugee Families in Germany (GeFam2)”.