Random Forests for Labor Market Analysis: Balancing Precision and Interpretability

SOEPpapers 1230, 29 pp.

Daniel Graeber, Lorenz Meister, Carsten Schröder, Sabine Zinn

2025

Abstract

Machine learning is increasingly used in social science research, especially for prediction. However, its results are sometimes not as straightforward to interpret as those of classic regression models. In this paper, we address this trade-off by comparing the predictive performance of random forests and logit regressions in an analysis of labor market vulnerabilities during the COVID-19 pandemic, and by using a global surrogate model to enhance our understanding of the complex dynamics. Our study shows that, especially in the presence of non-linearities and feature interactions, random forests outperform regressions in both predictive accuracy and interpretability, yielding policy-relevant insights on vulnerable groups affected by labor market disruptions.
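The workflow named in the abstract (random forest versus logit, followed by a global surrogate) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic rather than SOEP data, the feature names are hypothetical, scikit-learn is chosen only for convenience, and the hyperparameters are not taken from the paper. The surrogate here is a shallow decision tree fitted to the forest's predictions, which is one common way to build a global surrogate and not necessarily the authors' choice.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (NOT the paper's data). The outcome is an
# interaction of two features: it equals 1 only when both are positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 4))
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)
feature_names = ["x0", "x1", "x2", "x3"]  # hypothetical names

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Main-effects logit versus random forest, compared on held-out AUC.
logit = LogisticRegression().fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

auc_logit = roc_auc_score(y_test, logit.predict_proba(X_test)[:, 1])
auc_forest = roc_auc_score(y_test, forest.predict_proba(X_test)[:, 1])

# Global surrogate: a shallow tree trained to mimic the forest's
# predictions (not the true labels), so its rules describe the forest.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, forest.predict(X_train))

# Fidelity: share of held-out cases where surrogate and forest agree.
fidelity = float(np.mean(surrogate.predict(X_test) == forest.predict(X_test)))

print(f"AUC logit:  {auc_logit:.3f}")
print(f"AUC forest: {auc_forest:.3f}")
print(f"Surrogate fidelity: {fidelity:.3f}")
print(export_text(surrogate, feature_names=feature_names))
```

The printed tree rules are what makes the forest readable: a surrogate is only as informative as its fidelity, so the agreement rate should be checked before interpreting the rules.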

Lorenz Meister

Doctoral Student in the Socio-Economic Panel (SOEP) research infrastructure

Sabine Zinn

Director of SOEP in the Socio-Economic Panel (SOEP) research infrastructure

Daniel Graeber

Research Associate in the Socio-Economic Panel (SOEP) research infrastructure

Carsten Schröder

Head of the Applied Panel Analysis division in the Socio-Economic Panel (SOEP) research infrastructure



JEL Classification: C45; C53; C25; J08; I18; C83; J21
Keywords: Machine learning, interpretability, labor market, random forests
