The Effect of Self-Employment on Income Inequality

It is well known that the self-employed are over-represented at the bottom as well as the top of the income distribution. This paper shifts the focus from the income situation of the self-employed to the distributive effects of a change in self-employment rates. Using representative German data and unconditional quantile regression analysis, we show that an increase in the proportion of self-employed individuals in the labor force increases income polarization by tearing down floors at the bottom and allowing higher earnings potential at the very top of the hourly income distribution. Recentered influence function regressions of inequality measures corroborate that self-employment is a source of income inequality in the labor market.


Introduction
A considerable part of active labor market policy aims at fostering entrepreneurship or self-employment. There is, however, no clear consensus that a rise in self-employment rates leads to higher prosperity or GDP (Blanchflower, 2000). In addition, it has become common knowledge that the median self-employed individual earns less than the median employee (Hamilton, 2000). For this reason, the majority of the self-employed are worse off in terms of pay when compared to employees. The self-employed, however, achieve higher average incomes than paid employees, which is usually due to superstar entrepreneurs with very high incomes. In fact, the literature shows that the self-employed are over-represented in the lower as well as the upper tail of the income distribution (Astebro et al., 2011). If, however, only a few superstar entrepreneurs are responsible for the higher average incomes, while most self-employed achieve lower than average incomes, entrepreneurship policy should also pay attention to distributive effects. In this paper, we therefore ask how, and to what extent, a rise in self-employment shapes the income distribution.
In the early years of the 21st century, economists renewed their interest in (income) inequality. Most research has concentrated on the effects of individual skills, technological change, or the process of globalization, which, however, are only part of a very complex story (e.g., Atkinson, 2003; Autor, 2014). This paper shifts the focus to the employment status of individuals because self-employment and entrepreneurship create substantial potential to become extremely wealthy, which ultimately contributes to income inequality. In fact, we observe a correlation between the share of self-employed in a country's workforce and income inequality (see Figure 1). We therefore shed light on the distributive effects of an increase in self-employment rates. As the correlation between self-employment rates and inequality is especially pronounced in high-income countries, we examine the mechanisms behind this pattern with German data.
Insert Figure 1 about here
Most studies on income differences between paid employees and the self-employed apply conditional quantile regression to analyze whether entrepreneurship pays. This is indeed a powerful method to examine the effects of self-employment on the conditional distribution of incomes. However, this procedure usually does not allow conclusions about treatment effects on individuals, but allows for statements about the income distribution as a whole. Political interest, in turn, usually focuses on the question of how a shift in self-employment rates alters the unconditional income distribution, that is, on the distributive effects. We therefore address the effect of an increase in the self-employment rate on the hourly income distribution with the unconditional quantile regression approach based on recentered influence function (RIF) regression (Firpo et al., 2009). In addition, we apply this methodology to examine whether changes in self-employment also affect income inequality.
The contribution of this paper to the literature is threefold: First, we analyze the income situation of the self-employed in comparison to paid employees. Second, we investigate the effect of a change in the rate of self-employment on the hourly income distribution. Third, we examine the relationship between self-employment and income inequality. With representative German data for the year 2015, we corroborate that the self-employed are over-represented at the bottom as well as at the top of the hourly income distribution. Based on our RIF regression results, we find that a rise in the share of self-employed without any employees (solo self-employed) has adverse effects for the bottom 50% of the workforce. A higher share of self-employed with employees (employers), in turn, tends to increase hourly incomes among the top earners. In combination, a rise in self-employment bears the potential for income polarization. We furthermore show that both types of self-employment significantly affect income inequality.
The German Socio-Economic Panel (SOEP) is a longitudinal survey of more than 10,000 private households in Germany and is provided by the German Institute for Economic Research (DIW Berlin). Basic data characteristics are described in Wagner et al. (2007) and Goebel et al. (2018). The SOEP contains variables on demography, employment, and the household. Note that other representative data sets are also available in Germany. Recently, Sorgner et al. (2017) utilized the German Micro-Census in their study comparing the incomes of self-employed and paid employees. This data set records monthly individual incomes in 24 groups of uneven size, with categories ranging from 0-150 Euro to more than 18,000 Euro. In the SOEP, in turn, income is reported on a cardinal scale. The SOEP is therefore preferable because the uneven categorization and right censoring in the Micro-Census would severely restrict our analysis of income inequality.
The dependent variable in our analysis is hourly gross income. Precisely, the reported gross income earned in the month before the interview is divided by actual working time. Because the survey records weekly working time, we multiply this variable by the factor 4.29 to obtain monthly working hours. Our central variable of interest describes the employment status of respondents: individuals are asked to report whether they are paid employees or self-employed with or without employees. Germany experienced a rise in self-employment levels, which was mainly driven by an increase in solo self-employment (Brenke, 2013; Fritsch et al., 2015). According to Metzger (2015), 58.6% of full-time founders in 2015 can be classified as solo entrepreneurs. We therefore concentrate on self-employed without any employees and those with employees. Note that the hourly income distribution differs distinctly by occupational status (see Figure 2 and Sorgner et al., 2017).
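The construction of the dependent variable can be sketched in a few lines. This is only an illustration of the normalization described above; the function name and the example respondent are hypothetical and not taken from the SOEP.

```python
WEEKS_PER_MONTH = 4.29  # conversion factor stated in the text


def hourly_gross_income(monthly_gross_income: float, weekly_hours: float) -> float:
    """Monthly gross income divided by approximate monthly working hours."""
    if weekly_hours <= 0:
        raise ValueError("weekly_hours must be positive")
    monthly_hours = weekly_hours * WEEKS_PER_MONTH
    return monthly_gross_income / monthly_hours


# Hypothetical respondent: 3,000 Euro gross per month, 40 actual hours per week.
example_hourly_income = hourly_gross_income(3000.0, 40.0)
print(f"{example_hourly_income:.2f} Euro per hour")
```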
Insert Figure 2 about here
The SOEP includes information on demographics as well as employment history and household composition. In this study, we include a comprehensive set of control variables. These comprise age (squared), sex, nationality (German / non-German), marital status (married / single / other), children under 16 years in the household (yes / no), and a regional indicator for the federal state in which the respondent lives. The educational level is also accounted for by dummy variables (primary education or a lower secondary degree / upper secondary degree / tertiary degree). We furthermore control for a multitude of labor-related variables, such as labor market experience in part-time as well as full-time jobs (measured in years) and years in unemployment. A further central control variable is the time spent in work (hours of work, also accounted for in Sorgner et al., 2017), as the sum of working hours ultimately determines the monthly income. We additionally control for tenure (in years): For the self-employed, it captures experience in the current self-employed work, while for employees, it describes the time with the current employer. Both aspects are highly correlated with income and salary development. Descriptive statistics on all variables included in the analysis are presented in Table 1.
Insert Table 1 about here
We follow Sorgner et al. (2017) and conduct a cross-sectional analysis. In fact, we consider the latest year of the underlying data version and examine the year 2015. The analysis is restricted to individuals who report being full-time employed in private industries, that is, in NACE codes ranging from 10 to 82. Also note that the analysis does not account for civil servants, as the relation between gross and net incomes is distinctly different from that of other employees and the self-employed. Finally, the analysis is restricted to individuals aged between 19 and 65 years.
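The sample restrictions listed above could be expressed as a simple filter. The sketch below is a hypothetical illustration: the field names are invented for this example and do not correspond to SOEP variable names, and treating the age bounds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical field names, not SOEP variable names.
    age: int
    full_time: bool
    civil_servant: bool
    nace_code: int


def in_sample(r: Respondent) -> bool:
    """Apply the restrictions described in the text (age bounds assumed inclusive)."""
    return (
        r.full_time
        and not r.civil_servant
        and 10 <= r.nace_code <= 82
        and 19 <= r.age <= 65
    )


respondents = [
    Respondent(age=40, full_time=True, civil_servant=False, nace_code=25),
    Respondent(age=70, full_time=True, civil_servant=False, nace_code=25),
    Respondent(age=35, full_time=True, civil_servant=True, nace_code=84),
]
sample = [r for r in respondents if in_sample(r)]
```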

Methodology
Conditional quantile regression helps to understand the impact of covariates along the distribution of an outcome. This approach acknowledges that different characteristics might have a different impact among low- and high-income earners. For this reason, the methodology is popular in economic studies that assess the impact of a variable on a quantile or percentile of the outcome (conditional on other variables). It has also been applied in a multitude of studies analyzing the income of the self-employed in comparison to paid employees (among others Hamilton, 2000; Sorgner et al., 2017). Potentially heterogeneous effects, as in the case of self-employment, where the self-employed at the bottom (top) are worse (better) off than employees, do not, however, imply that an increase in self-employment has a stronger effect for low (high) income earners, but only for conditionally low (high) income earners. Therefore, the results do not necessarily suggest that the unconditional income distribution is more dispersed.
Quantile regression is a powerful method to examine the effects of self-employment on the conditional distribution of earnings. The political interest, however, mostly lies in the distributive effects of shifting self-employment rates. Such questions can be addressed by estimating unconditional quantile regressions. The unconditional distribution can be thought of as the product of the conditional distribution of income on self-employment and the marginal distribution of self-employment (cf. Alejo et al., 2014). The effect of an increase in self-employment therefore depends on the interaction between the marginal distribution of self-employment and the conditional distribution of income. As pointed out by Alejo et al. (2014, p. 55), "[t]he step from conditional to unconditional distributive effects is not a trivial one, and only recently there are available specific statistical tools to study them. The [...] literature on unconditional quantile regressions (Firpo, Fortin, and Lemieux (2009)) based on the concept of the recentered influence function, seems to provide a natural and important step towards this goal."
The RIF approach is based on the properties of the influence function (IF) (Firpo et al., 2009), which is used in the robust statistics literature (Hampel et al., 1986). The IF is an analytical tool used to examine the effect or influence of adding an observation on the value of a statistic $\nu(F_Y)$ without the need to recalculate the particular statistic (Borah and Basu, 2013). Firpo et al. (2009) define the RIF as shown in equation (1).
$$\mathrm{RIF}(Y; \nu, F_Y) = \nu(F_Y) + \mathrm{IF}(Y; \nu, F_Y) \tag{1}$$
$Y$ describes a random variable with cumulative distribution function $F_Y(y)$; in our case, $Y$ is hourly income. $\nu(F_Y)$ is the functional of interest and is used to recenter the influence function.
The influence function of a specific quantile $\tau$ of the income distribution is shown in equation (2).
$$\mathrm{IF}(Y; q_\tau, F_Y) = \frac{\tau - I(Y \le q_\tau)}{f_Y(q_\tau)} \tag{2}$$
$q_\tau$ describes the specific percentile of the unconditional distribution of hourly income. $f_Y(q_\tau)$ stands for the probability density function of income evaluated at $q_\tau$. $I(Y \le q_\tau)$ is an indicator variable, which reveals whether hourly income is less than or equal to $q_\tau$. The final RIF in our case is presented in equation (3).
$$\mathrm{RIF}(Y; q_\tau, F_Y) = q_\tau + \frac{\tau - I(Y \le q_\tau)}{f_Y(q_\tau)} \tag{3}$$
The RIF regression approach is also adequate to measure inequality. For example, IFs are also available for the variance, the Gini coefficient, or other measures of inequality.
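As a rough illustration of how a quantile RIF can be computed and used, the sketch below relies only on the Python standard library. Several choices are assumptions made for the example and are not taken from the paper: the Gaussian kernel, the normal-reference bandwidth, the toy income values, and the use of a single 0/1 regressor without controls (in which case the OLS slope equals the difference in group means of the RIF). The variance RIF, which reduces to the squared deviation from the mean, is included for comparison.

```python
import math
import statistics


def empirical_quantile(values, tau):
    """Order-statistic quantile without interpolation."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[index]


def gaussian_kde_at(values, point, bandwidth=None):
    """Gaussian kernel density estimate of f_Y evaluated at `point`."""
    n = len(values)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not the paper's choice.
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((point - v) / bandwidth) ** 2) for v in values)


def rif_quantile(values, tau):
    """RIF of the tau-th quantile: q + (tau - 1{y <= q}) / f(q)."""
    q = empirical_quantile(values, tau)
    density = gaussian_kde_at(values, q)
    return [q + (tau - (1.0 if y <= q else 0.0)) / density for y in values]


def rif_variance(values):
    """RIF of the variance: the squared deviation from the mean."""
    mu = statistics.fmean(values)
    return [(y - mu) ** 2 for y in values]


def dummy_rif_coefficient(rif_values, dummy):
    """OLS slope of the RIF on one 0/1 regressor (difference in group means)."""
    treated = [r for r, d in zip(rif_values, dummy) if d == 1]
    control = [r for r, d in zip(rif_values, dummy) if d == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


# Toy hourly incomes and a hypothetical self-employment indicator.
incomes = [float(v) for v in range(1, 11)]
self_employed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
median_rif = rif_quantile(incomes, 0.5)
effect_on_median = dummy_rif_coefficient(median_rif, self_employed)
```

By construction, the sample mean of the RIF recovers the statistic itself, which is what allows an OLS regression of the RIF on covariates to be read as an effect on the unconditional statistic.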
Hence, one might use these IFs and run RIF regressions based on the corresponding influence functions (see Choe and Van Kerm, 2018; Firpo et al., 2018). In this paper, we start with an examination of the effect of a rise in self-employment on the variance of hourly income. A higher variance is indicative of larger deviations from the mean and therefore higher inequality. We, moreover, apply the Gini index, the general entropy index, and the Atkinson inequality measure, all of which are prominent measures of wealth and income inequality (Cowell and Van Kerm, 2015). We utilize the Gini index because it is one of the most popular measures in research on inequality. It ranges between zero and one, where one describes perfect inequality. As one might expect distinct results at the bottom as well as at the top of the income distribution (Halvarsson et al., 2018), we also apply inequality measures that are sensitive to changes in different parts of the hourly earnings distribution.
In this regard, we calculate the RIFs for two general entropy measures, where the Theil index is more sensitive to differences at the top of the hourly income distribution than the mean log deviation. Finally, the Atkinson index allows us to choose the part of the earnings distribution to which the measure is most sensitive by changing the parameter $\varepsilon$. A higher $\varepsilon$ implies rising sensitivity to changes at the bottom of the distribution. All the inequality measures have in common that higher values represent a higher level of inequality. Hence, a positive coefficient in the RIF regression is associated with a higher level of inequality.
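For orientation, the inequality measures named above can be computed from a list of strictly positive incomes as in the following sketch. These are the textbook sample formulas, written here as an illustration; they are not the paper's estimation code, and the two-person example is hypothetical.

```python
import math


def gini(values):
    """Gini coefficient via the sorted-rank formula."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum(rank * y for rank, y in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def theil(values):
    """Theil index, the general entropy measure GE(1)."""
    mu = sum(values) / len(values)
    return sum((y / mu) * math.log(y / mu) for y in values) / len(values)


def mean_log_deviation(values):
    """Mean log deviation, the general entropy measure GE(0)."""
    mu = sum(values) / len(values)
    return sum(math.log(mu / y) for y in values) / len(values)


def atkinson(values, epsilon):
    """Atkinson index; larger epsilon weights the bottom more heavily."""
    n = len(values)
    mu = sum(values) / n
    if epsilon == 1:
        equally_distributed = math.exp(sum(math.log(y) for y in values) / n)
    else:
        equally_distributed = (sum(y ** (1 - epsilon) for y in values) / n) ** (
            1 / (1 - epsilon)
        )
    return 1.0 - equally_distributed / mu


# Hypothetical two-person economy with hourly incomes of 1 and 3.
toy = [1.0, 3.0]
measures = {
    "gini": gini(toy),
    "theil": theil(toy),
    "mld": mean_log_deviation(toy),
    "atkinson_0.5": atkinson(toy, 0.5),
    "atkinson_1": atkinson(toy, 1),
    "atkinson_2": atkinson(toy, 2),
}
```

All measures are zero for a perfectly equal distribution and positive otherwise, which matches the reading of positive RIF coefficients as higher inequality.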

Results
This section presents the central results. We start with the results obtained with the conditional quantile approach and then turn to the results estimated by RIF regressions. The results of the conditional quantile regressions reveal that the solo self-employed frequently obtain lower hourly incomes than paid employees (see Table 2). The estimated coefficients are negative up to decile 6. This implies that the solo self-employed are worse off than paid employees up to about the 60th percentile of the income distribution. The effects, however, are statistically significant only up to the fourth decile, which implies that the solo self-employed earn significantly lower hourly incomes than paid employees up to a point somewhere between the 40th and the 50th percentile. Self-employed with employees are less frequently worse off in terms of hourly incomes when compared to paid employees. Turning to the very top of the income distribution, self-employed individuals generally obtain higher hourly incomes than paid employees. This holds true for both the solo self-employed and the employers.
Insert Table 2 about here
We now shift the focus to the policy perspective and address the question of how an increase in self-employment rates changes the income distribution. This question is addressed with the RIF regression approach. Up to the 7th decile, an increase in solo self-employment shifts the hourly income distribution to the left (see Table 3). As the effect is statistically significant up to the median, an increase in self-employment decreases hourly earnings at least for the bottom 50% of the distribution. More specifically, the coefficient of -2.5259 in specification (1) implies that an increase in solo self-employment from 3.72% to 4.72% reduces incomes in the lowest decile by about 27.09% (= 2.5259 / 9.3240 * 100%). In the fifth decile, a one percentage point increase in solo self-employment reduces hourly earnings by about 7.77% (= 1.3236 / 17.0325 * 100%). As the effects are statistically significant as well as economically relevant, we conclude that an increase in solo self-employment has considerable adverse effects for the bottom 50% of the full-time workforce. Specification (9) in Table 3 also points to positive effects for the top 10% of earners. Although the relative effect of an increase in the share of solo self-employed is meaningful (9.16% = 2.9889 / 32.6178 * 100%), the coefficient is statistically insignificant due to the comparably high standard error.
Insert Table 3 about here
An increase in the share of employers has statistically significant as well as economically meaningful negative effects for the bottom 10% of the distribution. An increase in employers reduces hourly earnings at the very bottom by 16.48% (= 1.5366 / 9.3240 * 100%). In combination with the results for the solo self-employed, a rise in self-employment seems to tear down floors at the very bottom of the hourly income distribution. A rising share of employers, however, also has positive income effects and shifts the income distribution for earners above the 6th decile to the right. When the share of employers increases by one percentage point, hourly incomes among the top 10% increase by 32.58% (= 10.6277 / 32.6178 * 100%).
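The relative effects quoted for Table 3 are ratios of a RIF coefficient to the corresponding unconditional decile of hourly income. The short sketch below simply reproduces that arithmetic from the figures given in the text. The helper name is hypothetical, and where the text reports only magnitudes, the signs are inferred from the wording ("reduces", "increase"); they do not affect the percentages because absolute values are used.

```python
def relative_effect(coefficient: float, quantile_value: float) -> float:
    """Absolute coefficient as a percentage of the unconditional quantile."""
    return abs(coefficient) / quantile_value * 100.0


# (coefficient, unconditional quantile of hourly income) as quoted in the text.
reported = {
    "solo self-employed, lowest decile": (-2.5259, 9.3240),
    "solo self-employed, fifth decile": (-1.3236, 17.0325),
    "solo self-employed, top 10%": (2.9889, 32.6178),
    "employers, lowest decile": (-1.5366, 9.3240),
    "employers, top 10%": (10.6277, 32.6178),
}

relative_effects = {
    label: round(relative_effect(coef, q), 2) for label, (coef, q) in reported.items()
}
for label, pct in relative_effects.items():
    print(f"{label}: {pct:.2f}%")
```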
The results shown in Table 3 clearly suggest that an increase in the share of solo self-employed tends to have adverse effects at the bottom of the hourly income distribution. In contrast, an increase in employers tends to increase wage potential for individuals with hourly incomes above the median. Self-employment thus appears to be a source of income polarization as well as inequality in the labor market. In order to draw more robust inference, we examine the effect of an increase in self-employment on income inequality by estimating a variety of RIF regressions of inequality measures. Specification (1) in Table 4 suggests that an increase in self-employed without employees has a positive, but statistically insignificant effect on the variance of hourly earnings, while an increase in the rate of employers increases wage dispersion. Estimation of the RIF regression with respect to the Gini index implies that an increase in the rate of both types of self-employment leads to a rise in inequality (specification (2)). Precisely, the Gini increases by 39.13% (= 0.1118 / 0.2857 * 100%) when the share of solo self-employment increases by one percentage point. The coefficient on employers implies that inequality doubles when the share of employers increases from 5.08% to 6.08% (0.2946 / 0.2857 * 100% = 103.12%). The estimates based on the general entropy measures shown in specifications (3) and (4) also corroborate that an increase in self-employment significantly contributes to income inequality. Finally, the estimated coefficients regarding the Atkinson inequality measures are presented in specifications (5) to (7). The coefficients of both groups of self-employed increase with rising $\varepsilon$. This also holds for the relative effects of the solo self-employed. The relative effects of employers, in contrast, decrease with increasing $\varepsilon$. This corroborates that solo self-employment is likely to introduce higher inequality by shifting the bottom, while employers are likely to increase inequality at the top of the income distribution.
Insert Table 4 about here

Conclusion
Our contribution to the literature is threefold: First, we examine the income situation of the self-employed in comparison to paid employees. Second, we study the effects of a change in the rate of self-employment on the income distribution. Finally, we investigate the role of self-employment with regard to income inequality. Our analysis is based on the German SOEP for the survey year 2015. With respect to the first point, we confirm prior findings that many self-employed are worse off in hourly earnings when compared to paid employees (e.g., Hamilton, 2000). The pattern, however, becomes more differentiated when we distinguish between the solo self-employed and self-employed who also managed to create jobs for others. Specifically, we show that especially the solo self-employed are worse off in terms of hourly earnings, while employers are less common at the bottom of the hourly income distribution. Self-employed individuals, in turn, can also be found at the very top of the earnings distribution, which is especially likely among employers. This result basically corroborates that the self-employed are over-represented at the bottom as well as at the top of the income distribution (Astebro et al., 2011).
Besides the income situation of the self-employed, we also analyze whether and how an increase in self-employment affects the hourly income distribution. Our RIF regression results suggest that an increase in solo self-employment reduces hourly incomes for the bottom 50% [...] We basically confirm this pattern by separating solo self-employed and employers with German data.
In the German context, our results as well as the rise in solo self-employment (Brenke, 2013; Fritsch et al., 2015) suggest that the increase in self-employment was largely due to entry into the bottom of the earnings distribution. Therefore, a promising avenue for future research is the analysis of occupational choice. In this regard, the literature has found, for instance, that entrepreneurs face finance and liquidity constraints (Blanchflower and Oswald, 1998). If we assume that the quality of a business is positively correlated with start-up costs, then initial wealth inequality may be a reason for long tails in the earnings distribution of entrepreneurs, because only the richer households may be able to gain access to the good opportunities. One might also study whether and how (private) start-up financing might help dampen adverse effects associated with occupational choice, liquidity constraints, and initial wealth inequality. This paper, moreover, contributes to the literature on active labor market policy that aims at raising self-responsibility and fostering self-employment out of unemployment. In fact, most subsidized start-ups are created by single founders or solo entrepreneurs. This particular group is also likely to remain in the state of solo self-employment (Caliendo et al., 2012).