Abstract: The rise of dominant firms in data-driven industries is often credited to their alleged data advantage, but empirical evidence supporting this conjecture is lacking. In this paper, we show that data, as an input into machine learning tasks, displays features that favor the hypothesis that data is a source of market power. We study search result quality for search keywords on Yahoo!. Search result quality improves when more users search a keyword. In addition to this direct network effect, we observe a further externality caused by the amount of data that the search engine collects on the users: more data on the users reinforces the direct network effect. We propose to view this reinforcement effect, which stems from additional user-specific data, as a data network effect. Our findings are consistent with the consensus that data displays diminishing returns to scale for a given prediction task. This feature of data is often regarded as incompatible with the hypothesis that data is a source of market power. Our results rationalize the market power hypothesis through a different mechanism by suggesting that data, in addition to being an input, is also a technology shifter.
Maximilian Schäfer, DIW Berlin
Topics: Competition and Regulation, Consumers, Digitalization, Markets