Learning from Data and Network Effects: The Example of Internet Search

Discussion Papers 1894, 62 pp.

Maximilian Schäfer, Geza Sapi


Download (PDF, 1.06 MB)


The rise of dominant firms in data-driven industries is often credited to their alleged data advantage. Empirical evidence supporting this conjecture is surprisingly scarce. In this paper we document that data, as an input into machine learning tasks, display features that support the claim that data are a source of market power. We study how data on keywords improve search result quality on Yahoo!. Search result quality increases when more users search a keyword. In addition to this direct network effect caused by more users, we observe a novel externality caused by the amount of data that the search engine collects on those particular users. More data on users' personal search histories reinforce the direct network effect stemming from the number of users searching the same keyword. Our findings imply that a search engine with access to longer user histories may improve the quality of its search results faster than an otherwise equally efficient rival with a user base of the same size but access to shorter user histories.

Maximilian Schäfer

Ph.D. Student in the Firms and Markets Department

JEL-Classification: L12, L41, L81, L86
Keywords: Competition, network effects, search engines, Big Data