Working paper Open Access
The central question we address in this paper is how to do phonology in an emerging era of big data. More specifically, we explore how to make better use of naturalistic corpus data in the study of phonology. We support the growing trends that are expanding the range of phenomena phonologists investigate and enhancing the richness of detail with which investigations are conducted.
Presenting case studies from English, Indonesian, and Romanian, we argue that the use of corpus data necessarily follows from the goals of the generative enterprise. At the same time, experimental and laboratory investigations are crucial for exploring both phonological patterns and individual speaker differences fully and systematically, as we show with case studies of English, Italian, and Catalan.
We advocate for an iterative model of phonological analysis integrating careful data elicitation with both corpus analysis and experimental methods.