Large corpora of text serve as social sensors, capturing what people discuss at a scale and richness that neither qualitative nor survey research designs can match. Such corpora are well suited to studying social dynamics and collective phenomena as they unfold in interactive social contexts. Sociological research, the authors argue, is transitioning from using text analysis as an end in itself to treating it as a means of advancing social theory.
Treating text-analytic methods as measurement strategies that extract sociologically relevant information from unstructured language data, the authors organize approaches into data-first, theory-first, and theory–data integration paradigms, showing how each balances inductive discovery with theoretical specification. Data-first approaches emphasize openness to discovery, surfacing latent features that are then interpreted through a theoretical lens. Theory-first approaches require researchers to specify in advance which linguistic features or categories to measure, lending themselves to deductive theory testing. Theory–data integration approaches occupy the middle ground, combining theory-driven modeling with openness to unexpected patterns. While the rise of large language models has democratized text analysis, enabling researchers to deploy the same foundation models across all three paradigms, the authors caution that challenges around reproducibility, bias, and prompt sensitivity require researchers to remain anchored in theoretically grounded measurement.
The article demonstrates how computational text analysis expands sociological inquiry along two dimensions. First, these methods enable thick description at scale by combining pattern recognition across large corpora with interpretive close reading, thereby complementing approaches such as grounded theory, abductive analysis, and forensic social science. Second, the article examines the growing incorporation of text data into causal inference, reviewing strategies for using text as outcome, treatment, or confounder, while highlighting persistent methodological challenges stemming from the high dimensionality of text and the iterative nature of text modeling.
Computational text analysis is, in the authors' view, not merely a technical innovation but a substantive expansion of what sociology can measure, model, and interpret. By bridging quantitative scale with qualitative depth, it has the potential to revitalize interpretive approaches in sociology and open new avenues for studying phenomena that have long resisted systematic empirical inquiry.