Search Machine Learning Repository: Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment
Authors: Jason Chuang, Sonal Gupta, Christopher Manning and Jeffrey Heer
Conference: Proceedings of the 30th International Conference on Machine Learning (ICML-13)
Year: 2013
Pages: 612-620
Abstract: The use of topic models to analyze domain-specific texts often requires manual validation of the latent topics to ensure they are meaningful. We introduce a framework to support large-scale assessment of topical relevance. We measure the correspondence between a set of latent topics and a set of reference concepts to quantify four types of topical misalignment: junk, fused, missing, and repeated topics. Our analysis compares 10,000 topic model variants to 200 expert-provided domain concepts, and demonstrates how our framework can inform choices of model parameters, inference algorithms, and intrinsic measures of topical quality.
[pdf] [BibTeX]

authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).