Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment
Authors: Jason Chuang, Sonal Gupta, Christopher Manning and Jeffrey Heer
Conference: Proceedings of the 30th International Conference on Machine Learning (ICML-13)
Abstract: The use of topic models to analyze domain-specific texts often requires manual validation of the latent topics to ensure they are meaningful. We introduce a framework to support large-scale assessment of topical relevance. We measure the correspondence between a set of latent topics and a set of reference concepts to quantify four types of topical misalignment: junk, fused, missing, and repeated topics. Our analysis compares 10,000 topic model variants to 200 expert-provided domain concepts, and demonstrates how our framework can inform choices of model parameters, inference algorithms, and intrinsic measures of topical quality.