Search Machine Learning Repository: Demystifying Information-Theoretic Clustering
Authors: Greg V. Steeg, Aram Galstyan, Fei Sha and Simon Dedeo
Conference: Proceedings of the 31st International Conference on Machine Learning (ICML-14)
Year: 2014
Pages: 19-27
Abstract: We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based on maximizing mutual information between data and cluster labels. We demonstrate that this intuition suffers from a fundamental conceptual flaw that causes clustering performance to deteriorate as the amount of data increases. Instead, we return to the axiomatic foundations of information theory to define a meaningful clustering measure based on the notion of consistency under coarse-graining for finite data.
[pdf] [BibTeX]

authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).