Search Machine Learning Repository:
Clustering in the Presence of Background Noise
Authors: Shai Ben-david and Nika Haghtalab
Conference: Proceedings of the 31st International Conference on Machine Learning (ICML-14)
Abstract: We address the problem of noise management in clustering algorithms. Namely, issues that arise when on top of some cluster structure the data also contains an unstructured set of points. We consider how clustering algorithms can be ``robustified" so that they recover the cluster structure in spite of the unstructured part of the input. We introduce some quantitative measures of such robustness that take into account the strength of the embedded cluster structure as well was the mildness of the noise subset. We propose a simple and efficient method to turn any centroid-based clustering algorithm into a noise-robust one, and prove robustness guarantees for our method with respect to these measures. We also prove that more straightforward ways of ``robustifying'' clustering algorithms fail to achieve similar guarantees.
authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).