Search Machine Learning Repository: Loss-Proportional Subsampling for Subsequent ERM
Authors: Paul Mineiro and Nikos Karampatziakis
Conference: Proceedings of the 30th International Conference on Machine Learning (ICML-13)
Year: 2013
Pages: 522-530
Abstract: We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk. The sampling only considers a subset of the ultimate (unknown) hypothesis set, but can nonetheless guarantee that the final excess risk will compare favorably with utilizing the entire original data set. We demonstrate the practical benefits of our approach on a large dataset which we subsample and subsequently fit with boosted trees.
[pdf] [BibTeX]

authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).