Search Machine Learning Repository: Scalable Optimization of Neighbor Embedding for Visualization
Authors: Zhirong Yang, Jaakko Peltonen and Samuel Kaski
Conference: Proceedings of the 30th International Conference on Machine Learning (ICML-13)
Year: 2013
Pages: 127-135
Abstract: Neighbor embedding (NE) methods have found their use in data visualization but are limited in big data analysis tasks due to their O(n^2) complexity for n data samples. We demonstrate that the obvious approach of subsampling produces inferior results and propose a generic approximated optimization technique that reduces the NE optimization cost to O(n log n). The technique is based on realizing that in visualization the embedding space is necessarily very low-dimensional (2D or 3D), and hence efficient approximations developed for n-body force calculations can be applied. In gradient-based NE algorithms the gradient for an individual point decomposes into ``forces'' exerted by the other points. The contributions of close-by points need to be computed individually but far-away points can be approximated by their ``center of mass'', rapidly computable by applying a recursive decomposition of the visualization space into quadrants. The new algorithm brings a significant speed-up for medium-size data, and brings ``big data'' within reach of visualization.
[pdf] [BibTeX]

authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).