
Accelerating t-SNE using tree-based algorithms

Published: 01 January 2014

Abstract

The paper investigates the acceleration of t-SNE, an embedding technique commonly used to visualize high-dimensional data in scatter plots, by means of two tree-based algorithms. In particular, it develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N) time. Our experiments show that the resulting algorithms substantially accelerate t-SNE and make it possible to learn embeddings of data sets containing millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.


    Published In

    The Journal of Machine Learning Research  Volume 15, Issue 1
    January 2014
    4085 pages
    ISSN:1532-4435
    EISSN:1533-7928

    Publisher

    JMLR.org


    Author Tags

    1. Barnes-Hut algorithm
    2. dual-tree algorithm
    3. embedding
    4. multidimensional scaling
    5. space-partitioning trees
    6. t-SNE


