
Accelerating t-SNE using tree-based algorithms

Published: 01 January 2014


The paper investigates the acceleration of t-SNE, an embedding technique commonly used to visualize high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N) time. The experiments show that the resulting algorithms substantially accelerate t-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of t-SNE appears to outperform the dual-tree variant.
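
To make the Barnes-Hut idea in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It approximates only the repulsive part of the t-SNE gradient for a 2-D embedding, using the standard Student-t kernel w_ij = 1 / (1 + ||y_i - y_j||^2). Everything here is an assumption made for illustration: the names `Cell`, `repulsion`, `barnes_hut_repulsion` and `exact_repulsion`, the summary criterion (cell width divided by distance to the cell's center of mass, compared against `theta`), and the premise that all points are distinct.

```python
import random


class Cell:
    """Square quadtree cell that tracks its point count and coordinate sums."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # center and half-width
        self.n = 0                # number of points inside the cell
        self.sx = self.sy = 0.0   # coordinate sums; center of mass = sum / n
        self.point = None         # the single point stored in a leaf
        self.kids = None          # four children once the cell is split

    def insert(self, p):
        self.n += 1
        self.sx += p[0]
        self.sy += p[1]
        if self.kids is None:
            if self.point is None:
                self.point = p
                return
            if self.half < 1e-9:  # (near-)duplicate points: stop splitting
                return
            h = self.half / 2
            self.kids = [Cell(self.cx + dx * h, self.cy + dy * h, h)
                         for dx in (-1, 1) for dy in (-1, 1)]
            old, self.point = self.point, None
            self._child(old).insert(old)
        self._child(p).insert(p)

    def _child(self, p):
        # Index matches the (dx, dy) order used when creating the children.
        return self.kids[2 * (p[0] >= self.cx) + (p[1] >= self.cy)]


def repulsion(cell, p, theta):
    """Return (fx, fy, z): unnormalised repulsion on p and its share of Z."""
    if cell.n == 0 or (cell.kids is None and cell.point is p):
        return 0.0, 0.0, 0.0
    dx, dy = p[0] - cell.sx / cell.n, p[1] - cell.sy / cell.n
    d2 = dx * dx + dy * dy
    # Assumed criterion: summarise a cell if it is a leaf or if its width
    # is small relative to its distance from p (width / distance < theta).
    if cell.kids is None or (2 * cell.half) ** 2 < theta * theta * d2:
        w = 1.0 / (1.0 + d2)      # Student-t kernel
        return cell.n * w * w * dx, cell.n * w * w * dy, cell.n * w
    fx = fy = z = 0.0
    for kid in cell.kids:
        kfx, kfy, kz = repulsion(kid, p, theta)
        fx, fy, z = fx + kfx, fy + kfy, z + kz
    return fx, fy, z


def barnes_hut_repulsion(points, theta=0.5):
    """Approximate (1/Z) * sum_j w_ij^2 (y_i - y_j) for every point i."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = Cell((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2, half)
    for p in points:
        root.insert(p)
    raw = [repulsion(root, p, theta) for p in points]
    z_total = sum(r[2] for r in raw)
    return [(fx / z_total, fy / z_total) for fx, fy, _ in raw]


def exact_repulsion(points):
    """Brute-force O(N^2) reference for the same quantity."""
    raw, z_total = [], 0.0
    for p in points:
        fx = fy = 0.0
        for q in points:
            if q is not p:
                dx, dy = p[0] - q[0], p[1] - q[1]
                w = 1.0 / (1.0 + dx * dx + dy * dy)
                fx, fy, z_total = fx + w * w * dx, fy + w * w * dy, z_total + w
        raw.append((fx, fy))
    return [(fx / z_total, fy / z_total) for fx, fy in raw]


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
    approx, exact = barnes_hut_repulsion(pts, theta=0.5), exact_repulsion(pts)
    print(max(abs(a - b) for u, v in zip(approx, exact) for a, b in zip(u, v)))
```

In this sketch, `theta = 0` never summarises an internal cell, so the traversal reduces to the exact sum. Larger values trade accuracy for fewer visited cells. The attractive part of the gradient, which depends on the sparse input similarities, is not covered here.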






    Published In

    The Journal of Machine Learning Research, Volume 15, Issue 1
    January 2014
    4085 pages




    Author Tags

    1. Barnes-Hut algorithm
    2. dual-tree algorithm
    3. embedding
    4. multidimensional scaling
    5. space-partitioning trees
    6. t-SNE


    Article Metrics

    • Downloads (last 12 months): 241
    • Downloads (last 6 weeks): 20
    Reflects downloads up to 08 Feb 2025


    Cited By

    • (2024) Trusted re-weighting for label distribution learning. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence. 10.5555/3702676.3702875 (4237-4249). Online publication date: 15-Jul-2024
    • (2024) LangCell. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3694600 (61159-61185). Online publication date: 21-Jul-2024
    • (2024) MOMENT. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3692712 (16115-16152). Online publication date: 21-Jul-2024
    • (2024) Envisioning outlier exposure by large language models for out-of-distribution detection. Proceedings of the 41st International Conference on Machine Learning. 10.5555/3692070.3692289 (5629-5659). Online publication date: 21-Jul-2024
    • (2024) What factors distinguish overlapping Data job postings? Towards ML-based models for job category’s factors prediction. Intelligent Decision Technologies. 10.3233/IDT-240509, 18:3 (2161-2176). Online publication date: 16-Sep-2024
    • (2024) Learning hierarchical embedding space for image-text matching. Intelligent Data Analysis. 10.3233/IDA-230214, 28:3 (647-665). Online publication date: 1-Jan-2024
    • (2024) Practical privacy-preserving MLaaS. Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence. 10.1609/aaai.v38i14.29476 (15502-15510). Online publication date: 20-Feb-2024
    • (2024) Uncovering and Mitigating the Impact of Code Obfuscation on Dataset Annotation with Antivirus Engines. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 10.1145/3650212.3680302 (553-565). Online publication date: 11-Sep-2024
    • (2024) Exploring Layerwise Adversarial Robustness Through the Lens of t-SNE. Proceedings of the Genetic and Evolutionary Computation Conference Companion. 10.1145/3638530.3654258 (619-622). Online publication date: 14-Jul-2024
    • (2024) Accelerating Hyperbolic t-SNE. IEEE Transactions on Visualization and Computer Graphics. 10.1109/TVCG.2024.3364841, 30:7 (4403-4415). Online publication date: 1-Jul-2024
