DOI: 10.5555/2886521.2886720
Article

The boundary forest algorithm for online supervised and unsupervised learning

Published: 25 January 2015

Abstract

We describe a new instance-based learning algorithm, the Boundary Forest (BF) algorithm, which can be used for both supervised and unsupervised learning. The algorithm builds a forest of trees whose nodes store previously seen examples. It can be shown data points one at a time and updates itself incrementally, so it is naturally online. Few instance-based algorithms have this property while also being fast, as the BF is; this is crucial for applications that must respond to input data in real time. The number of children of each node is not set beforehand but emerges from the training procedure, which makes the algorithm very flexible with regard to the data manifolds it can learn. We test its generalization performance and speed on a range of benchmark datasets and detail the settings in which it outperforms the state of the art. Empirically we find that training time scales as O(DN log N) and testing as O(D log N), where D is the dimensionality of the data and N the number of training examples.
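For a concrete picture of the structure the abstract describes, the Python sketch below illustrates one plausible reading, restricted to classification: each tree node stores a training example, a query descends greedily toward the closest stored example, and an example that the tree would mislabel is kept as a new child of the node the query ended at. This is a minimal illustration based only on the high-level description above, not the authors' reference implementation; the names BoundaryTree, BoundaryForest, and n_trees are assumptions introduced here.

```python
import numpy as np
from collections import Counter


class BoundaryTree:
    """One boundary tree: every node stores a training example and its label."""

    def __init__(self, x, y):
        self.points = [np.asarray(x, dtype=float)]   # node 0 is the root
        self.labels = [y]
        self.children = [[]]                          # child indices per node

    def _traverse(self, x):
        # Greedy descent: move to whichever of the current node and its children
        # is closest to the query, stopping when the current node itself wins.
        x = np.asarray(x, dtype=float)
        current = 0
        while True:
            candidates = [current] + self.children[current]
            best = min(candidates, key=lambda i: np.linalg.norm(self.points[i] - x))
            if best == current:
                return current
            current = best

    def query(self, x):
        return self.labels[self._traverse(x)]

    def train(self, x, y):
        # Keep the new example only if the tree would currently mislabel it,
        # attaching it as a child of the node the traversal reached.
        node = self._traverse(x)
        if self.labels[node] != y:
            self.points.append(np.asarray(x, dtype=float))
            self.labels.append(y)
            self.children.append([])
            self.children[node].append(len(self.points) - 1)


class BoundaryForest:
    """A small forest of boundary trees; predictions are a majority vote."""

    def __init__(self, n_trees=10):
        self.n_trees = n_trees
        self.trees = []

    def add(self, x, y):
        # Fully online: the first n_trees examples seed the tree roots, and every
        # example is offered to all existing trees.
        if len(self.trees) < self.n_trees:
            self.trees.append(BoundaryTree(x, y))
        for tree in self.trees:
            tree.train(x, y)

    def predict(self, x):
        votes = Counter(tree.query(x) for tree in self.trees)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy usage: two Gaussian blobs streamed one example at a time.
    rng = np.random.default_rng(0)
    forest = BoundaryForest(n_trees=5)
    for _ in range(200):
        label = int(rng.random() < 0.5)
        forest.add(rng.normal(loc=3.0 * label, scale=1.0, size=2), label)
    print(forest.predict(np.zeros(2)), forest.predict(np.full(2, 3.0)))
```

In a streaming setting one would call forest.add(x, y) as each labelled example arrives and forest.predict(x) at any time; because a tree in this sketch retains only the examples it would have mislabeled, it typically stores far fewer nodes than the number of examples it has seen.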


Cited By

  • (2018) Q-learning with nearest neighbors. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, 3115-3125. DOI: 10.5555/3327144.3327233. Online publication date: 3-Dec-2018.
  • (2018) Alternating optimization of decision trees, with application to learning sparse oblique trees. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, 1219-1229. DOI: 10.5555/3326943.3327055. Online publication date: 3-Dec-2018.
  • (2018) Sharing Deep Neural Network Models with Interpretation. In Proceedings of the 2018 World Wide Web Conference, 177-186. DOI: 10.1145/3178876.3185995. Online publication date: 10-Apr-2018.


    Published In

    AAAI'15: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence
    January 2015
    4331 pages
    ISBN:0262511290

    Sponsors

    • Association for the Advancement of Artificial Intelligence

    Publisher

    AAAI Press


    Qualifiers

    • Article
