Streaming active learning with deep neural networks
Article No.: 1245, Pages 30005–30021
Abstract
Active learning is perhaps most naturally posed as an online learning problem. However, prior active learning approaches with deep neural networks assume offline access to the entire dataset ahead of time. This paper proposes VeSSAL, a new algorithm for batch active learning with deep neural networks in streaming settings, which samples groups of points to query for labels at the moment they are encountered. Our approach trades off between uncertainty and diversity of queried samples to match a desired query rate without requiring any hand-tuned hyperparameters. Altogether, we expand the applicability of deep neural networks to realistic active learning scenarios, such as applications relevant to HCI and large, fractured datasets.
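The paper itself specifies the exact VeSSAL procedure; purely as an illustrative sketch of the idea described above, the Python below scores each arriving point by how poorly its embedding (e.g., a last-layer gradient embedding) is covered by the points selected so far, and accepts it with a probability normalized to hit a target query rate. This is an assumption-laden reconstruction, not the authors' implementation: the class name StreamingSampler, the parameter query_fraction, and the regularizer reg are all illustrative.

# A minimal sketch (not the authors' released code) of a streaming sampler
# that trades off uncertainty and diversity while matching a query rate.
# All names here are hypothetical.
import numpy as np

class StreamingSampler:
    def __init__(self, dim: int, query_fraction: float, reg: float = 1.0):
        self.q = query_fraction           # desired fraction of the stream to label
        self.cov_inv = np.eye(dim) / reg  # inverse of a regularized covariance of selected embeddings
        self.score_mean = 1.0             # running mean of scores, normalizes the acceptance rate
        self.n_seen = 0

    def observe(self, g: np.ndarray) -> bool:
        """Decide whether to query the label of the point with embedding g."""
        # Uncertainty/diversity score: large when g points in a direction
        # not yet covered by previously selected embeddings.
        s = float(g @ self.cov_inv @ g)

        # Track the typical score so acceptances average out to a q-fraction
        # of the stream, with no hand-tuned threshold.
        self.n_seen += 1
        self.score_mean += (s - self.score_mean) / self.n_seen

        p = min(1.0, self.q * s / max(self.score_mean, 1e-12))
        if np.random.rand() >= p:
            return False

        # Selected: rank-one update of the inverse covariance
        # (Sherman-Morrison, a special case of the Woodbury identity).
        u = self.cov_inv @ g
        self.cov_inv -= np.outer(u, u) / (1.0 + float(g @ u))
        return True

# Example: label roughly 10% of a stream of 512-dimensional embeddings.
sampler = StreamingSampler(dim=512, query_fraction=0.1)
for g in np.random.randn(1000, 512):
    if sampler.observe(g):
        pass  # send the underlying point to an annotator

With query_fraction = 0.1, roughly one in ten streamed points is selected on average, and points whose embeddings are already well covered by prior selections receive proportionally lower acceptance probabilities.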
Publication Information
Published: 23 July 2023
Publisher: JMLR.org
Proceedings volume: July 2023, 43479 pages
Copyright © 2023