research-article

Active learning in multimedia annotation and retrieval: A survey

Authors:

Xian-Sheng Hua

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 2, Issue 2

Article No.: 10, Pages 1 - 21

https://doi.org/10.1145/1899412.1899414

Published: 24 February 2011

Abstract

Active learning is a machine learning technique that selects the most informative samples for labeling and uses them as training data. It has been widely explored in the multimedia research community for its ability to reduce human annotation effort. In this article, we survey efforts to leverage active learning in multimedia annotation and retrieval. We focus mainly on two application domains: image/video annotation and content-based image retrieval. We first briefly introduce the principle of active learning and then analyze the sample selection criteria. We categorize the existing sample selection strategies used in multimedia annotation and retrieval according to five criteria: risk reduction, uncertainty, diversity, density, and relevance. We then introduce several classification models used in active learning-based multimedia annotation and retrieval, including semi-supervised learning, multilabel learning, and multiple-instance learning. We also discuss several future trends in this research direction, in particular cost analysis of human annotation and large-scale interactive multimedia annotation.
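To make the principle in the abstract concrete, the following is a minimal sketch of a pool-based active learning loop that uses only the uncertainty criterion, one of the five criteria listed above. It is not taken from the survey: the one-dimensional data, the nearest-centroid classifier, the sigmoid score, and all function names (`fit_centroids`, `prob_positive`, `active_learning`, `demo`) are assumptions chosen for illustration.

```python
import math


def fit_centroids(labeled):
    """Return the mean feature value per class from (x, y) pairs, y in {0, 1}.

    Assumes the labeled set contains at least one sample of each class.
    """
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, y in labeled:
        sums[y] += x
        counts[y] += 1
    return {c: sums[c] / counts[c] for c in (0, 1)}


def prob_positive(x, centroids):
    """Illustrative soft score in (0, 1): sigmoid of the distance difference."""
    margin = abs(x - centroids[0]) - abs(x - centroids[1])
    return 1.0 / (1.0 + math.exp(-margin))


def active_learning(unlabeled, oracle, seed_labeled, rounds):
    """Repeatedly query the sample whose score is closest to 0.5."""
    labeled = list(seed_labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        centroids = fit_centroids(labeled)
        # Uncertainty criterion: the most ambiguous sample is queried next.
        query = min(pool, key=lambda x: abs(prob_positive(x, centroids) - 0.5))
        pool.remove(query)
        # The oracle stands in for the human annotator.
        labeled.append((query, oracle(query)))
    return fit_centroids(labeled), labeled


def demo():
    """Run three query rounds on a tiny hand-made pool."""
    seed = [(-3.0, 0), (3.0, 1)]
    pool = [-2.0, -1.0, -0.2, 0.1, 0.9, 2.0]
    _, labeled = active_learning(pool, lambda x: int(x > 0.0), seed, rounds=3)
    # Return only the queried feature values, in query order.
    return [x for x, _ in labeled[len(seed):]]
```

In this toy setting the queries concentrate near the current decision boundary, which is the behavior the uncertainty criterion is meant to produce. The other criteria named in the abstract (risk reduction, diversity, density, relevance) would replace or complement the `key` function used to pick the query.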

Index Terms

Active learning in multimedia annotation and retrieval: A survey
1. Information systems
  1. Information retrieval
    1. Document representation

Published In

ACM Transactions on Intelligent Systems and Technology Volume 2, Issue 2

February 2011

175 pages

ISSN:2157-6904

EISSN:2157-6912

DOI:10.1145/1899412

Copyright © 2011 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 February 2011

Accepted: 01 August 2010

Revised: 01 June 2010

Received: 01 February 2010

Published in TIST Volume 2, Issue 2

Qualifiers

Research-article
Survey
Refereed
