
Annotation propagation in image databases using similarity graphs

Published: 27 December 2013

Abstract

The practicality of large-scale image indexing and querying methods depends crucially upon the availability of semantic information. The manual tagging of images with semantic information is in general very labor intensive, and existing methods for automated image annotation may not always yield accurate results. The aim of this paper is to reduce to a minimum the amount of human intervention required in the semantic annotation of images, while preserving a high degree of accuracy. Ideally, only one copy of each object of interest would be labeled manually, and the labels would then be propagated automatically to all other occurrences of the objects in the database. To this end, we propose an influence propagation strategy, SW-KProp, that requires no human intervention beyond the initial labeling of a subset of the images. SW-KProp distributes semantic information within a similarity graph defined on all images in the database: each image iteratively transmits its current label information to its neighbors, and then readjusts its own label according to the combined influences of its neighbors. SW-KProp influence propagation can be efficiently performed by means of matrix computations, provided that pairwise similarities of images are available. We also propose a variant of SW-KProp which enhances the quality of the similarity graph by selecting a reduced feature set for each prelabeled image and rebuilding its neighborhood. The performances of the SW-KProp method and its variant were evaluated against several competing methods on classification tasks for three image datasets: a handwritten digit dataset, a face dataset and a web image dataset. For the digit images, SW-KProp and its variant performed consistently better than the other methods tested. For the face and web images, SW-KProp outperformed its competitors for the case when the number of prelabeled images was relatively small. The performance was seen to improve significantly when the feature selection strategy was applied.
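
The iterative scheme described in the abstract (each image passes its label information to its neighbors, then readjusts its own label from their combined influence) can be illustrated with a minimal sketch. This is a generic damped label propagation update on a similarity graph, not SW-KProp itself, whose exact update rule is not given here. The function name `propagate_labels`, the damping factor `alpha`, the row normalisation, and the toy six-image graph are all assumptions made for illustration.

```python
def propagate_labels(similarity, seeds, num_classes, alpha=0.8, iterations=50):
    """Generic label propagation on a similarity graph (illustrative only).

    similarity:  symmetric n x n list of lists of non-negative weights.
    seeds:       dict mapping image index -> class index (prelabeled images).
    num_classes: number of distinct labels.
    alpha:       assumed damping factor; (1 - alpha) pulls scores back to the seeds.
    Returns a list with one predicted class index per image.
    """
    n = len(similarity)

    # Row-normalise the graph (ignoring self-similarity) so that the
    # influences arriving at each image sum to 1.
    weights = []
    for i in range(n):
        row = [similarity[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        weights.append([w / total if total > 0 else 0.0 for w in row])

    # Initial scores: one-hot rows for prelabeled images, zeros elsewhere.
    initial = [[0.0] * num_classes for _ in range(n)]
    for node, label in seeds.items():
        initial[node][label] = 1.0

    scores = [row[:] for row in initial]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            new_row = []
            for c in range(num_classes):
                # Combined influence of all neighbors for class c.
                influence = sum(weights[i][j] * scores[j][c] for j in range(n))
                new_row.append(alpha * influence + (1 - alpha) * initial[i][c])
            updated.append(new_row)
        scores = updated

    # Each image takes the class with the highest accumulated score.
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Toy graph: two tight groups {0, 1, 2} and {3, 4, 5} joined by one weak edge.
similarity = [[0.0] * 6 for _ in range(6)]
for group in ([0, 1, 2], [3, 4, 5]):
    for a in group:
        for b in group:
            if a != b:
                similarity[a][b] = 1.0
similarity[2][3] = similarity[3][2] = 0.1

# Only one image per group is labeled by hand.
labels = propagate_labels(similarity, seeds={0: 0, 5: 1}, num_classes=2)
print(labels)
```

In this toy setting a single hand-labeled image per group is enough for the remaining images to inherit the label of their own group, since the weak cross-group edge carries little influence. In practice the per-class loops would be replaced by matrix operations on a sparse neighborhood graph, in line with the abstract's remark that the propagation reduces to matrix computations.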


    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 10, Issue 1
    December 2013
    166 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/2559928

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 December 2013
    Accepted: 01 May 2013
    Revised: 01 November 2012
    Received: 01 August 2012
    Published in TOMM Volume 10, Issue 1

    Author Tags

    1. classification
    2. feature selection
    3. image annotation
    4. iterative method
    5. linear system
    6. neighborhood

    Qualifiers

    • Research-article
    • Research
    • Refereed
