DOI: 10.1145/1459359.1459386
research-article

Boosting image retrieval through aggregating search results based on visual annotations

Published: 26 October 2008

Abstract

Online photo sharing systems, such as Flickr and Picasa, provide a valuable source of human-annotated photos. Textual annotations describe not only the visual content of an image but also its subjective, spatial, temporal and social dimensions, which complicates keyword-based search. In this paper we investigate a method that exploits visual annotations, e.g. notes in Flickr, to enhance the retrieval performance of keyword-based systems. For this purpose we adopt the bag-of-visual-words approach for content-based image retrieval as our baseline. We then apply rank aggregation of the top 25 results obtained with a set of visual annotations that match the keyword-based query. Retrieval experiments show significant improvements in retrieval performance when comparing the aggregated approach with our baseline, which also slightly outperforms text-only search. When a textual filter on the search space is combined with the aggregated approach, an additional boost in retrieval performance is observed, which underlines the need for large-scale content-based image retrieval techniques to complement text-based search.
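
The abstract does not state which aggregation rule is used, so the following is only a minimal sketch of how top-25 result lists could be fused. It assumes a Borda-style count, a classical positional voting rule, as the aggregation method. The function name `borda_aggregate`, the tie-breaking rule, and the image identifiers are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_aggregate(rankings: Sequence[Sequence[str]], k: int = 25) -> List[str]:
    """Fuse several ranked result lists with a Borda-style count.

    Assumption: each inner list holds the results retrieved for one visual
    annotation, best match first. Only the top k entries of each list vote.
    """
    scores: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        for position, image_id in enumerate(ranking[:k]):
            # The first result earns k points, the k-th earns 1 point;
            # results below the cut-off earn nothing.
            scores[image_id] += k - position
    # Highest total score first; ties are broken by identifier so the
    # output is deterministic (an illustrative choice, not from the paper).
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Hypothetical result lists for three visual annotations matching one query.
results_per_annotation = [
    ["img3", "img1", "img7"],
    ["img1", "img3", "img9"],
    ["img1", "img7", "img2"],
]
fused = borda_aggregate(results_per_annotation, k=3)
print(fused)
```

Images that rank highly for several annotations accumulate points and rise in the fused list, while an image retrieved for only one annotation stays near the bottom.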


    Published In

    MM '08: Proceedings of the 16th ACM international conference on Multimedia
    October 2008
    1206 pages
    ISBN:9781605583037
    DOI:10.1145/1459359
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. image retrieval
    2. rank aggregation
    3. visual annotations

    Qualifiers

    • Research-article

    Conference

    MM08
    MM08: ACM Multimedia Conference 2008
    October 26 - 31, 2008
    Vancouver, British Columbia, Canada

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

