DOI: 10.1145/1873951.1874332
short-paper

Towards a universal detector by mining concepts with small semantic gaps

Published: 25 October 2010

    Abstract

    Can we build a universal detector that recognizes unseen objects when no training exemplars are available? Such a detector is highly desirable: human vocabulary contains hundreds of thousands of object concepts, but few labeled image examples are available. In this study, we attempt to build such a universal detector to predict concepts in the absence of training data. First, by considering both semantic relatedness and visual variance, we mine a set of realistic small-semantic-gap (SSG) concepts from a large-scale image corpus. Detectors for these concepts deliver reasonably satisfactory recognition accuracy. Building on these distinctive visual models, we then leverage semantic ontology knowledge and concept co-occurrence statistics to extend visual recognition to unseen concepts. To the best of our knowledge, this is the first work that attempts to substantiate semantic-gap measurement for a large number of concepts and to leverage visually learnable concepts to predict those with no training images available. Experiments on the NUS-WIDE dataset demonstrate that the selected concepts with small semantic gaps can be modeled well, and that the prediction of unseen concepts delivers promising results, with accuracy comparable to preliminary training-based methods.
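
    The abstract does not give the transfer rule itself, so the following is only a minimal sketch of one plausible reading of the last step: the score for an unseen concept is taken as a relatedness-weighted average of the scores produced by detectors for the learnable (SSG) concepts. The function name `predict_unseen`, the weighted-average rule, and all numbers are illustrative assumptions, not the paper's method; in practice the relatedness weights might come from an ontology-based similarity or from tag co-occurrence statistics.

```python
from typing import Dict


def predict_unseen(detector_scores: Dict[str, float],
                   relatedness: Dict[str, float]) -> float:
    """Hypothetical transfer rule: relatedness-weighted average of the
    scores of known-concept detectors on a single image.

    detector_scores: score per learnable concept for one image (assumed in [0, 1]).
    relatedness: non-negative weight linking each learnable concept to the
                 unseen concept (assumed; missing entries count as 0).
    """
    total_weight = sum(relatedness.get(c, 0.0) for c in detector_scores)
    if total_weight == 0.0:
        # No related learnable concept: no evidence for the unseen concept.
        return 0.0
    weighted = sum(score * relatedness.get(c, 0.0)
                   for c, score in detector_scores.items())
    return weighted / total_weight


# Illustrative values only: scores of two learnable concepts on one image,
# and their assumed relatedness to an unseen concept.
scores = {"sky": 0.9, "grass": 0.2}
weights = {"sky": 0.75, "grass": 0.25}
unseen_score = predict_unseen(scores, weights)
```

    Normalizing by the total weight keeps the predicted score in the same range as the detector outputs, which makes scores for different unseen concepts roughly comparable under this assumed rule.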


    Cited By

    • (2014) Multimedia event detection with multimodal feature fusion and temporal concept localization. Machine Vision and Applications 25:1 (49-69). DOI: 10.1007/s00138-013-0525-x. Online publication date: 1-Jan-2014
    • (2012) Short communication. Expert Systems with Applications: An International Journal 39:12 (11312-11320). DOI: 10.1016/j.eswa.2012.03.012. Online publication date: 1-Sep-2012
    • (2011) Known-item video search via query-to-modality mapping. Proceedings of the 19th ACM international conference on Multimedia (1133-1136). DOI: 10.1145/2072298.2071957. Online publication date: 28-Nov-2011

      Published In

      MM '10: Proceedings of the 18th ACM international conference on Multimedia
      October 2010
      1836 pages
      ISBN: 9781605589336
      DOI: 10.1145/1873951
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. concept detection
      2. concept transfer
      3. semantic gap

      Qualifiers

      • Short-paper

      Conference

      MM '10: ACM Multimedia Conference
      October 25 - 29, 2010
      Firenze, Italy

      Acceptance Rates

      Overall Acceptance Rate 995 of 4,171 submissions, 24%
