DOI: 10.1145/2393347.2396387
poster

What is happening: annotating images with verbs

Published: 29 October 2012

Abstract

Image annotation has been widely investigated as a way to discover the semantics of an image. However, most existing algorithms focus on noun tags (e.g. concepts and objects). Since an image is a snapshot of a real-world event, annotating images with verbs enables a richer understanding of an image. In this paper, we propose a data-driven approach to verb-oriented image annotation. First, we obtain verb candidates by generating search queries for a given image from its initial noun tags and building a sentence corpus from the results of those queries. We use visualness to filter out tags that are not visually presentable (e.g. pain), and we divide the remaining tags into two categories (scene based and object based) so that linguistic rules can be imposed during verb extraction. We then re-rank the candidate verbs using the tag context discovered from images in the MIRFlickr dataset that are both semantically and visually similar to the given image. Experimental results from a user study demonstrate that the proposed approach is promising.
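The three stages named in the abstract (visualness filtering, verb candidate extraction from a sentence corpus, and context-based re-ranking) can be illustrated with a toy sketch. This is not the authors' implementation: the visualness scores, the verb lexicon, the tiny corpus, the threshold, and the function names `filter_visual_tags`, `candidate_verbs`, and `rerank` are all hypothetical. The sketch also replaces the paper's linguistic rules and its scene/object distinction with a plain lexicon lookup, and it stands in a hard-coded tag list for the tags of similar MIRFlickr images.

```python
from collections import Counter


def filter_visual_tags(tags, visualness, threshold=0.5):
    """Keep tags whose (assumed) visualness score meets the threshold."""
    return [t for t in tags if visualness.get(t, 0.0) >= threshold]


def candidate_verbs(tags, corpus, verb_lexicon):
    """Count lexicon verbs in corpus sentences that mention at least one tag."""
    counts = Counter()
    tag_set = set(tags)
    for sentence in corpus:
        words = sentence.lower().split()
        if tag_set.intersection(words):
            counts.update(w for w in words if w in verb_lexicon)
    return counts


def rerank(verb_counts, neighbor_tags, weight=1.0):
    """Boost verbs that also occur among the tags of similar images."""
    context = Counter(neighbor_tags)
    scored = {v: c + weight * context[v] for v, c in verb_counts.items()}
    # Highest score first; ties are broken alphabetically.
    return sorted(scored, key=lambda v: (-scored[v], v))


# Hypothetical inputs, invented purely for illustration.
visualness = {"dog": 0.9, "beach": 0.8, "pain": 0.1}
corpus = [
    "a dog running on the beach",
    "dog playing fetch",
    "the dog running home",
    "pain hurts",
]
verb_lexicon = {"running", "playing", "hurts"}
neighbor_tags = ["playing", "playing", "playing", "sand"]

visual_tags = filter_visual_tags(["dog", "beach", "pain"], visualness)
verbs = candidate_verbs(visual_tags, corpus, verb_lexicon)
ranked = rerank(verbs, neighbor_tags)
print(visual_tags, dict(verbs), ranked)
```

In this toy run, "pain" is dropped by the visualness filter, so the sentence that mentions only "pain" contributes no verb candidates. "running" is the most frequent candidate in the corpus, but "playing" moves to the top after re-ranking because it appears repeatedly among the neighbor tags.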


Cited By

  • (2016) What Is Happening in the Video? - Annotate Video by Sentence. IEEE Transactions on Circuits and Systems for Video Technology, 26(9):1746-1757. DOI: 10.1109/TCSVT.2015.2475815. Online publication date: 1-Sep-2016


    Published In

    MM '12: Proceedings of the 20th ACM international conference on Multimedia
    October 2012
    1584 pages
    ISBN:9781450310895
    DOI:10.1145/2393347
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. image annotation
    2. re-ranking
    3. semantic similarity
    4. verb tag
    5. visual similarity
    6. visualness

    Qualifiers

    • Poster

    Conference

    MM '12
    MM '12: ACM Multimedia Conference
    October 29 - November 2, 2012
    Nara, Japan

    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Article Metrics

    • Downloads (Last 12 months): 1
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 26 Sep 2024.
