DOI: 10.1145/1631135.1631137

Visual tag dictionary: interpreting tags with visual words

Published: 23 October 2009

Abstract

Image representations based on visual words have proven effective in a wide range of applications, including categorization, annotation, and search. By detecting keypoints in images and treating their patterns as visual words, an image can be represented as a bag of visual words, analogous to the bag-of-words representation of text documents. In this paper, we introduce a corpus named the visual tag dictionary. Unlike conventional dictionaries, which define terms with textual words, the visual tag dictionary interprets each tag with visual words. The dictionary is constructed fully automatically by exploring tagged image data on the Internet. With this dictionary, tags and images are connected via visual words, which facilitates many applications. As examples, we empirically demonstrate the effectiveness of the dictionary in tag-based image search, tag ranking, and image annotation.
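
The bag-of-visual-words idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the dictionary is built, so everything below is an assumption made for illustration: the two-dimensional toy descriptors, the fixed `codebook`, the helper names (`nearest_word`, `bag_of_visual_words`, `tag_dictionary`), and in particular the choice to describe each tag by the average histogram of the images carrying it. It is not the paper's construction.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def nearest_word(descriptor, codebook):
    """Return the index of the codebook centre closest to a descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_visual_words(descriptors, codebook):
    """Return a normalised histogram of visual-word counts for one image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def tag_dictionary(tagged_images, codebook):
    """Assumed scheme: describe each tag by the mean histogram of its images."""
    sums = defaultdict(lambda: [0.0] * len(codebook))
    counts = defaultdict(int)
    for descriptors, tags in tagged_images:
        hist = bag_of_visual_words(descriptors, codebook)
        for tag in tags:
            sums[tag] = [s + h for s, h in zip(sums[tag], hist)]
            counts[tag] += 1
    return {tag: [s / counts[tag] for s in sums[tag]] for tag in sums}


# Hypothetical toy data: two visual words and two tagged images.
codebook = [(0.0, 0.0), (10.0, 10.0)]
images = [
    ([(0.1, 0.2), (9.0, 9.5)], ["beach"]),
    ([(0.0, 1.0), (1.0, 0.0)], ["beach", "sky"]),
]
dictionary = tag_dictionary(images, codebook)
```

In practice the descriptors would come from a keypoint detector and the codebook from clustering a large descriptor set; both are fixed by hand here so the sketch stays self-contained.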

    Published In

    WSMC '09: Proceedings of the 1st workshop on Web-scale multimedia corpus
    October 2009
    56 pages
    ISBN:9781605587615
    DOI:10.1145/1631135

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Flickr
    2. image annotation
    3. image search
    4. tag

    Qualifiers

    • Research-article

    Conference

    MM09: ACM Multimedia Conference
    October 23, 2009
    Beijing, China

    Acceptance Rates

    WSMC '09 Paper Acceptance Rate 12 of 20 submissions, 60%;
    Overall Acceptance Rate 12 of 20 submissions, 60%

    Cited By

    • (2023) A deep learning approach for image and text classification using neutrosophy. International Journal of Information Technology 16:2 (853-859). DOI: 10.1007/s41870-023-01529-8. Online publication date: 13-Oct-2023
    • (2020) WordNet and Wiki Based Approach for Finding Polysemy Tags in a Tag Set. 2020 International Conference on Intelligent Systems and Computer Vision (ISCV) (1-8). DOI: 10.1109/ISCV49265.2020.9204288. Online publication date: Jun-2020
    • (2016) Part-based clothing image annotation by visual neighbor retrieval. Neurocomputing 213:C (115-124). DOI: 10.1016/j.neucom.2015.12.141. Online publication date: 12-Nov-2016
    • (2015) Creating descriptive visual words for tag ranking of compressed social image. 2015 IEEE International Conference on Image Processing (ICIP) (3901-3905). DOI: 10.1109/ICIP.2015.7351536. Online publication date: Sep-2015
    • (2015) Social images tag ranking based on visual words in compressed domain. Neurocomputing 153 (278-285). DOI: 10.1016/j.neucom.2014.11.027. Online publication date: Apr-2015
    • (2014) Creating the Bag-of-Words with Spatial Context Information for Image Retrieval. Applied Mechanics and Materials 556-562 (4788-4791). DOI: 10.4028/www.scientific.net/AMM.556-562.4788. Online publication date: May-2014
    • (2014) Social Image Tagging With Diverse Semantics. IEEE Transactions on Cybernetics 44:12 (2493-2508). DOI: 10.1109/TCYB.2014.2309593. Online publication date: Dec-2014
    • (2014) Efficient multi-nodes image retrieval method based on visual words. 2014 IEEE International Conference on Progress in Informatics and Computing (200-203). DOI: 10.1109/PIC.2014.6972324. Online publication date: May-2014
    • (2013) Picture tags and world knowledge. Proceedings of the 21st ACM international conference on Multimedia (967-976). DOI: 10.1145/2502081.2502113. Online publication date: 21-Oct-2013
    • (2013) Multimedia encyclopedia construction by mining web knowledge. Signal Processing 93:8 (2361-2368). DOI: 10.1016/j.sigpro.2012.06.028. Online publication date: 1-Aug-2013
