research-article

Towards indexing representative images on the web

Authors:

Yong Rui

MM '12: Proceedings of the 20th ACM international conference on Multimedia

Pages 1229 - 1238

https://doi.org/10.1145/2393347.2396423

Published: 29 October 2012

Abstract

Even after 20 years of research on real-world image retrieval, there is still a big gap between what search engines can provide and what users expect to see. To bridge this gap, we present an image knowledge base, ImageKB, a graph representation of structured entities, categories, and representative images, as a new basis for practical image indexing and search. ImageKB is constructed automatically via a scalable approach, both bottom-up and top-down, that efficiently matches 2 billion web images onto an ontology with millions of nodes. Our approach consists of four steps: identifying duplicate image clusters from billions of images, obtaining a candidate set of entities and their images, discovering definitive texts to represent an image, and identifying representative images for an entity. To date, ImageKB contains 235.3M representative images corresponding to 0.52M entities, much larger than the state-of-the-art alternative, ImageNet, which contains 14.2M images for 0.02M synsets. Compared to existing image databases, ImageKB reflects the distributions of both images on the web and users' interests, contains rich semantic descriptions for images and entities, and can be widely used for both text-to-image search and image-to-text understanding.
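To make the "graph representation of structured entities, categories, and representative images" concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation: the class names (`Entity`, `ToyImageKB`), the method names, and the per-image relevance score are all hypothetical, introduced only to show how entities could link to categories on one side and to ranked images on the other.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A node for one entity (hypothetical structure, not from the paper)."""
    name: str
    categories: set = field(default_factory=set)
    # (image_id, score) pairs; the score is an assumed relevance measure.
    images: list = field(default_factory=list)


class ToyImageKB:
    """Toy entity/category/image graph held in plain dictionaries."""

    def __init__(self):
        self.entities = {}        # entity name -> Entity
        self.category_index = {}  # category name -> set of entity names

    def add_entity(self, name, categories):
        # Create the entity if needed and link it to each category.
        entity = self.entities.setdefault(name, Entity(name))
        for category in categories:
            entity.categories.add(category)
            self.category_index.setdefault(category, set()).add(name)
        return entity

    def add_image(self, entity_name, image_id, score):
        # Attach a candidate image to an existing entity.
        self.entities[entity_name].images.append((image_id, score))

    def representative_images(self, entity_name, k=3):
        # Return the ids of the k highest-scoring images for the entity.
        ranked = sorted(
            self.entities[entity_name].images,
            key=lambda pair: pair[1],
            reverse=True,
        )
        return [image_id for image_id, _ in ranked[:k]]

    def entities_in_category(self, category):
        # Return entity names under a category, sorted for stable output.
        return sorted(self.category_index.get(category, set()))
```

In this sketch, a text-to-image lookup would go from an entity name to `representative_images`, while the category index supports browsing from a category down to its entities. How the actual system scores and selects representative images is not specified in the abstract.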

Index Terms

Towards indexing representative images on the web
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
  2. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases


Published In


MM '12: Proceedings of the 20th ACM international conference on Multimedia

October 2012

1584 pages

ISBN:9781450310895

DOI:10.1145/2393347

General Chairs:
Noboru Babaguchi (Osaka University, Japan),
Kiyoharu Aizawa (The University of Tokyo, Japan),
John Smith (IBM, USA)

Program Chairs:
Shin'ichi Satoh (National Institute of Informatics, Japan),
Thomas Plagemann (University of Oslo, Norway),
Xian-Sheng Hua (Microsoft, USA),
Rong Yan (Facebook, USA)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '12

Sponsor:

SIGMM

MM '12: ACM Multimedia Conference

October 29 - November 2, 2012

Nara, Japan

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
