DOI: 10.1145/2393347.2396423
research-article

Towards indexing representative images on the web

Published: 29 October 2012

Abstract

Even after 20 years of research on real-world image retrieval, there is still a big gap between what search engines can provide and what users expect to see. To bridge this gap, we present an image knowledge base, ImageKB, a graph representation of structured entities, categories, and representative images, as a new basis for practical image indexing and search. ImageKB is constructed automatically by a scalable approach, both bottom-up and top-down, that efficiently matches 2 billion web images onto an ontology with millions of nodes. Our approach consists of four steps: identifying duplicate image clusters from billions of images, obtaining a candidate set of entities and their images, discovering definitive texts to represent an image, and identifying representative images for an entity. To date, ImageKB contains 235.3M representative images corresponding to 0.52M entities, far more than the state-of-the-art alternative, ImageNet, which contains 14.2M images for 0.02M synsets. Compared to existing image databases, ImageKB reflects the distributions of both images on the web and users' interests, contains rich semantic descriptions for images and entities, and can be widely used for both text-to-image search and image-to-text understanding.
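
To make the scale comparison in the abstract concrete, the following is a minimal arithmetic sketch that uses only the four counts quoted above. The constant and function names are illustrative assumptions and do not come from the paper; nothing here reflects how ImageKB itself is built.

```python
# Counts quoted in the abstract (M = million). The names below are
# hypothetical and chosen only for this illustration.
IMAGEKB_IMAGES = 235_300_000
IMAGEKB_ENTITIES = 520_000
IMAGENET_IMAGES = 14_200_000
IMAGENET_SYNSETS = 20_000


def scale_comparison():
    """Return simple ratios derived from the abstract's figures."""
    return {
        # How many times more images ImageKB holds than ImageNet.
        "image_ratio": IMAGEKB_IMAGES / IMAGENET_IMAGES,
        # How many times more entities ImageKB covers than ImageNet has synsets.
        "entity_ratio": IMAGEKB_ENTITIES / IMAGENET_SYNSETS,
        # Average number of images per entity (or per synset).
        "imagekb_images_per_entity": IMAGEKB_IMAGES / IMAGEKB_ENTITIES,
        "imagenet_images_per_synset": IMAGENET_IMAGES / IMAGENET_SYNSETS,
    }


if __name__ == "__main__":
    for name, value in scale_comparison().items():
        print(f"{name}: {value:.1f}")
```

On these figures, ImageKB holds roughly 16.6 times as many images and 26 times as many entities as ImageNet, while averaging about 450 images per entity against ImageNet's 710 per synset. The gain in scale therefore comes mainly from entity coverage rather than from more images per entity.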

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN:9781450310895
DOI:10.1145/2393347
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. image knowledge base
  2. image understanding
  3. large-scale text to image translation

Qualifiers

  • Research-article

Conference

MM '12
MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
