Cross Indexing With Grouplets

Published: 01 November 2015

Abstract

Most current image indexing systems for retrieval treat a database as a set of individual images. This limits the flexibility of the retrieval framework to conduct sophisticated cross-image analysis, resulting in higher memory consumption and sub-optimal retrieval accuracy. To address this issue, we propose cross indexing with grouplets, where the core idea is to view the database images as a set of grouplets, each defined as a group of highly relevant images. Because a grouplet gathers similar images together, the number of grouplets is smaller than the number of images, which naturally lowers the memory cost. Moreover, the definition of a grouplet can be based on customized relations, allowing for the seamless integration of advanced image features and data mining techniques, such as the deep convolutional neural network (DCNN), into off-line indexing. To validate the proposed framework, we construct three types of grouplets, based respectively on local similarity, regional relations, and global semantic modeling. Extensive experiments on public benchmark datasets demonstrate the efficiency and superior performance of our approach.
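To make the indexing idea concrete, below is a minimal Python sketch of a grouplet-style index, written against the abstract alone rather than the paper's actual method. The grouping relation (k-means over global descriptors), the vocabulary size, and all names (build_grouplets, GroupletIndex) are illustrative assumptions; the paper instead defines grouplets through local similarity, regional relations, and global semantic modeling.

# Hypothetical sketch, not the authors' implementation: k-means stands in
# for the paper's grouplet definitions, and all names are made up here.
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

def build_grouplets(features, n_grouplets):
    """Cluster image descriptors so that each cluster is one grouplet."""
    labels = KMeans(n_clusters=n_grouplets, n_init=10).fit_predict(features)
    grouplets = defaultdict(list)
    for image_id, g in enumerate(labels):
        grouplets[g].append(image_id)
    return grouplets  # grouplet id -> member image ids

class GroupletIndex:
    """Inverted index whose posting lists store grouplet ids, not image ids.

    Because several similar images collapse into one grouplet, each visual
    word's posting list is shorter than in a per-image index, which is the
    source of the memory saving claimed in the abstract.
    """
    def __init__(self, grouplets):
        self.grouplets = grouplets
        self.postings = defaultdict(set)  # visual word -> grouplet ids

    def add(self, grouplet_id, visual_words):
        for w in visual_words:
            self.postings[w].add(grouplet_id)

    def query(self, visual_words):
        # Vote for grouplets sharing visual words with the query, then
        # expand each hit back to its member images for final ranking.
        votes = defaultdict(int)
        for w in visual_words:
            for g in self.postings.get(w, ()):
                votes[g] += 1
        ranked = sorted(votes, key=votes.get, reverse=True)
        return [img for g in ranked for img in self.grouplets[g]]

# Example: 1000 images with 128-D descriptors packed into 100 grouplets.
# A grouplet's visual words would be pooled from its members; here they
# are synthetic placeholders.
rng = np.random.default_rng(0)
grouplets = build_grouplets(rng.normal(size=(1000, 128)), n_grouplets=100)
index = GroupletIndex(grouplets)
for g in grouplets:
    index.add(g, visual_words=rng.integers(0, 5000, size=50))
hits = index.query(rng.integers(0, 5000, size=20))

The design point the sketch illustrates is structural: posting lists shrink because near-duplicate images share a single grouplet entry, and cross-image analysis happens once, off-line, when the grouplets are formed.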


Cited By

  • GLAD. In Proceedings of the 25th ACM International Conference on Multimedia, Oct. 2017, pp. 420–428. https://doi.org/10.1145/3123266.3123279
  • One-Shot Fine-Grained Instance Retrieval. In Proceedings of the 25th ACM International Conference on Multimedia, Oct. 2017, pp. 342–350. https://doi.org/10.1145/3123266.3123278


Published In

IEEE Transactions on Multimedia, Volume 17, Issue 11
Nov. 2015
235 pages

Publisher

IEEE Press


Qualifiers

  • Research-article
