Orientation Covariant Aggregation of Local Descriptors with Embeddings

Tolias, Giorgos; Furon, Teddy; Jégou, Hervé

doi:10.1007/978-3-319-10599-4_25

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8694)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Image search systems based on local descriptors typically achieve orientation invariance by aligning the patches on their dominant orientations. Albeit successful, this choice introduces too much invariance because it does not guarantee that the patches are rotated consistently.

This paper introduces an aggregation strategy for local descriptors that achieves the missing covariance property by encoding the orientation angle jointly with the descriptor at the aggregation stage, in a continuous manner. The strategy is combined with an efficient monomial embedding to provide a codebook-free method for aggregating local descriptors into a single vector representation.

Our strategy is also compatible with several popular encoding methods, and we apply it to bag-of-words, VLAD and the Fisher vector in particular. This geometry-aware aggregation strategy is effective for image search, as shown by experiments on standard benchmarks for image and particular object retrieval, namely Holidays and Oxford buildings.
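The idea of encoding the angle jointly with the descriptor can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform Fourier weights, the number of frequencies, and the degree-2 monomial map are all assumptions chosen for illustration. Each patch's angle is mapped to a few Fourier features, combined with the embedded descriptor by an outer product, and the results are summed over the image, so the similarity between two images depends only on differences between patch angles.

```python
import numpy as np

def angle_embedding(theta, n_freq=3):
    """Fourier features of angles (radians). The inner product of two rows
    equals 1 + sum_k cos(k * (t1 - t2)); uniform weights are an assumption."""
    theta = np.asarray(theta, dtype=float)
    k = np.arange(1, n_freq + 1)
    parts = [np.ones_like(theta)[:, None],
             np.cos(theta[:, None] * k),
             np.sin(theta[:, None] * k)]
    return np.concatenate(parts, axis=1)          # shape (n, 2 * n_freq + 1)

def monomial2(descs):
    """Degree-2 monomial map: <phi(x), phi(y)> = <x, y>**2 (illustrative choice)."""
    descs = np.asarray(descs, dtype=float)
    return np.einsum("ni,nj->nij", descs, descs).reshape(len(descs), -1)

def aggregate(descs, thetas, n_freq=3):
    """Sum over patches of (angle features) x (descriptor features), L2-normalised.
    No codebook is involved."""
    a = angle_embedding(thetas, n_freq)           # (n, A)
    phi = monomial2(descs)                        # (n, D * D)
    v = np.einsum("na,nd->ad", a, phi).ravel()    # joint encoding, summed over patches
    return v / np.linalg.norm(v)

# Hypothetical data: two images with random descriptors and orientations.
rng = np.random.default_rng(0)
X, Y = rng.normal(size=(5, 4)), rng.normal(size=(6, 4))
tx, ty = rng.uniform(0, 2 * np.pi, 5), rng.uniform(0, 2 * np.pi, 6)

sim = aggregate(X, tx) @ aggregate(Y, ty)
# Shifting every angle in both images by the same offset leaves the score unchanged.
sim_shifted = aggregate(X, tx + 0.7) @ aggregate(Y, ty + 0.7)
```

In this sketch, a pair of patches contributes strongly only when their descriptors agree and their angle difference is consistent with that of the other pairs, which is the behaviour that per-patch alignment to the dominant orientation discards.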

Author information

Authors and Affiliations

Inria, Rennes, France
Giorgos Tolias, Teddy Furon & Hervé Jégou

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Tolias, G., Furon, T., Jégou, H. (2014). Orientation Covariant Aggregation of Local Descriptors with Embeddings. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8694. Springer, Cham. https://doi.org/10.1007/978-3-319-10599-4_25

DOI: https://doi.org/10.1007/978-3-319-10599-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10598-7
Online ISBN: 978-3-319-10599-4
eBook Packages: Computer Science (R0)
