Multilevel Image Coding with Hyperfeatures

Agarwal, Ankur; Triggs, Bill

doi:10.1007/s11263-007-0072-x

Published: 03 August 2007

Volume 78, pages 15–27 (2008)

International Journal of Computer Vision

Ankur Agarwal¹ & Bill Triggs²

Abstract

Histograms of local appearance descriptors are a popular representation for visual recognition. They are highly discriminant with good resistance to local occlusions and to geometric and photometric variations, but they are not able to exploit spatial co-occurrence statistics over scales larger than the local input patches. We present a multilevel visual representation that remedies this. The starting point is the notion that to detect object parts in images, in practice it often suffices to detect co-occurrences of more local object fragments. This can be formalized by coding image patches against a codebook of known fragments or a more general statistical model and locally histogramming the resulting labels to capture their co-occurrence statistics. Local patch descriptors are converted into somewhat less local histograms over label occurrences. The histograms are themselves local descriptor vectors so the process can be iterated to code ever larger assemblies of object parts and increasingly abstract or ‘semantic’ image properties. We call these higher-level descriptors “hyperfeatures”. We formulate the hyperfeature model and study its performance under several different image coding methods including k-means based Vector Quantization, Gaussian Mixtures, and combinations of these with Latent Dirichlet Allocation. We find that the resulting high-level features provide improved performance in several object image and texture image classification tasks.
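
The coding loop described in the abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' implementation: it assumes patch descriptors arranged on a regular grid, uses hard k-means vector quantization as the coder, and learns each level's codebook from the input grid itself rather than from a training set. The names `kmeans`, `assign`, `local_histograms`, and `hyperfeatures`, and the parameters `k`, `levels`, and `radius`, are hypothetical choices made for this sketch.

```python
import numpy as np


def assign(X, centers):
    """Hard vector quantization: index of the nearest center for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) codebook. Requires k <= len(X)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return centers


def local_histograms(labels, k, radius):
    """Normalized histogram of labels in a square window around each grid cell."""
    H, W = labels.shape
    out = np.zeros((H, W, k))
    for y in range(H):
        for x in range(W):
            win = labels[max(0, y - radius):y + radius + 1,
                         max(0, x - radius):x + radius + 1]
            out[y, x] = np.bincount(win.ravel(), minlength=k) / win.size
    return out


def hyperfeatures(descriptors, k=8, levels=3, radius=1):
    """descriptors: (H, W, d) grid of patch descriptors.

    Returns one global label histogram per level. Assumption of this sketch:
    each codebook is fitted on the input grid itself, not on training data.
    """
    feats = np.asarray(descriptors, dtype=float)
    global_hists = []
    for _ in range(levels):
        H, W, d = feats.shape
        flat = feats.reshape(-1, d)
        codebook = kmeans(flat, k)
        labels = assign(flat, codebook).reshape(H, W)
        # Image-wide histogram of this level's labels.
        global_hists.append(np.bincount(labels.ravel(), minlength=k) / labels.size)
        # The local histograms become the descriptors coded at the next level.
        feats = local_histograms(labels, k, radius)
    return global_hists
```

Each pass turns local descriptors into somewhat less local label histograms, so higher levels summarize co-occurrences over progressively larger neighborhoods. One plausible way to use the result is to concatenate the per-level global histograms into a single feature vector for a classifier.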

Author information

Authors and Affiliations

Microsoft Research Ltd., 7 J J Thomson Avenue, Cambridge, CB3 0FB, UK
Ankur Agarwal
LJK–INRIA site, LJK–CNRS, 655 Avenue de l’Europe, 38330, Montbonnot, France
Bill Triggs

Corresponding author

Correspondence to Ankur Agarwal.

About this article

Cite this article

Agarwal, A., Triggs, B. Multilevel Image Coding with Hyperfeatures. Int J Comput Vis 78, 15–27 (2008). https://doi.org/10.1007/s11263-007-0072-x

Received: 14 July 2006
Accepted: 26 June 2007
Published: 03 August 2007
Issue Date: June 2008
DOI: https://doi.org/10.1007/s11263-007-0072-x
