Abstract
Scene recognition systems are generally based on features that represent image semantics by modeling the content depicted in a given image. In this paper we propose a framework for scene recognition that goes beyond visual content analysis alone by exploiting a new cue for categorization: the image composition, namely its photographic style and layout. We extract information about the image composition by storing the values of affective, aesthetic and artistic features in a compositional vector. We verify the discriminative power of this compositional vector for scene categorization by using it to classify images from several diverse, large-scale scene understanding datasets. We then combine the compositional features with traditional semantic features in a complete scene recognition framework. Results show that, because compositional and semantic features are complementary, scene categorization systems do benefit from incorporating descriptors that represent the photographic layout of the image (+13-15% over semantic-only categorization).
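The abstract does not say which features make up the compositional vector or how it is combined with the semantic features. The sketch below is therefore only an illustration under stated assumptions: the two toy features (Michelson contrast and mean brightness), the function names, and both fusion strategies (concatenating descriptors, or averaging per-class scores) are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def michelson_contrast(luminance: Sequence[float]) -> float:
    """Michelson contrast (max - min) / (max + min) of luminance values in [0, 1]."""
    lo, hi = min(luminance), max(luminance)
    # An all-black image has no defined contrast; return 0 by convention.
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def mean_brightness(luminance: Sequence[float]) -> float:
    """Average luminance of the image."""
    return sum(luminance) / len(luminance)


def compositional_vector(luminance: Sequence[float]) -> List[float]:
    """Assumed toy compositional descriptor: [contrast, brightness]."""
    return [michelson_contrast(luminance), mean_brightness(luminance)]


def early_fusion(semantic: Sequence[float],
                 compositional: Sequence[float]) -> List[float]:
    """Assumed fusion option 1: concatenate the two descriptors."""
    return list(semantic) + list(compositional)


def late_fusion(semantic_scores: Sequence[float],
                compositional_scores: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Assumed fusion option 2: weighted average of per-class scores.

    `weight` is the share given to the semantic classifier.
    """
    if len(semantic_scores) != len(compositional_scores):
        raise ValueError("score vectors must cover the same classes")
    return [weight * s + (1.0 - weight) * c
            for s, c in zip(semantic_scores, compositional_scores)]


def predict(scores: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Flattened luminance values of a tiny hypothetical image.
pixels = [0.2, 0.8, 0.5, 0.5]
comp = compositional_vector(pixels)

# Hypothetical per-class scores from two separately trained classifiers.
fused = late_fusion([0.2, 0.7, 0.1], [0.6, 0.3, 0.1])
label = predict(fused)
```

In this toy setup the two classifiers disagree on the top class, and the fused scores settle the decision. That is the kind of complementarity the abstract refers to, although the actual features, classifiers and fusion scheme of the paper may differ.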
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Redi, M., Merialdo, B. (2012). Enhancing Semantic Features with Compositional Analysis for Scene Recognition. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_45
DOI: https://doi.org/10.1007/978-3-642-33885-4_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science (R0)