- Research Article
- Open access
Discovering Recurrent Image Semantics from Class Discrimination
EURASIP Journal on Advances in Signal Processing volume 2006, Article number: 076093 (2006)
Abstract
Supervised statistical learning has become a critical means to design and learn visual concepts (e.g., faces, foliage, buildings) in content-based indexing systems. The drawback of this approach is the need for manual labeling of regions. While several recently proposed automatic image annotation methods are very promising, they usually rely on the availability and analysis of associated text descriptions. In this paper, we propose a hybrid learning framework that discovers local semantic regions and generates their samples for training local detectors with minimal human intervention. We also propose a multiscale, segmentation-free framework that embeds the soft presence of discovered semantic regions and of local class patterns in an image independently for indexing and matching. In experiments on 2400 heterogeneous consumer images with 16 semantic queries, similarity matching based on an individual index and integrated similarity matching outperformed a feature fusion approach by 26% and 37% in average precision, respectively.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Lim, JH., Jin, J.S. Discovering Recurrent Image Semantics from Class Discrimination. EURASIP J. Adv. Signal Process. 2006, 076093 (2006). https://doi.org/10.1155/ASP/2006/76093
DOI: https://doi.org/10.1155/ASP/2006/76093