DOI: 10.1145/1178677.1178693
Article

Robust scene recognition using language models for scene contexts

Published: 26 October 2006

Abstract

We propose a robust scene recognition framework that uses scene context information for multimedia content. In multimedia content, some scene sequences are more likely to occur than others. We take a statistical approach to exploit this scene context information: a hidden Markov model (HMM) models each scene, and an n-gram language model represents the context among scenes. We evaluated the proposed method in scene recognition experiments on 16 scenes in video data from 25 baseball games. The proposed method significantly improved recognition results compared with recognition without scene context information.
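
The idea of combining per-scene scores with an n-gram model over scene labels can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions: segment boundaries are already known, the per-segment log-likelihoods (which in the paper would come from the scene HMMs) are supplied as plain numbers, the n-gram is a bigram, and add-alpha smoothing stands in for whatever smoothing the paper uses. The names `train_bigram`, `decode`, `lm_weight`, and the three toy scene labels are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sequences, scenes, alpha=1.0):
    """Estimate add-alpha smoothed bigram log-probabilities log P(cur | prev)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    vocab = len(scenes)
    return {
        (p, c): math.log((pair_counts[(p, c)] + alpha)
                         / (prev_counts[p] + alpha * vocab))
        for p in scenes
        for c in scenes
    }


def decode(obs_loglik, bigram, scenes, lm_weight=1.0):
    """Viterbi search for the best scene label sequence.

    obs_loglik[t][s] is the log-likelihood of segment t under scene model s
    (assumed given here; a uniform prior is used for the first segment).
    """
    best = {s: obs_loglik[0][s] for s in scenes}
    back = []
    for t in range(1, len(obs_loglik)):
        new_best, pointers = {}, {}
        for cur in scenes:
            prev = max(scenes,
                       key=lambda p: best[p] + lm_weight * bigram[(p, cur)])
            new_best[cur] = (best[prev] + lm_weight * bigram[(prev, cur)]
                             + obs_loglik[t][cur])
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace back from the best final label.
    path = [max(scenes, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data: hypothetical scene labels and made-up scores.
scenes = ["pitch", "hit", "replay"]
training = [
    ["pitch", "hit", "replay", "pitch", "hit", "replay"],
    ["pitch", "pitch", "hit", "replay"],
]
bigram = train_bigram(training, scenes)

# The middle segment slightly prefers "replay" when judged in isolation.
obs_loglik = [
    {"pitch": -1.0, "hit": -5.0, "replay": -5.0},
    {"pitch": -5.0, "hit": -2.0, "replay": -1.8},
    {"pitch": -5.0, "hit": -5.0, "replay": -1.0},
]

with_context = decode(obs_loglik, bigram, scenes)
without_context = decode(obs_loglik, bigram, scenes, lm_weight=0.0)
print(with_context)     # ['pitch', 'hit', 'replay']
print(without_context)  # ['pitch', 'replay', 'replay']
```

In this toy case the bigram scores pull the ambiguous middle segment toward "hit", because "pitch, hit, replay" is frequent in the training sequences. Setting `lm_weight` to 0 disables the context and reduces decoding to a per-segment best guess.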

Published In

MIR '06: Proceedings of the 8th ACM international workshop on Multimedia information retrieval
October 2006
344 pages
ISBN:1595934952
DOI:10.1145/1178677
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CBVIR
  2. HMM
  3. indexing
  4. n-gram model
  5. sports video

Qualifiers

  • Article

Conference

MM06: The 14th ACM International Conference on Multimedia 2006
October 26 - 27, 2006
Santa Barbara, California, USA

Cited By

  • (2018) Towards large-scale multimedia retrieval enriched by knowledge about human interpretation. Multimedia Tools and Applications 75:1 (297-331). DOI: 10.1007/s11042-014-2292-8. Online publication date: 31-Dec-2018
  • (2014) n-gram Models for Video Semantic Indexing. Proceedings of the 22nd ACM international conference on Multimedia (777-780). DOI: 10.1145/2647868.2654961. Online publication date: 3-Nov-2014
  • (2014) Multimedia Event Detection Using Hidden Conditional Random Fields. Proceedings of International Conference on Multimedia Retrieval (9-16). DOI: 10.1145/2578726.2578742. Online publication date: 1-Apr-2014
  • (2014) Weakly supervised detection of video events using hidden conditional random fields. International Journal of Multimedia Information Retrieval 4:1 (17-32). DOI: 10.1007/s13735-014-0068-6. Online publication date: 28-Sep-2014
  • (2008) Automatic score scene detection for baseball video. Proceedings of the 3rd international conference on Large-scale knowledge resources: construction and application (226-240). DOI: 10.5555/1787800.1787825. Online publication date: 3-Mar-2008
  • (2008) Automatic Score Scene Detection for Baseball Video. Large-Scale Knowledge Resources. Construction and Application (226-240). DOI: 10.1007/978-3-540-78159-2_21. Online publication date: 2008
  • (2007) A robust scene recognition system for baseball broadcast using data-driven approach. Proceedings of the 6th ACM international conference on Image and video retrieval (186-193). DOI: 10.1145/1282280.1282312. Online publication date: 9-Jul-2007
