DOI: 10.5555/1413918.1413922
research-article

Scene detection using visual and audio attention

Published: 14 February 2008

Abstract

Shot and scene segmentation are basic steps in a variety of video analysis and processing applications. In this paper, we propose a new method for automatic scene detection that takes into account both the visual patterns of movies and their audio features. In particular, we show that audio analysis, used to detect transitions in the audio stream, is also suitable for capturing scene boundaries.
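
The abstract does not say which audio features or decision rule the authors use, so the following is only a rough sketch of the general idea of detecting transitions in an audio stream. It assumes short-time energy as a stand-in feature and flags a boundary wherever the energy of adjacent frames differs by more than a fixed ratio. The names `short_time_energy` and `audio_transitions`, the frame length, and the threshold are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def short_time_energy(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def audio_transitions(samples, frame_len=100, ratio=4.0, eps=1e-9):
    """Return indices of frames whose energy differs from the previous
    frame by more than `ratio` (an assumed, hand-picked threshold)."""
    energy = short_time_energy(samples, frame_len)
    boundaries = []
    for i in range(1, len(energy)):
        high = max(energy[i], energy[i - 1])
        low = min(energy[i], energy[i - 1])
        if high / (low + eps) > ratio:
            boundaries.append(i)
    return boundaries


# Synthetic stream: a quiet tone followed by a loud tone from sample 1000 on.
quiet = [0.1 * math.sin(2 * math.pi * 5 * t / 100) for t in range(1000)]
loud = [1.0 * math.sin(2 * math.pi * 5 * t / 100) for t in range(1000)]
boundaries = audio_transitions(quiet + loud)
print(boundaries)
```

In a combined audio-visual setting, candidate boundaries like these would still need to be reconciled with visually detected shot changes; how the paper does that is not described in the abstract.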


Cited By

  • (2013) Multimodal late fusion bag of features applied to scene detection. Proceedings of the 19th Brazilian symposium on Multimedia and the web, pp. 15-22. DOI: 10.1145/2526188.2526202. Online publication date: 5-Nov-2013

Published In

AMDIT '08: Proceedings of the 2008 Ambi-Sys workshop on Ambient media delivery and interactive television
February 2008
58 pages

Publisher

ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)

Brussels, Belgium

Qualifiers

  • Research-article

Conference

Ambi-Sys08

Acceptance Rates

Overall Acceptance Rate 65 of 93 submissions, 70%
