DOI: 10.1145/1026711.1026751
Article

Latent semantic analysis for an effective region-based video shot retrieval system

Published: 15 October 2004

Abstract

We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are in turn described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks. A compact region-based shot representation is usually obtained with a vector-quantization method. We therefore introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. Then, to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not during the indexing and comparison tasks, because the extra information is included in the original signatures of the shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems: object retrieval and semantic content estimation.
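The role LSA plays on quantized region signatures can be illustrated with a minimal sketch. It assumes each shot is already summarized as a vector of region-cluster counts (the output of a vector-quantization step) and uses a plain truncated SVD with cosine similarity in the latent space. The function names, the toy count matrix, and the rank k = 2 are illustrative assumptions; the abstract does not specify the weighting, rank selection, or similarity measure actually used.

```python
import numpy as np


def fit_lsa(counts, k):
    """Return the k leading left singular vectors and singular values.

    counts: (n_clusters, n_shots) array; entry (i, j) is the number of
    regions of shot j that were quantized to region cluster i.
    """
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k], s[:k]


def project(signature, U_k):
    """Map one count signature, or a matrix of column signatures, to the latent space."""
    return U_k.T @ signature


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def latent_scores(query_sig, counts, U_k):
    """Cosine similarity between the query and every shot, in the latent space."""
    q = project(query_sig, U_k)
    shots = project(counts, U_k)  # shape (k, n_shots)
    return [cosine(q, shots[:, j]) for j in range(shots.shape[1])]


# Hypothetical toy data: 6 region clusters (rows) x 4 shots (columns).
# Shots 0 and 1 use clusters 0-2; shots 2 and 3 use clusters 3-5.
counts = np.array(
    [
        [2, 1, 0, 0],
        [0, 2, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 2],
        [0, 0, 2, 1],
        [0, 0, 0, 1],
    ],
    dtype=float,
)

U_k, s_k = fit_lsa(counts, k=2)

# The query contains only cluster 1, which never occurs in shot 0.
query = np.array([0, 1, 0, 0, 0, 0], dtype=float)

raw = [cosine(query, counts[:, j]) for j in range(counts.shape[1])]
latent = latent_scores(query, counts, U_k)
print("raw cosine:   ", raw)
print("latent cosine:", latent)
```

In this toy case the raw cosine between the query and shot 0 is zero because they share no cluster, while in the latent space the two are close: cluster 1 co-occurs with clusters 0 and 2 in shot 1, so all three load on the same latent dimension. This is the kind of smoothing over quantization noise that the abstract attributes to LSA.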


Published In

MIR '04: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval
October 2004
334 pages
ISBN: 1581139403
DOI: 10.1145/1026711
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. latent semantic analysis
  2. region clustering
  3. region similarity
  4. region-based video retrieval
  5. video analysis

Qualifiers

  • Article

Conference

MM04
