Article

Latent semantic analysis for an effective region-based video shot retrieval system

Authors:

Fabrice Souvannavong,

Bernard Merialdo,

Benoît Huet

MIR '04: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval

Pages 243 - 250

https://doi.org/10.1145/1026711.1026751

Published: 15 October 2004

Abstract

We present a complete and efficient framework for video shot indexing and retrieval. Video shots are described by their key-frames, which are in turn described by their regions. Region-based approaches suffer from the complexity of the segmentation and comparison tasks. A compact region-based shot representation is usually obtained with a vector-quantization method. We therefore introduce latent semantic analysis (LSA) to reduce the noise inherent in the segmentation and quantization processes. Then, to better capture the content of video shots, we propose two original methods. The first takes advantage of a multi-scale segmentation of frames, while the second uses multiple frames to represent a shot. Both approaches require more computation time during pre-processing but not for the indexing and comparison tasks, because the extra information is included in the original signatures of the shots. Finally, we introduce a relevance feedback loop to optimize the search and propose a new method to optimize the effect of LSA. In the experimental section, we evaluate latent semantic analysis and the proposed approaches on two problems, namely object retrieval and semantic content estimation.
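The LSA step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each shot signature is a vector of counts over vector-quantized region clusters, and it uses a plain truncated SVD with cosine similarity. The function names (`fit_lsa`, `rank_shots`), the toy matrix, and the choice of `k` are hypothetical; the abstract does not specify the paper's weighting scheme, rank, or similarity measure.

```python
import numpy as np

def fit_lsa(counts: np.ndarray, k: int) -> np.ndarray:
    """Return the top-k left singular vectors of a (clusters x shots) count matrix."""
    U, _s, _Vt = np.linalg.svd(counts, full_matrices=False)
    return U[:, :k]  # singular values are returned in descending order

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, defined as 0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def rank_shots(counts: np.ndarray, query: np.ndarray, k: int):
    """Project the shots and the query into the k-dim latent space and rank by cosine."""
    basis = fit_lsa(counts, k)
    q = basis.T @ query        # latent query, shape (k,)
    shots = basis.T @ counts   # latent shots, shape (k, n_shots)
    scores = [cosine(q, shots[:, j]) for j in range(counts.shape[1])]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores

# Toy data (an assumption for illustration): rows are region clusters, columns are shots.
# Shots 0 and 1 share clusters 0 and 1; shots 2 and 3 share clusters 2 and 3.
counts = np.array([[2, 1, 0, 0],
                   [1, 2, 0, 0],
                   [0, 0, 2, 1],
                   [0, 0, 1, 2]], dtype=float)
query = np.array([1, 0, 0, 0], dtype=float)  # a query containing only cluster 0
order, scores = rank_shots(counts, query, k=2)
```

In this toy setting the raw cosine between the query and shot 1 is about 0.45, whereas in the two-dimensional latent space the query, shot 0, and shot 1 become nearly collinear. This is the kind of smoothing over quantization noise that the abstract attributes to LSA.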

Index Terms

Latent semantic analysis for an effective region-based video shot retrieval system
1. Information systems
  1. Information retrieval

Published In

MIR '04: Proceedings of the 6th ACM SIGMM international workshop on Multimedia information retrieval

October 2004

334 pages

ISBN:1581139403

DOI:10.1145/1026711

General Chairs:
Michael S. Lew (LIACS Media Lab, The Netherlands),
Nicu Sebe (University of Amsterdam, The Netherlands),
Chabane Djeraba (LIFL, France)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

MM04

MM04: 2004 12th Annual ACM International Conference on Multimedia

October 15 - 16, 2004

NY, New York, USA

Bibliometrics

Article Metrics

Total Citations: 18
Total Downloads: 556
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 30 Aug 2024

Citations

Cited By

Szajerman D, Napieralski P, Lecointe J (2018). Joint analysis of simultaneous EEG and eye tracking data for video images. COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, 37:5 (1870-1884). Online publication date: 3-Sep-2018.
https://doi.org/10.1108/COMPEL-07-2018-0281
Xue J, Eguchi K (2017). Video Data Modeling Using Sequential Correspondence Hierarchical Dirichlet Processes. IEICE Transactions on Information and Systems, E100.D:1 (33-41). Online publication date: 2017.
https://doi.org/10.1587/transinf.2016MUP0007
Szajerman D, Napieralski P (2017). Joint analysis of simultaneous EEG and eye tracking data for video picture. 2017 18th International Symposium on Electromagnetic Fields in Mechatronics, Electrical and Electronic Engineering (ISEF) Book of Abstracts (1-2). Online publication date: Sep-2017.
https://doi.org/10.1109/ISEF.2017.8090693
Xue J, Eguchi K, Kender J, Smith J, Luo J, Boll S, Hsu W (2016). Sequential Correspondence Hierarchical Dirichlet Processes for Video Data Analysis. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval (229-233). Online publication date: 6-Jun-2016.
https://dl.acm.org/doi/10.1145/2911996.2912041
Hou Y, Xiao T, Zhang S, Jiang X, Li X, Hu X, Han J, Guo L, Miller L, Neupert R, Liu T (2016). Predicting Movie Trailer Viewer's “Like/Dislike” via Learned Shot Editing Patterns. IEEE Transactions on Affective Computing, 7:1 (29-44). Online publication date: 1-Jan-2016.
https://dl.acm.org/doi/10.1109/TAFFC.2015.2444371
Xie Y, Eguchi K (2014). Multimedia Topic Models Considering Burstiness of Local Features. IEICE Transactions on Information and Systems, E97.D:4 (714-720). Online publication date: 2014.
https://doi.org/10.1587/transinf.E97.D.714
Jin Y, Prabhakaran B (2013). Content Based 3D Human Document Retrieval Using Latent Semantic Mapping. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops (550-557). Online publication date: 23-Jun-2013.
https://dl.acm.org/doi/10.1109/CVPRW.2013.86
Perkiö J, Tuominen A, Vähäkangas T, Myllymäki P (2012). Image similarity. Multimedia Tools and Applications, 57:1 (5-27). Online publication date: 1-Mar-2012.
https://dl.acm.org/doi/10.1007/s11042-010-0562-7
Wanke J, Ulges A, Lampert C, Breuel T, Wang J, Boujemaa N, Ramirez N, Natsev A (2010). Topic models for semantics-preserving video compression. Proceedings of the international conference on Multimedia information retrieval (275-284). Online publication date: 29-Mar-2010.
https://dl.acm.org/doi/10.1145/1743384.1743433
Irie G, Satou T, Kojima A, Yamasaki T, Aizawa K (2010). Affective Audio-Visual Words and Latent Topic Driving Model for Realizing Movie Affective Scene Classification. IEEE Transactions on Multimedia, 12:6 (523-535). Online publication date: 1-Oct-2010.
https://dl.acm.org/doi/10.1109/TMM.2010.2051871
