poster

Searching informative concept banks for video event detection

Authors: Masoud Mazloom, Efstratios Gavves, Koen van de Sande, Cees Snoek

ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

Pages 255 - 262

https://doi.org/10.1145/2461466.2461507

Published: 16 April 2013

Abstract

An emerging trend in video event detection is to learn an event from a bank of concept detector scores. Unlike existing work, which simply relies on a bank containing all available detectors, we propose an algorithm that learns from examples which concepts in a bank are most informative per event. We model finding this bank of informative concepts out of a large set of concept detectors as a rare event search. Our proposed approximate solution searches for the optimal concept bank using cross-entropy optimization. We study the behavior of video event detection based on a bank of informative concepts by performing three experiments on more than 1,000 hours of arbitrary internet video from the TRECVID multimedia event detection task. Starting from a concept bank of 1,346 detectors, we show that 1) some concept banks are more informative than others for specific events, 2) event detection using an automatically obtained informative concept bank is more robust than using all available concepts, 3) even for small amounts of training examples an informative concept bank outperforms a full bank and a bag-of-words event representation, and 4) the informative concept banks qualitatively make sense for the events of interest, without being programmed to do so. We conclude that for concept banks it pays to be informative.
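To make the cross-entropy search over concept subsets mentioned in the abstract more concrete, the following is a minimal sketch of the generic cross-entropy method applied to binary subset selection. It is not the authors' implementation. The function names, the hyperparameters (sample count, elite fraction, smoothing) and the toy scoring function are all assumptions made for illustration. In a real setting the score would presumably be some measure of event detection quality on held-out examples.

```python
import random


def cross_entropy_select(score, n_items, n_samples=500, elite_frac=0.1,
                         n_iters=40, smoothing=0.7, seed=0):
    """Search for a high-scoring binary mask with the cross-entropy method."""
    rng = random.Random(seed)
    probs = [0.5] * n_items  # inclusion probability per concept (assumed start)
    n_elite = max(1, int(n_samples * elite_frac))
    best_score, best_mask = float("-inf"), None

    for _ in range(n_iters):
        # 1. Sample candidate banks from independent Bernoulli distributions.
        samples = [[rng.random() < p for p in probs] for _ in range(n_samples)]
        # 2. Score every candidate and sort from best to worst.
        scored = sorted(((score(m), m) for m in samples),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        elite = [m for _, m in scored[:n_elite]]
        # 3. Move each probability toward its frequency in the elite set.
        for j in range(n_items):
            freq = sum(m[j] for m in elite) / n_elite
            probs[j] = smoothing * freq + (1.0 - smoothing) * probs[j]

    return best_mask, best_score, probs


# Toy objective (hypothetical): concepts 1, 4, 7 and 12 help, all others hurt.
INFORMATIVE = {1, 4, 7, 12}


def toy_score(mask):
    hits = sum(1 for j, on in enumerate(mask) if on and j in INFORMATIVE)
    noise = sum(1 for j, on in enumerate(mask) if on and j not in INFORMATIVE)
    return hits - 0.5 * noise


if __name__ == "__main__":
    mask, best, probs = cross_entropy_select(toy_score, n_items=20)
    print("selected concepts:", [j for j, on in enumerate(mask) if on])
    print("score:", best)
```

The sketch keeps one inclusion probability per concept and repeatedly shifts it toward the concepts that appear in the best-scoring sampled banks, so good subsets that are rare under the initial distribution become likely to be sampled. The smoothing term is a common precaution against probabilities collapsing too early.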

Index Terms

Searching informative concept banks for video event detection
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization

Published In

ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval

April 2013

362 pages

ISBN:9781450320337

DOI:10.1145/2461466

General Chairs:
Ramesh Jain, University of California, Irvine, USA
Balakrishnan Prabhakaran, University of Texas at Dallas, USA

Program Chairs:
Marcel Worring, University of Amsterdam, The Netherlands
John Smith, IBM Research, New York, USA
Tat-Seng Chua, National University of Singapore

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

ICMR'13

Sponsor:

SIGMM

ICMR'13: International Conference on Multimedia Retrieval

April 16 - 20, 2013

Dallas, Texas, USA

Acceptance Rates

ICMR '13 Paper Acceptance Rate 38 of 96 submissions, 40%

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total Citations: 28
Total Downloads: 184
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 03 Feb 2025

Citations

Cited By

Jin Y, Jiang W, Yang Y, Mu Y (2022). Zero-Shot Video Event Detection With High-Order Semantic Concept Discovery and Matching. IEEE Transactions on Multimedia 24, 1896-1908. Online publication date: 2022.
https://doi.org/10.1109/TMM.2021.3073624
Honarparvar S, Saeedi S, Liang S, Squires J (2021). Design and Development of an Internet of Smart Cameras Solution for Complex Event Detection in COVID-19 Risk Behaviour Recognition. ISPRS International Journal of Geo-Information 10:2, 81. Online publication date: 18-Feb-2021.
https://doi.org/10.3390/ijgi10020081
Luo M, Chang X, Gong C (2021). Reliable shot identification for complex event detection via visual-semantic embedding. Computer Vision and Image Understanding 213:C. Online publication date: 1-Dec-2021.
https://dl.acm.org/doi/10.1016/j.cviu.2021.103300
Song H, Wu X, Yu W, Jia Y (2018). Extracting Key Segments of Videos for Event Detection by Learning From Web Sources. IEEE Transactions on Multimedia 20:5, 1088-1100. Online publication date: May-2018.
https://doi.org/10.1109/TMM.2017.2763322
Boer M, Lu Y, Zhang H, Schutte K, Ngo C, Kraaij W (2017). Semantic Reasoning in Zero Example Video Event Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 13:4, 1-17. Online publication date: 4-Oct-2017.
https://dl.acm.org/doi/10.1145/3131288
Ma Z, Chang X, Xu Z, Sebe N, Hauptmann A (2017). Joint Attributes and Event Analysis for Multimedia Event Detection. IEEE Transactions on Neural Networks and Learning Systems, 1-10. Online publication date: 2017.
https://doi.org/10.1109/TNNLS.2017.2709308
Chang X, Ma Z, Yang Y, Zeng Z, Hauptmann A (2017). Bi-Level Semantic Representation Analysis for Multimedia Event Detection. IEEE Transactions on Cybernetics 47:5, 1180-1197. Online publication date: May-2017.
https://doi.org/10.1109/TCYB.2016.2539546
Song H, Wu X, Liang W, Jia Y (2017). Recognizing key segments of videos for video annotation by learning from web image sets. Multimedia Tools and Applications 76:5, 6111-6126. Online publication date: 1-Mar-2017.
https://dl.acm.org/doi/10.1007/s11042-016-3253-1
Zhang X, Zhang H, Zhang Y, Yang Y, Wang M, Luan H, Li J, Chua T (2016). Deep Fusion of Multiple Semantic Cues for Complex Event Recognition. IEEE Transactions on Image Processing 25:3, 1033-1046. Online publication date: 1-Mar-2016.
https://dl.acm.org/doi/10.1109/TIP.2015.2511585
Wang Y, von der Weth C, Zhang Y, Low K, Singh V, Kankanhalli M (2016). Concept Based Hybrid Fusion of Multimodal Event Signals. 2016 IEEE International Symposium on Multimedia (ISM), 14-19. Online publication date: Dec-2016.
https://doi.org/10.1109/ISM.2016.0013
