DOI: 10.1145/2461466.2461507
poster

Searching informative concept banks for video event detection

Published: 16 April 2013

Abstract

An emerging trend in video event detection is to learn an event from a bank of concept detector scores. Unlike existing work, which simply relies on a bank containing all available detectors, we propose an algorithm that learns from examples which concepts in a bank are most informative per event. We model finding this bank of informative concepts out of a large set of concept detectors as a rare event search. Our proposed approximate solution finds the optimal concept bank using cross-entropy optimization. We study the behavior of video event detection based on a bank of informative concepts by performing three experiments on more than 1,000 hours of arbitrary internet video from the TRECVID multimedia event detection task. Starting from a concept bank of 1,346 detectors, we show that (1) some concept banks are more informative than others for specific events; (2) event detection using an automatically obtained informative concept bank is more robust than using all available concepts; (3) even with few training examples, an informative concept bank outperforms a full bank and a bag-of-words event representation; and (4) the informative concept banks make sense, qualitatively, for the events of interest, without being programmed to do so. We conclude that for concept banks it pays to be informative.
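
To make the idea of searching for an informative concept bank with cross-entropy optimization more concrete, here is a minimal Python sketch of the generic cross-entropy method applied to binary subset selection. It is not the authors' implementation. The function name `cross_entropy_select`, the parameter values, and the toy objective `toy_score` are all assumptions made for illustration. In the setting of the abstract, the score would presumably be event detection performance on held-out examples for the selected detectors.

```python
import numpy as np


def cross_entropy_select(score_fn, n_concepts, n_samples=200, elite_frac=0.1,
                         smoothing=0.7, n_iters=30, seed=0):
    """Search for a high-scoring binary concept mask with the cross-entropy method.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # p[i] is the current probability that concept i is included in the bank.
    p = np.full(n_concepts, 0.5)
    n_elite = max(1, int(elite_frac * n_samples))
    best_mask, best_score = None, -np.inf

    for _ in range(n_iters):
        # Sample candidate banks as independent Bernoulli draws per concept.
        masks = rng.random((n_samples, n_concepts)) < p
        scores = np.array([score_fn(m) for m in masks])

        # Keep the elite (top-scoring) candidates and move p towards their mean.
        elite = masks[np.argsort(scores)[-n_elite:]]
        p = smoothing * elite.mean(axis=0) + (1.0 - smoothing) * p

        # Remember the best candidate seen in any iteration.
        top = int(np.argmax(scores))
        if scores[top] > best_score:
            best_score, best_mask = float(scores[top]), masks[top].copy()

    return best_mask, best_score, p


# Hypothetical toy objective: concepts 0..4 are "informative" and the rest
# add noise. A real objective would train and validate an event detector.
INFORMATIVE = np.arange(5)


def toy_score(mask):
    hits = mask[INFORMATIVE].sum()
    noise = mask.sum() - hits
    return float(hits - 0.5 * noise)


bank, score, probs = cross_entropy_select(toy_score, n_concepts=30)
print(sorted(np.flatnonzero(bank).tolist()), score)
```

The sketch keeps one inclusion probability per concept and repeatedly shifts it towards the best-scoring sampled banks, so concepts that rarely appear in good banks are gradually dropped. The smoothing factor is a common safeguard against the probabilities collapsing to 0 or 1 too early.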



    Published In

    ICMR '13: Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
    April 2013
    362 pages
    ISBN:9781450320337
    DOI:10.1145/2461466

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. concept detection
    2. cross-entropy optimization
    3. event recognition

    Qualifiers

    • Poster

    Conference

    ICMR'13

    Acceptance Rates

    ICMR '13 Paper Acceptance Rate 38 of 96 submissions, 40%;
    Overall Acceptance Rate 254 of 830 submissions, 31%


    Cited By

    • (2022) Zero-Shot Video Event Detection With High-Order Semantic Concept Discovery and Matching. IEEE Transactions on Multimedia 24 (1896-1908). DOI: 10.1109/TMM.2021.3073624. Online publication date: 2022
    • (2021) Design and Development of an Internet of Smart Cameras Solution for Complex Event Detection in COVID-19 Risk Behaviour Recognition. ISPRS International Journal of Geo-Information 10:2 (81). DOI: 10.3390/ijgi10020081. Online publication date: 18-Feb-2021
    • (2021) Reliable shot identification for complex event detection via visual-semantic embedding. Computer Vision and Image Understanding 213:C. DOI: 10.1016/j.cviu.2021.103300. Online publication date: 1-Dec-2021
    • (2018) Extracting Key Segments of Videos for Event Detection by Learning From Web Sources. IEEE Transactions on Multimedia 20:5 (1088-1100). DOI: 10.1109/TMM.2017.2763322. Online publication date: May-2018
    • (2017) Semantic Reasoning in Zero Example Video Event Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 13:4 (1-17). DOI: 10.1145/3131288. Online publication date: 4-Oct-2017
    • (2017) Joint Attributes and Event Analysis for Multimedia Event Detection. IEEE Transactions on Neural Networks and Learning Systems (1-10). DOI: 10.1109/TNNLS.2017.2709308. Online publication date: 2017
    • (2017) Bi-Level Semantic Representation Analysis for Multimedia Event Detection. IEEE Transactions on Cybernetics 47:5 (1180-1197). DOI: 10.1109/TCYB.2016.2539546. Online publication date: May-2017
    • (2017) Recognizing key segments of videos for video annotation by learning from web image sets. Multimedia Tools and Applications 76:5 (6111-6126). DOI: 10.1007/s11042-016-3253-1. Online publication date: 1-Mar-2017
    • (2016) Deep Fusion of Multiple Semantic Cues for Complex Event Recognition. IEEE Transactions on Image Processing 25:3 (1033-1046). DOI: 10.1109/TIP.2015.2511585. Online publication date: 1-Mar-2016
    • (2016) Concept Based Hybrid Fusion of Multimodal Event Signals. 2016 IEEE International Symposium on Multimedia (ISM) (14-19). DOI: 10.1109/ISM.2016.0013. Online publication date: Dec-2016
