DOI: 10.1145/2578726.2578764
tutorial

Zero-Example Event Search using MultiModal Pseudo Relevance Feedback

Published: 01 April 2014

Abstract

We propose a novel method, MultiModal Pseudo Relevance Feedback (MMPRF), for event search in video that requires no search examples from the user. Pseudo Relevance Feedback has shown great potential in retrieval tasks, but previous work is limited to unimodal tasks with only a single ranked list. To tackle the event search task, which is inherently multimodal, MMPRF takes advantage of multiple modalities and multiple ranked lists to enhance event search performance in a principled way. The approach is unique in that it leverages not only semantic features but also non-semantic low-level features for event search in the absence of training data. Evaluated on the TRECVID MEDTest dataset, the approach improves the baseline by up to 158% in terms of mean average precision. It also contributes significantly to the CMU team's final submission in TRECVID-13 Multimedia Event Detection.
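
The abstract does not give MMPRF's formulation, so the following is only a generic sketch of the ideas it names, not the paper's algorithm. It assumes a simple score-averaging fusion of per-modality ranked lists, takes the top-k fused results as pseudo-positive examples, and computes average precision and relative improvement. All function names, the fusion rule, and the toy scores are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def fuse_ranked_lists(score_lists: List[Dict[str, float]]) -> List[str]:
    """Late fusion by score averaging (an assumed, generic fusion rule).

    Each dict maps a video id to one modality's relevance score; a video
    missing from a modality is treated as scoring 0.0 there.
    """
    video_ids = set().union(*score_lists)
    fused = {
        vid: sum(scores.get(vid, 0.0) for scores in score_lists) / len(score_lists)
        for vid in video_ids
    }
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(fused, key=lambda vid: (-fused[vid], vid))


def pseudo_positives(ranking: Sequence[str], k: int) -> Set[str]:
    """Treat the top-k results as relevant: the core assumption of PRF."""
    return set(ranking[:k])


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against ground-truth ids."""
    hits = 0
    precision_sum = 0.0
    for rank, vid in enumerate(ranking, start=1):
        if vid in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this hit's rank
    return precision_sum / len(relevant) if relevant else 0.0


def relative_improvement_percent(baseline: float, improved: float) -> float:
    """Relative gain over a baseline score, expressed in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Toy per-modality scores (hypothetical, not from the paper).
    text_scores = {"v1": 0.9, "v2": 0.2, "v3": 0.1}
    visual_scores = {"v1": 0.5, "v2": 0.8, "v4": 0.6}

    ranking = fuse_ranked_lists([text_scores, visual_scores])
    print("fused ranking:", ranking)
    print("pseudo-positives:", pseudo_positives(ranking, k=2))
    print("AP:", average_precision(ranking, {"v1", "v4"}))
```

In a full feedback loop, the pseudo-positives would be used to retrain or reweight the per-modality rankers before fusing again; that step is omitted here because the abstract does not describe it. Mean average precision is the mean of `average_precision` over all event queries.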

      Published In

      ICMR '14: Proceedings of International Conference on Multimedia Retrieval
      April 2014
      564 pages
      ISBN: 9781450327824
      DOI: 10.1145/2578726
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. 0Ex
      2. MED
      3. MultiModal Pseudo Relevance Feedback
      4. Multimedia Event Detection
      5. PRF
      6. Zero-Example

      Qualifiers

      • Tutorial
      • Research
      • Refereed limited

      Conference

      ICMR '14: International Conference on Multimedia Retrieval
      April 1 - 4, 2014
      Glasgow, United Kingdom

      Acceptance Rates

      ICMR '14 Paper Acceptance Rate: 21 of 111 submissions (19%)
      Overall Acceptance Rate: 254 of 830 submissions (31%)

      Article Metrics

      • Downloads (last 12 months): 10
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 27 Jan 2025.

      Cited By

      • (2023) (Un)likelihood Training for Interpretable Embedding. ACM Transactions on Information Systems 42:3 (1-26). DOI: 10.1145/3632752. Online publication date: 13-Nov-2023
      • (2019) A Fine Granularity Object-Level Representation for Event Detection and Recounting. IEEE Transactions on Multimedia 21:6 (1450-1463). DOI: 10.1109/TMM.2018.2884478. Online publication date: 22-May-2019
      • (2019) New Modality. IEEE Transactions on Multimedia 21:2 (402-415). DOI: 10.1109/TMM.2018.2862363. Online publication date: 1-Feb-2019
      • (2019) Decomposition-Based Evolutionary Multiobjective Optimization to Self-Paced Learning. IEEE Transactions on Evolutionary Computation 23:2 (288-302). DOI: 10.1109/TEVC.2018.2850769. Online publication date: Apr-2019
      • (2019) Concept‐Based and Event‐Based Video Search in Large Video Collections. Big Data Analytics for Large‐Scale Multimedia Search (31-60). DOI: 10.1002/9781119376996.ch2. Online publication date: 15-Mar-2019
      • (2018) A deep learned method for video indexing and retrieval. Proceedings of the 26th Pacific Conference on Computer Graphics and Applications: Short Papers (85-88). DOI: 10.2312/pg.20181287. Online publication date: 8-Oct-2018
      • (2018) Self-Paced Learning with Statistics Uncertainty Prior. IEICE Transactions on Information and Systems E101.D:3 (812-816). DOI: 10.1587/transinf.2017EDL8169. Online publication date: 2018
      • (2018) Query-Free Clothing Retrieval via Implicit Relevance Feedback. IEEE Transactions on Multimedia 20:8 (2126-2137). DOI: 10.1109/TMM.2017.2785253. Online publication date: Aug-2018
      • (2018) Learning From Web Videos for Event Classification. IEEE Transactions on Circuits and Systems for Video Technology 28:10 (3019-3029). DOI: 10.1109/TCSVT.2017.2764624. Online publication date: 1-Oct-2018
      • (2018) Social Relevance Feedback Based on Multimedia Content Power. IEEE Transactions on Computational Social Systems 5:1 (109-117). DOI: 10.1109/TCSS.2017.2766250. Online publication date: Mar-2018