short-paper

VIRET: A Video Retrieval Tool for Interactive Known-item Search

Authors:

Gregor Kovalčík,

Tomáš Souček,

Jaroslav Moravec,

Přemysl Čech

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval

Pages 177 - 181

https://doi.org/10.1145/3323873.3325034

Published: 05 June 2019

Abstract

Known-item search in large video collections remains a challenging task for current video retrieval systems, which have to rely on both state-of-the-art ranking models and interactive means of retrieval. We present a general overview of the current version of the VIRET tool, an interactive video retrieval system that has successfully participated in several international evaluation campaigns. The system is based on multi-modal search and convenient inspection of results. Based on collected query logs of four users controlling instances of the tool at the Video Browser Showdown 2019, we highlight query modification statistics and a list of successful query formulation strategies. We conclude that the VIRET tool represents a competitive reference interactive system for effective known-item search in one thousand hours of video.

Index Terms

VIRET: A Video Retrieval Tool for Interactive Known-item Search
1. Information systems
  1. Information retrieval
    1. Specialized information retrieval
      1. Multimedia and multimodal retrieval
        Video search

Published In

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval

June 2019

427 pages

ISBN:9781450367653

DOI:10.1145/3323873

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada)
Alberto Del Bimbo (University of Florence, Italy)
Zhongfei Zhang (Binghamton University, State University of New York, USA)

Program Chairs:
Alexander Hauptmann (Carnegie Mellon University, USA)
K. Selcuk Candan (Arizona State University, USA)
Marco Bertini (University of Florence, Italy)
Lexing Xie (Australian National University, Australia)
Xiao-Yong Wei (Sichuan University, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

Czech Science Foundation

Conference

ICMR '19

Sponsor:

SIGMM

ICMR '19: International Conference on Multimedia Retrieval

June 10 - 13, 2019

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics

Total Citations: 20
Total Downloads: 333
Downloads (Last 12 months): 6
Downloads (Last 6 weeks): 0

Reflects downloads up to 26 Sep 2024

Citations

Cited By

Pham Gia K, Tran Le H, Nguyen Huynh P, Le Tran S, Pham Hoang L, Pham Xuan T, Tran Ham D, Huynh Ngoc T, Hoang K (2023). An Interactive System for Multimedia Retrieval in Video Collection with Temporal Integration. Proceedings of the 12th International Symposium on Information and Communication Technology (989-996). DOI: 10.1145/3628797.3629019. Online publication date: 7-Dec-2023.
https://dl.acm.org/doi/10.1145/3628797.3629019

Tran M, Tran M (2023). Anomaly Event Retrieval System from TV News and Surveillance Cameras. Proceedings of the 12th International Symposium on Information and Communication Technology (953-959). DOI: 10.1145/3628797.3628891. Online publication date: 7-Dec-2023.
https://dl.acm.org/doi/10.1145/3628797.3628891

Alpay T, Magg S, Broze P, Speck D (2023). Multimodal video retrieval with CLIP: a user study. Information Retrieval Journal 26:1-2. DOI: 10.1007/s10791-023-09425-2. Online publication date: 29-Sep-2023.
https://doi.org/10.1007/s10791-023-09425-2

Ribeiro R, Trifan A, Neves A (2022). Lifelog Retrieval From Daily Digital Data: Narrative Review. JMIR mHealth and uHealth 10:5 (e30517). DOI: 10.2196/30517. Online publication date: 2-May-2022.
https://doi.org/10.2196/30517

Ma Z, Ngo C, Magalhães J, del Bimbo A, Satoh S, Sebe N, Alameda-Pineda X, Jin Q, Oria V, Toni L (2022). Interactive Video Corpus Moment Retrieval using Reinforcement Learning. Proceedings of the 30th ACM International Conference on Multimedia (296-306). DOI: 10.1145/3503161.3548277. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3503161.3548277

Lokoč J, Veselý P, Mejzlík F, Kovalčík G, Souček T, Rossetto L, Schoeffmann K, Bailer W, Gurrin C, Sauter L, Song J, Vrochidis S, Wu J, Jónsson B (2021). Is the Reign of Interactive Search Eternal? Findings from the Video Browser Showdown 2020. ACM Transactions on Multimedia Computing, Communications, and Applications 17:3 (1-26). DOI: 10.1145/3445031. Online publication date: 22-Jul-2021.
https://dl.acm.org/doi/10.1145/3445031

Bommisetty R, Khare A, Khare M, Palanisamy P (2021). Content-Based Video Retrieval Using Integration of Curvelet Transform and Simple Linear Iterative Clustering. International Journal of Image and Graphics 22:02. DOI: 10.1142/S0219467822500188. Online publication date: 16-Jun-2021.
https://doi.org/10.1142/S0219467822500188

Rossetto L, Gasser R, Lokoc J, Bailer W, Schoeffmann K, Muenzer B, Soucek T, Nguyen P, Bolettieri P, Leibetseder A, Vrochidis S (2021). Interactive Video Retrieval in the Age of Deep Learning – Detailed Evaluation of VBS 2019. IEEE Transactions on Multimedia 23 (243-256). DOI: 10.1109/TMM.2020.2980944. Online publication date: 2021.
https://doi.org/10.1109/TMM.2020.2980944

Qi M, Qin J, Yang Y, Wang Y, Luo J (2021). Semantics-Aware Spatial-Temporal Binaries for Cross-Modal Video Retrieval. IEEE Transactions on Image Processing 30 (2989-3004). DOI: 10.1109/TIP.2020.3048680. Online publication date: 2021.
https://doi.org/10.1109/TIP.2020.3048680

Bommisetty R, Palanisamy P, Khare A (2021). Content Based Video Retrieval - Methods, Techniques and Applications. Advanced Soft Computing Techniques in Data Science, IoT and Cloud Computing (81-99). DOI: 10.1007/978-3-030-75657-4_4. Online publication date: 6-Nov-2021.
https://doi.org/10.1007/978-3-030-75657-4_4
