DOI: 10.1145/3422844.3423051
research-article

SoccerDB: A Large-Scale Database for Comprehensive Video Understanding

Published: 12 October 2020

Abstract

Soccer videos can serve as a perfect research object for video understanding because soccer games are played under well-defined rules yet remain complex and intriguing enough for researchers to study. In this paper, we propose a new soccer video database named SoccerDB, comprising 171,191 video segments from 346 high-quality soccer games. The database contains 702,096 bounding boxes, 37,709 essential event labels with time boundaries, and 17,115 highlight annotations for object detection, action recognition, temporal action localization, and highlight detection tasks. To our knowledge, it is the largest database for comprehensive sports video understanding across these aspects. We further survey a collection of strong baselines on SoccerDB that have demonstrated state-of-the-art performance on the independent tasks. Our evaluation suggests that jointly considering the inner correlations among those tasks yields significant benefits. We believe the release of SoccerDB will tremendously advance research on comprehensive video understanding. Our dataset and code are published at https://github.com/newsdata/SoccerDB.

Published In

MMSports '20: Proceedings of the 3rd International Workshop on Multimedia Content Analysis in Sports
October 2020
66 pages
ISBN:9781450381499
DOI:10.1145/3422844
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. action recognition
  2. highlight detection
  3. object detection
  4. temporal action localization

Qualifiers

  • Research-article

Conference

MM '20
Acceptance Rates

Overall Acceptance Rate 29 of 49 submissions, 59%

Article Metrics

  • Downloads (Last 12 months): 35
  • Downloads (Last 6 weeks): 3

Reflects downloads up to 23 Dec 2024

Cited By

  • (2024) Ball Tracking Based on Multiscale Feature Enhancement and Cooperative Trajectory Matching. Applied Sciences 14:4 (1376). DOI: 10.3390/app14041376. Online publication date: 7-Feb-2024
  • (2024) Efficient Action Spotting Using Saliency Feature Weighting. IEICE Transactions on Information and Systems E107.D:1 (105-114). DOI: 10.1587/transinf.2022EDP7210. Online publication date: 1-Jan-2024
  • (2024) PANet: A Large-Scale Benchmark for Dense Action Detection from Table Tennis Match Broadcasting Videos. ACM Transactions on Multimedia Computing, Communications, and Applications 20:4 (1-23). DOI: 10.1145/3633516. Online publication date: 11-Jan-2024
  • (2024) TACDEC. Proceedings of the 15th ACM Multimedia Systems Conference (250-256). DOI: 10.1145/3625468.3652166. Online publication date: 15-Apr-2024
  • (2024) OSL-ActionSpotting: A Unified Library for Action Spotting in Sports Videos. 2024 IEEE International Workshop on Sport, Technology and Research (STAR) (132-137). DOI: 10.1109/STAR62027.2024.10635981. Online publication date: 8-Jul-2024
  • (2024) Small Target Detection in Soccer Scenes Based on YOLOv8. 2024 6th International Conference on Robotics and Computer Vision (ICRCV) (1-5). DOI: 10.1109/ICRCV62709.2024.10758608. Online publication date: 20-Sep-2024
  • (2024) Beyond the Premier: Assessing Action Spotting Transfer Capability Across Diverse Domains. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3386-3398). DOI: 10.1109/CVPRW63382.2024.00343. Online publication date: 17-Jun-2024
  • (2024) SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3293-3305). DOI: 10.1109/CVPRW63382.2024.00334. Online publication date: 17-Jun-2024
  • (2024) X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Models. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (3267-3279). DOI: 10.1109/CVPRW63382.2024.00332. Online publication date: 17-Jun-2024
  • (2024) Unsupervised Clustering in Football Analysis: A Color-Segmentation and Lighting Adaptation Approach. IEEE Access 12 (178127-178141). DOI: 10.1109/ACCESS.2024.3506827. Online publication date: 2024
