research-article

Open access

Automatic thumbnail selection for soccer videos using machine learning

Authors:

Steven A. Hicks,

Michael A. Riegler,

Pål Halvorsen

MMSys '22: Proceedings of the 13th ACM Multimedia Systems Conference

Pages 73 - 85

https://doi.org/10.1145/3524273.3528182

Published: 05 August 2022

Abstract

Thumbnail selection is an important aspect of online sport video presentation, as thumbnails capture the essence of important events, engage viewers, and make video clips attractive to watch. Traditional solutions in the soccer domain for presenting highlight clips of important events such as goals, substitutions, and cards rely on the manual or static selection of thumbnails. However, such approaches can result in the selection of sub-optimal video frames as snapshots, which degrades the overall quality of the video clip as perceived by viewers and consequently decreases viewership. Manual processes are also expensive and time-consuming. In this paper, we present an automatic thumbnail selection system for soccer videos which uses machine learning to deliver representative thumbnails with high relevance to video content and high visual quality in near real-time. Our proposed system combines a software framework, which integrates logo detection, close-up shot detection, face detection, and image quality analysis into a modular and customizable pipeline, with a subjective evaluation framework for assessing the results. We evaluate our proposed pipeline quantitatively using various soccer datasets, in terms of complexity, runtime, and adherence to a pre-defined rule-set, as well as qualitatively through a user study, in terms of how end-users perceive the output thumbnails. Our results show that an automatic end-to-end system for the selection of thumbnails based on contextual relevance and visual quality can yield attractive highlight clips, and can be used in conjunction with existing soccer broadcast pipelines which require real-time operation.
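
To make the idea of a modular and customizable pipeline more concrete, the following is a minimal Python sketch of how such stages could be chained. It is not the authors' implementation: the `Frame` fields, the filter functions, the fallback behaviour, and the specific rules (skip logo frames, prefer close-ups with faces, rank by a quality score) are all assumptions made for illustration, and the detector outputs are stubbed as plain values rather than produced by real models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class Frame:
    """Hypothetical per-frame record; each field stands in for a detector output."""
    index: int
    has_logo: bool      # assumed result of a logo-detection stage
    is_closeup: bool    # assumed result of a close-up shot detection stage
    num_faces: int      # assumed result of a face-detection stage
    quality: float      # assumed quality score in [0, 1], higher is better

# A pipeline stage is any predicate that decides whether a frame is kept.
FrameFilter = Callable[[Frame], bool]

def no_logo(frame: Frame) -> bool:
    return not frame.has_logo

def closeup(frame: Frame) -> bool:
    return frame.is_closeup

def has_face(frame: Frame) -> bool:
    return frame.num_faces > 0

def select_thumbnail(
    frames: Sequence[Frame],
    filters: Sequence[FrameFilter],
) -> Optional[Frame]:
    """Apply each filter in order, then return the highest-quality survivor.

    If a filter would reject every remaining candidate, it is skipped so that
    a thumbnail can still be produced (an illustrative design choice).
    """
    candidates = list(frames)
    for keep in filters:
        filtered = [f for f in candidates if keep(f)]
        if filtered:
            candidates = filtered
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.quality)

if __name__ == "__main__":
    clip = [
        Frame(0, has_logo=True, is_closeup=False, num_faces=0, quality=0.90),
        Frame(1, has_logo=False, is_closeup=True, num_faces=1, quality=0.60),
        Frame(2, has_logo=False, is_closeup=True, num_faces=2, quality=0.80),
        Frame(3, has_logo=False, is_closeup=False, num_faces=0, quality=0.95),
    ]
    best = select_thumbnail(clip, [no_logo, closeup, has_face])
    print(best.index if best else None)
```

Because each stage is an ordinary predicate, stages can be reordered, removed, or replaced without touching `select_thumbnail`, which is one plausible way to read "modular and customizable" in the abstract.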

Supplementary Material

ZIP File (p73-husa-suppl.zip)

Supplemental material.

2.31 MB

Cited By

Dorcheh S, Sarkhoosh M, Midoglu C, Sabet S, Kupka T, Riegler M, Johansen D, Halvorsen P. (2024). AI-Based Cropping of Sport Videos Using SmartCrop. International Journal of Semantic Computing 18:04 (637-662). Online publication date: 27-Aug-2024.
https://doi.org/10.1142/S1793351X24450028
Dorcheh S, Sarkhoosh M, Midoglu C, Sabet S, Kupka T, Riegler M, Johansen D, Halvorsen P. (2023). SmartCrop: AI-Based Cropping of Soccer Videos. 2023 IEEE International Symposium on Multimedia (ISM) (20-27). Online publication date: 11-Dec-2023.
https://doi.org/10.1109/ISM59092.2023.00009
Midoglu C, Storas A, Sabet S, Hammou M, Hicks S, Strumke I, Riegler M, Griwodz C, Halvorsen P. (2022). Experiences and Lessons Learned from a Crowdsourced-Remote Hybrid User Survey Framework. 2022 IEEE International Symposium on Multimedia (ISM) (161-162). Online publication date: Dec-2022.
https://doi.org/10.1109/ISM55400.2022.00035

Index Terms

Automatic thumbnail selection for soccer videos using machine learning
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization
  2. Machine learning

Published In

MMSys '22: Proceedings of the 13th ACM Multimedia Systems Conference

June 2022

432 pages

ISBN: 9781450392839

DOI: 10.1145/3524273

General Chairs: Niall Murray (Technological University of the Shannon: Midlands Midwest), Gwendal Simon (Synamedia), Mylene Farias (University of Brasilia)

Program Chairs: Irene Viola (Centrum Wiskunde & Informatica), Mario Montagud (i2CAT Foundation & University of Valencia)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Research Council of Norway

Conference

MMSys '22: 13th ACM Multimedia Systems Conference

June 14 - 17, 2022

Athlone, Ireland

Acceptance Rates

Overall Acceptance Rate 176 of 530 submissions, 33%
