DOI: 10.1145/3524273.3528182
research-article
Open access

Automatic thumbnail selection for soccer videos using machine learning

Published: 05 August 2022

Abstract

Thumbnail selection is an important aspect of online sports video presentation: thumbnails capture the essence of important events, engage viewers, and make video clips attractive to watch. Traditional solutions in the soccer domain for presenting highlight clips of important events such as goals, substitutions, and cards rely on manual or static selection of thumbnails. Such approaches can select sub-optimal video frames as snapshots, which degrades the quality of the video clip as perceived by viewers and consequently decreases viewership; manual processes are also expensive and time-consuming. In this paper, we present an automatic thumbnail selection system for soccer videos that uses machine learning to deliver representative thumbnails with high relevance to the video content and high visual quality in near real-time. The proposed system combines a software framework, which integrates logo detection, close-up shot detection, face detection, and image quality analysis into a modular and customizable pipeline, with a subjective evaluation framework for assessing the results. We evaluate the pipeline quantitatively on various soccer datasets in terms of complexity, runtime, and adherence to a pre-defined rule-set, and qualitatively through a user study of how end-users perceive the output thumbnails. Our results show that an automatic end-to-end system that selects thumbnails based on contextual relevance and visual quality can yield attractive highlight clips and can be used in conjunction with existing soccer broadcast pipelines that require real-time operation.
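
The modular pipeline named in the abstract (logo detection, close-up shot detection, face detection, image quality analysis) could be organized roughly as in the sketch below. This is only an illustration and not the authors' implementation: the `Frame` fields, the stage functions, the stage order, the "skip a stage that rejects everything" fallback, and the rule of picking the highest quality score are all assumptions made for the example. Real detectors are replaced by precomputed, hypothetical values so that the code runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Frame:
    """A candidate frame with hypothetical, precomputed detector outputs."""
    index: int
    has_logo: bool     # assumed output of a logo detector
    is_closeup: bool   # assumed output of a close-up shot detector
    num_faces: int     # assumed output of a face detector
    quality: float     # assumed image quality score, higher is better


# A stage takes the current candidates and returns the ones it keeps.
Stage = Callable[[List[Frame]], List[Frame]]


def drop_logo_frames(frames: List[Frame]) -> List[Frame]:
    return [f for f in frames if not f.has_logo]


def keep_closeups(frames: List[Frame]) -> List[Frame]:
    return [f for f in frames if f.is_closeup]


def keep_frames_with_faces(frames: List[Frame]) -> List[Frame]:
    return [f for f in frames if f.num_faces > 0]


def select_thumbnail(frames: List[Frame], stages: List[Stage]) -> Optional[Frame]:
    """Run each stage in order, then return the highest-quality survivor.

    A stage that would remove every candidate is skipped, so the pipeline
    still returns a frame (a fallback assumed for this sketch).
    """
    candidates = list(frames)
    for stage in stages:
        kept = stage(candidates)
        if kept:
            candidates = kept
    if not candidates:
        return None
    return max(candidates, key=lambda f: f.quality)


# Made-up example data: four candidate frames from one highlight clip.
frames = [
    Frame(0, has_logo=True, is_closeup=False, num_faces=0, quality=0.90),
    Frame(1, has_logo=False, is_closeup=True, num_faces=1, quality=0.60),
    Frame(2, has_logo=False, is_closeup=True, num_faces=2, quality=0.80),
    Frame(3, has_logo=False, is_closeup=False, num_faces=0, quality=0.95),
]
stages = [drop_logo_frames, keep_closeups, keep_frames_with_faces]
best = select_thumbnail(frames, stages)
print(best.index)
```

Because every stage has the same signature, stages can be reordered, removed, or swapped for different models by editing the `stages` list, which is one plausible reading of "modular and customizable".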

Supplementary Material

ZIP File (p73-husa-suppl.zip)
Supplemental material.


Cited By

  • (2024) AI-Based Cropping of Sport Videos Using SmartCrop. International Journal of Semantic Computing 18:04 (637-662). DOI: 10.1142/S1793351X24450028. Online publication date: 27-Aug-2024
  • (2023) SmartCrop: AI-Based Cropping of Soccer Videos. 2023 IEEE International Symposium on Multimedia (ISM) (20-27). DOI: 10.1109/ISM59092.2023.00009. Online publication date: 11-Dec-2023
  • (2022) Experiences and Lessons Learned from a Crowdsourced-Remote Hybrid User Survey Framework. 2022 IEEE International Symposium on Multimedia (ISM) (161-162). DOI: 10.1109/ISM55400.2022.00035. Online publication date: Dec-2022

      Published In

      MMSys '22: Proceedings of the 13th ACM Multimedia Systems Conference
      June 2022
      432 pages
      ISBN:9781450392839
      DOI:10.1145/3524273
      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. blur detection
      2. deep learning
      3. image quality
      4. logo detection
      5. object detection
      6. shot boundary detection
      7. soccer
      8. sports analysis
      9. thumbnail generation
      10. user survey
      11. video

      Qualifiers

      • Research-article

      Funding Sources

      • Research Council of Norway

      Conference

      MMSys '22
      MMSys '22: 13th ACM Multimedia Systems Conference
      June 14 - 17, 2022
      Athlone, Ireland

      Acceptance Rates

      Overall Acceptance Rate 176 of 530 submissions, 33%
