DOI: 10.1145/3588444.3591012
research-article
Open access

Transcoding Quality Prediction for Adaptive Video Streaming

Published: 16 June 2023

Abstract

In recent years, the proliferation of video streaming applications has increased the demand for Video Quality Assessment (VQA). Reduced-reference video quality assessment (RR-VQA) is a category of VQA in which only certain features of the original video (e.g., texture, edges) are available for quality assessment. It is an active research area with applications in social media, online games, and video streaming. This paper introduces a reduced-reference Transcoding Quality Prediction Model (TQPM) that determines the visual quality score of a video that may have been transcoded in multiple stages. The quality is predicted from Discrete Cosine Transform (DCT)-energy-based features of the video (i.e., its brightness, spatial texture information, and temporal activity) and the target bitrate representation of each transcoding stage. To this end, the problem is formulated and a Long Short-Term Memory (LSTM)-based quality prediction model is presented. Experimental results show that, on average, TQPM yields PSNR, SSIM, and VMAF predictions with R² scores of 0.83, 0.85, and 0.87, respectively, and Mean Absolute Errors (MAE) of 1.31 dB, 1.19 dB, and 3.01, respectively, for single-stage transcoding. For two-stage transcoding, the R² scores are 0.84, 0.86, and 0.91, and the MAEs are 1.32 dB, 1.33 dB, and 3.25, respectively. Moreover, the average processing time of TQPM for 4 s segments is 0.328 s, making it a practical VQA method for online streaming applications.
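
The abstract reports prediction accuracy as R² scores and Mean Absolute Error. As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from measured and predicted quality scores. It is not the authors' evaluation code: the function names and the sample VMAF values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted scores."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: spread of the measured scores around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured scores are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical VMAF scores for four segments (not data from the paper).
measured_vmaf = [90.0, 80.0, 70.0, 60.0]
predicted_vmaf = [88.0, 82.0, 71.0, 59.0]

mae = mean_absolute_error(measured_vmaf, predicted_vmaf)
r2 = r2_score(measured_vmaf, predicted_vmaf)
print(f"MAE = {mae:.2f}, R2 = {r2:.2f}")
```

An R² close to 1 means the predictions explain most of the variance in the measured scores, while the MAE is expressed in the unit of the metric itself (dB for PSNR, points for VMAF).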


    Published In

    MHV '23: Proceedings of the 2nd Mile-High Video Conference
    May 2023
    176 pages
    ISBN:9798400701603
    DOI:10.1145/3588444
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. video quality assessment
    2. reduced reference
    3. transcoding
    4. VMAF prediction
    5. video streaming

    Qualifiers

    • Research-article

    Conference

    MHV '23: 2nd Mile-High Video Conference
    May 7 - 10, 2023
    Denver, CO, USA
