research-article

A Visual Sensitivity Aware ABR Algorithm for DASH via Deep Reinforcement Learning

Published: 10 November 2023

Abstract

To cope with fluctuating network bandwidth and provide smooth video services, adaptive video streaming technology has been proposed. In particular, adaptive bitrate (ABR) algorithms are widely used in dynamic adaptive streaming over HTTP (DASH) to improve quality of experience (QoE). However, existing ABR algorithms ignore the inherent visual sensitivity of the human visual system (HVS). As the final receiver of video, the HVS differs in its sensitivity to quality distortion across video content, and content with high visual sensitivity should be allocated more bitrate resources. Existing ABR algorithms are therefore limited in how reasonably they allocate bitrate and how well they maximize QoE. To solve this problem, this paper designs an adaptive bitrate strategy from the perspective of user vision, studies the modeling of visual sensitivity, and proposes a visual sensitivity aware ABR algorithm. We extract a set of content features and attribute features from the video and simulate the HVS to establish a total masking effect model that reflects visual sensitivity more accurately. Network status, buffer occupancy, and visual sensitivity are then jointly considered under a deep reinforcement learning framework to select the bitrate that maximizes QoE. We implement the proposed algorithm in a realistic trace-driven evaluation and compare its performance with several recent algorithms. Experimental results show that our algorithm aligns the ABR strategy with visual sensitivity to achieve better QoE on content with high visual sensitivity, and improves the average perceptual video quality and overall user QoE by 18.3% and 22.8%, respectively. Additionally, we demonstrate the feasibility of our algorithm through subjective evaluation in a real environment.
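
The abstract does not give the reward used by the learning agent, so the following is only a minimal sketch of how visual sensitivity could enter a QoE objective. It assumes a linear per-chunk QoE of the kind commonly used in ABR work (quality minus rebuffering and smoothness penalties) and scales the quality term by a per-chunk sensitivity weight. The function name `sensitivity_weighted_qoe`, its parameters, and the default penalty coefficients are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence


def sensitivity_weighted_qoe(
    qualities: Sequence[float],
    sensitivities: Sequence[float],
    rebuffer_seconds: Sequence[float],
    rebuffer_penalty: float = 4.0,  # assumed coefficient, not from the paper
    smooth_penalty: float = 1.0,    # assumed coefficient, not from the paper
) -> float:
    """Sum illustrative per-chunk rewards over one playback session.

    qualities:        perceptual quality of each downloaded chunk
    sensitivities:    assumed visual sensitivity weight of each chunk
    rebuffer_seconds: stall time incurred while fetching each chunk
    """
    if not (len(qualities) == len(sensitivities) == len(rebuffer_seconds)):
        raise ValueError("all inputs must have one entry per chunk")

    total = 0.0
    previous_quality: Optional[float] = None
    for quality, weight, stall in zip(qualities, sensitivities, rebuffer_seconds):
        # Quality counts for more on chunks with high visual sensitivity.
        reward = weight * quality - rebuffer_penalty * stall
        if previous_quality is not None:
            # Penalize quality switches between consecutive chunks.
            reward -= smooth_penalty * abs(quality - previous_quality)
        total += reward
        previous_quality = quality
    return total
```

Under this assumed objective, spending the higher quality level on the more sensitive chunk scores higher than the reverse allocation, which is the behavior the abstract describes as aligning the ABR strategy with visual sensitivity.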



    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 20, Issue 3
    March 2024
    665 pages
    EISSN: 1551-6865
    DOI: 10.1145/3613614
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 10 November 2023
    Online AM: 29 June 2023
    Accepted: 26 March 2023
    Revised: 28 December 2022
    Received: 16 April 2022
    Published in TOMM Volume 20, Issue 3

    Author Tags

    1. ABR
    2. DASH
    3. QoE
    4. visual sensitivity
    5. deep reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    • Project of End to End Transmission Theory and Key Technologies Ensuring Deterministic Delay
    • Research on Load Balancing Mechanism for Heterogeneous Traffic in Data Center Network
    • Key Project of Guangxi Science & Technology
    • Ministry of Education, Singapore, under its Academic Research Fund Tier 2
    • National Research Foundation, Singapore and Infocomm Media Development Authority under its Future Communications Research & Development Programme

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 279
    • Downloads (Last 12 months): 176
    • Downloads (Last 6 weeks): 6
    Reflects downloads up to 10 Nov 2024
