DOI: 10.1145/3625468.3653068
Research Article · Open Access

ACM MMSys 2024 Bandwidth Estimation in Real Time Communications Challenge

Published: 17 April 2024

Abstract

The quality of experience (QoE) that video conferencing systems deliver to end users depends in part on correctly estimating, over time, the capacity of the bottleneck link between the sender and the receiver. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to continuously evolving, heterogeneous network architectures and technologies. From the first bandwidth estimation challenge, hosted at ACM MMSys 2021, we learned that bandwidth estimation models trained with reinforcement learning (RL) in simulation to maximize network-based reward functions may be suboptimal in practice, owing to the sim-to-real gap and the difficulty of aligning network-based rewards with user-perceived QoE. This grand challenge aims to advance bandwidth estimation model design by aligning reward maximization with user-perceived QoE optimization, using offline RL and a real-world dataset whose objective rewards correlate highly with subjective audio/video quality in Microsoft Teams. All models submitted to the grand challenge underwent initial evaluation on our emulation platform. For a comprehensive evaluation under diverse, temporally fluctuating network conditions, the top models were further evaluated on our geographically distributed testbed, with each model conducting 600 calls over a 12-day period. The winning model delivers performance comparable to the top behavior policy in the released dataset. By leveraging real-world data and integrating objective audio/video quality scores as rewards, offline RL can therefore facilitate the development of competitive bandwidth estimators for RTC.
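To make the training recipe concrete, here is a minimal offline-RL sketch in the spirit the abstract describes: fit a bandwidth-estimation policy to logged (observation, estimate, reward) tuples, weighting each sample by an exponentiated advantage so that behavior associated with high objective audio/video quality dominates the fit (advantage-weighted regression). This is an illustrative sketch, not the challenge's reference implementation: the tensor shapes, the synthetic data, the equal audio/video weighting, and the temperature are all assumptions rather than the released dataset's schema.

```python
# Illustrative offline-RL sketch (advantage-weighted regression), NOT the
# challenge's reference code. All shapes and data below are synthetic
# placeholders for the released dataset's logged calls.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Maps a network-observation vector to a (log-scale) bandwidth estimate."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs).squeeze(-1)

# Synthetic stand-ins: per-step network features, the behavior policy's
# logged bandwidth estimates, and objective quality scores used as reward.
obs = torch.randn(4096, 64)
log_bw = torch.randn(4096)
audio_q = torch.rand(4096) * 4 + 1      # assumed objective audio score in [1, 5]
video_q = torch.rand(4096) * 4 + 1      # assumed objective video score in [1, 5]
reward = 0.5 * audio_q + 0.5 * video_q  # assumed equal audio/video weighting

policy = Policy(obs_dim=64)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Exponentiated-advantage weights: samples from calls that scored
# above-average QoE pull the regression harder; temperature 0.5 is assumed.
weights = torch.exp((reward - reward.mean()) / 0.5).clamp(max=20.0)

for step in range(500):
    pred = policy(obs)
    loss = (weights * (pred - log_bw) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

A full submission would swap the synthetic tensors for the released Teams dataset and could adopt stronger offline-RL objectives (e.g., conservative Q-learning or TD3+BC-style constraints); the sketch only illustrates the core idea of weighting logged behavior by QoE-aligned rewards.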


Published In

    MMSys '24: Proceedings of the 15th ACM Multimedia Systems Conference
    April 2024
    557 pages
    ISBN: 9798400704123
    DOI: 10.1145/3625468

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Bandwidth estimation
    2. offline reinforcement learning
    3. real-time communication

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    MMSys '24

    Acceptance Rates

    Overall Acceptance Rate 176 of 530 submissions, 33%
