research-article

Karma: Adaptive Video Streaming via Causal Sequence Modeling

Authors:

Zhan Ma

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 1527 - 1535

https://doi.org/10.1145/3581783.3612177

Published: 27 October 2023

Abstract

Optimal adaptive bitrate (ABR) decisions depend on a comprehensive characterization of state transitions that involve interrelated modalities over time, including environmental observations, returns, and actions. However, state-of-the-art learning-based ABR algorithms rely solely on past observations to decide the next action. This paradigm tends to cause a chain of deviations from the optimal action when encountering unfamiliar observations, which in turn undermines model generalization.

This paper presents Karma, an ABR algorithm that uses causal sequence modeling to improve generalization by capturing the causal relationships among past observations, returns, and actions, and by promptly refining its actions when deviations occur. Unlike direct observation-to-action mapping, Karma recurrently maintains a multi-dimensional time series of observations, returns, and actions as input and employs causal sequence modeling via a decision transformer to determine the next action. In the input sequence, Karma uses the maximum cumulative future quality of experience (QoE), a.k.a. QoE-to-go, as an extended return signal, which is periodically estimated from current network conditions and playback status. We evaluate Karma through trace-driven simulations and real-world field tests, demonstrating superior performance compared to existing state-of-the-art ABR algorithms, with an average QoE improvement ranging from 10.8% to 18.7% across diverse network conditions. Furthermore, Karma exhibits strong generalization, achieving leading performance under unseen network conditions in both simulations and real-world tests.
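To make the input format concrete, the following is a minimal sketch of how a rolling sequence of returns, observations, and actions might be assembled before being passed to a sequence model. It is an illustration under stated assumptions, not Karma's implementation: the class `SequenceContext`, the two-element observation vector, the window length, and the (return, observation, action) ordering are all hypothetical, with the ordering borrowed from the usual decision transformer layout. The QoE-to-go values are supplied by the caller, since the abstract does not describe how they are estimated.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

# Hypothetical types: an observation is a small feature vector
# (for example throughput and buffer level); an action is a bitrate index.
Observation = Tuple[float, ...]
Step = Tuple[float, Observation, Optional[int]]  # (QoE-to-go, observation, action)


class SequenceContext:
    """Rolling window of the last `max_len` (return, observation, action) steps."""

    def __init__(self, max_len: int) -> None:
        # A bounded deque discards the oldest step once the window is full.
        self.steps: Deque[Step] = deque(maxlen=max_len)

    def observe(self, qoe_to_go: float, obs: Observation) -> None:
        # Open a new step; its action is unknown until the model decides.
        self.steps.append((qoe_to_go, obs, None))

    def act(self, action: int) -> None:
        # Record the chosen action on the most recent step.
        qoe_to_go, obs, _ = self.steps[-1]
        self.steps[-1] = (qoe_to_go, obs, action)

    def tokens(self) -> List[Tuple[str, object]]:
        # Interleave modalities in time order: R_1, o_1, a_1, R_2, o_2, ...
        out: List[Tuple[str, object]] = []
        for qoe_to_go, obs, action in self.steps:
            out.append(("return", qoe_to_go))
            out.append(("obs", obs))
            if action is not None:
                out.append(("action", action))
        return out


ctx = SequenceContext(max_len=2)
ctx.observe(10.0, (1.5, 8.0))
ctx.act(3)
ctx.observe(9.2, (1.1, 6.5))
ctx.act(2)
ctx.observe(8.7, (0.9, 5.0))  # the oldest step falls out of the window
sequence = ctx.tokens()
```

In this sketch the sequence always ends with the latest observation, so a causal model conditioned on it would be asked to predict the next action token. Because earlier returns and actions stay in the window, a later decision can take into account how previous choices turned out, which is the property the abstract contrasts with direct observation-to-action mapping.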

Supplemental Material

MP4 File (198.97 MB)

The presentation video of "Karma: Adaptive Video Streaming via Causal Sequence Modeling".

Index Terms

1. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia streaming

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE)
Tao Mei (HiDream.ai, China)
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy)
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia)
Pradeep K. Atrey (University at Albany, State University of New York, USA)
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Jiangsu Provincial Double-Innovation Doctor Program
National Natural Science Foundation of China

Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

Total Citations: 0
Total Downloads: 194
Downloads (Last 12 months): 129
Downloads (Last 6 weeks): 12

Reflects downloads up to 26 Jan 2025