
Resource Management with Deep Reinforcement Learning

Published: 09 November 2016
DOI: 10.1145/3005745.3005750
Abstract

    Resource management problems in systems and networking often manifest as difficult online decision making tasks where appropriate solutions depend on understanding the workload and environment. Inspired by recent advances in deep reinforcement learning for AI problems, we consider building systems that learn to manage resources directly from experience. We present DeepRM, an example solution that translates the problem of packing tasks with multiple resource demands into a learning problem. Our initial results show that DeepRM performs comparably to state-of-the-art heuristics, adapts to different conditions, converges quickly, and learns strategies that are sensible in hindsight.
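To ground the abstract's framing, here is a minimal policy-gradient (REINFORCE) sketch of the general technique it describes: an agent that learns, from reward alone, to pack jobs with multiple resource demands. Everything below is an illustrative assumption rather than the authors' DeepRM implementation: the ToyCluster environment, the flattened state encoding, the waiting-jobs reward, and the linear softmax policy (a stand-in for a deep network) are all hypothetical.

```python
# Illustrative toy only -- NOT the paper's DeepRM code. The environment,
# state encoding, reward, and policy here are assumptions chosen to show
# the technique: policy-gradient RL applied to multi-resource packing.
import numpy as np

rng = np.random.default_rng(0)

N_RES = 2                       # resource types (e.g. CPU, memory)
N_SLOTS = 5                     # pending-job slots visible to the agent
N_ACTIONS = N_SLOTS + 1         # schedule one slot, or do nothing
STATE_DIM = N_RES + N_SLOTS * N_RES


def softmax(z):
    z = z - z.max()             # numerical stability
    e = np.exp(z)
    return e / e.sum()


class ToyCluster:
    """One machine; each pending job demands a vector of resources."""

    def reset(self):
        self.free = np.ones(N_RES)                            # free capacity
        self.jobs = rng.uniform(0.1, 0.5, (N_SLOTS, N_RES))   # job demands
        self.done = np.zeros(N_SLOTS, dtype=bool)
        return self.obs()

    def obs(self):
        # State: free capacity plus the demand vector of every unscheduled job.
        pending = np.where(self.done[:, None], 0.0, self.jobs)
        return np.concatenate([self.free, pending.ravel()])

    def step(self, a):
        # Place job `a` if it is a real slot, still pending, and fits.
        if a < N_SLOTS and not self.done[a] and np.all(self.jobs[a] <= self.free):
            self.free -= self.jobs[a]
            self.done[a] = True
        reward = -float((~self.done).sum())   # penalize jobs left waiting
        return self.obs(), reward, bool(self.done.all())


W = np.zeros((N_ACTIONS, STATE_DIM))   # linear softmax policy parameters


def run_episode(max_steps=20):
    env, traj = ToyCluster(), []
    s = env.reset()
    for _ in range(max_steps):
        probs = softmax(W @ s)
        a = rng.choice(N_ACTIONS, p=probs)   # sample action from the policy
        s_next, r, done = env.step(a)
        traj.append((s, a, r))
        s = s_next
        if done:
            break
    return traj


# REINFORCE: raise the log-probability of each action in proportion to how
# much its discounted return exceeded a simple per-episode baseline.
gamma, lr = 0.99, 0.01
for _ in range(300):
    traj = run_episode()
    returns, G = [], 0.0
    for _, _, r in reversed(traj):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()
    baseline = np.mean(returns)
    for (s, a, _), G in zip(traj, returns):
        probs = softmax(W @ s)
        grad = -np.outer(probs, s)      # d/dW of log softmax: (e_a - probs) s^T
        grad[a] += s
        W += lr * (G - baseline) * grad
```

DeepRM itself replaces this linear policy with a deep neural network over a richer view of cluster state, as the abstract indicates; the toy above only mirrors the overall shape of learning a packing policy directly from experience.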





    Published In

    HotNets '16: Proceedings of the 15th ACM Workshop on Hot Topics in Networks
    November 2016
    217 pages
ISBN: 9781450346610
DOI: 10.1145/3005745
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Qualifiers

    • Research-article

    Conference

    HotNets-XV

    Acceptance Rates

HotNets '16 Paper Acceptance Rate: 30 of 108 submissions, 28%
Overall Acceptance Rate: 110 of 460 submissions, 24%


    Article Metrics

• Downloads (last 12 months): 785
• Downloads (last 6 weeks): 58
Reflects downloads up to 27 Jul 2024


    Cited By

• (2024) "Artificial Intelligence in IoT Security: Review of Advancements, Challenges, and Future Directions". International Journal of Innovative Technology and Exploring Engineering, 13(7), 14-20. DOI: 10.35940/ijitee.G9911.13070624. Online publication date: 30-Jun-2024.
• (2024) "Solving Combinatorial Optimization Problems with Deep Neural Network: A Survey". Tsinghua Science and Technology, 29(5), 1266-1282. DOI: 10.26599/TST.2023.9010076. Online publication date: Oct-2024.
• (2024) "DeepCTS: A Deep Reinforcement Learning Approach for AI Container Task Scheduling". Proceedings of the 2024 3rd Asia Conference on Algorithms, Computing and Machine Learning, 342-347. DOI: 10.1145/3654823.3654885. Online publication date: 22-Mar-2024.
• (2024) "Trustworthy and Efficient Digital Twins in Post-Quantum Era with Hybrid Hardware-Assisted Signatures". ACM Transactions on Multimedia Computing, Communications, and Applications, 20(6), 1-30. DOI: 10.1145/3638250. Online publication date: 8-Mar-2024.
• (2024) "SPRING: Improving the Throughput of Sharding Blockchain via Deep Reinforcement Learning Based State Placement". Proceedings of the ACM on Web Conference 2024, 2836-2846. DOI: 10.1145/3589334.3645386. Online publication date: 13-May-2024.
• (2024) "Autonomous On-Device Protocols: Empowering Wireless with Self-Driven Capabilities". 2024 IEEE Wireless Communications and Networking Conference (WCNC), 1-6. DOI: 10.1109/WCNC57260.2024.10571037. Online publication date: 21-Apr-2024.
• (2024) "Privacy-Preserving Deployment Mechanism for Service Function Chains Across Multiple Domains". IEEE Transactions on Network and Service Management, 21(1), 1241-1256. DOI: 10.1109/TNSM.2023.3311587. Online publication date: Feb-2024.
• (2024) "Multi-Agent Deep Reinforcement Learning for Coordinated Multipoint in Mobile Networks". IEEE Transactions on Network and Service Management, 21(1), 908-924. DOI: 10.1109/TNSM.2023.3300962. Online publication date: Feb-2024.
• (2024) "Towards Dynamic Request Updating With Elastic Scheduling for Multi-Tenant Cloud-Based Data Center Network". IEEE Transactions on Network Science and Engineering, 11(2), 2223-2237. DOI: 10.1109/TNSE.2023.3341907. Online publication date: Mar-2024.
• (2024) "Understanding via Exploration: Discovery of Interpretable Features With Deep Reinforcement Learning". IEEE Transactions on Neural Networks and Learning Systems, 35(2), 1696-1707. DOI: 10.1109/TNNLS.2022.3184956. Online publication date: Feb-2024.
