DOI: 10.1145/3152434.3152435
Research article

Harvesting Randomness to Optimize Distributed Systems

Published: 30 November 2017

Abstract

We view randomization through the lens of statistical machine learning: as a powerful resource for offline optimization. Cloud systems make randomized decisions all the time (e.g., in load balancing), yet this randomness is rarely used for optimization after the fact. By casting system decisions in the framework of reinforcement learning, we show how to collect data from existing systems, without modifying them, to evaluate new policies, without deploying them. Our methodology, called harvesting randomness, has the potential to accurately estimate a policy's performance without the risk or cost of deploying it on live traffic. We quantify this optimization power and apply it to a real machine health scenario in Azure Compute. We also apply it to two prototyped scenarios, for load balancing (Nginx) and caching (Redis), with much less success, and use them to identify the systems and machine learning challenges to achieving our goal.
Our long-term agenda is to harvest the randomness in distributed systems to develop non-invasive and efficient techniques for optimizing them. Like CPU cycles and bandwidth, we view randomness as a valuable resource being wasted by the cloud, and we seek to remedy this.
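The core idea the abstract describes, estimating a new policy's performance from decisions an existing system already randomized, is classically realized by inverse propensity scoring (IPS): each logged reward is reweighted by the probability with which the logging system chose that action. The sketch below is purely illustrative; the simulated load-balancing setup, function names, and policies are hypothetical and not from the paper.

```python
# Illustrative sketch of off-policy evaluation via inverse propensity
# scoring (IPS), the statistical idea behind "harvesting randomness".
import random

def collect_logs(n=10000, seed=0):
    """Simulate a system that picks one of two backends uniformly at
    random and logs (context, action, propensity, reward) tuples."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        load = rng.random()              # context: current load level
        action = rng.choice([0, 1])      # the system's randomized decision
        # In this toy world, backend 1 is better under high load.
        reward = 1.0 if (action == 1) == (load > 0.5) else 0.0
        logs.append((load, action, 0.5, reward))  # propensity = 0.5
    return logs

def ips_value(logs, policy):
    """Unbiased IPS estimate of a target policy's average reward,
    computed entirely from logged data -- no deployment needed."""
    total = 0.0
    for context, action, propensity, reward in logs:
        if policy(context) == action:    # target policy agrees with log
            total += reward / propensity # reweight by logging probability
    return total / len(logs)

logs = collect_logs()
always_0 = ips_value(logs, lambda load: 0)                    # naive policy
threshold = ips_value(logs, lambda load: 1 if load > 0.5 else 0)  # adaptive
```

Because the logging system randomized uniformly, every action a candidate policy might take appears in the logs with known probability, which is what makes the offline estimate unbiased; the `threshold` policy scores markedly higher than `always_0` here without either ever being deployed.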

Supplementary Material

MP4 File (lecuyer.mp4)


Cited By

  • (2020) Learning relaxed Belady for content distribution network caching. Proceedings of the 17th USENIX Conference on Networked Systems Design and Implementation, 529-544. DOI: 10.5555/3388242.3388281
  • (2020) Tools of Causal Inference: Review and Prospects. Control Systems and Computers, 52-63. DOI: 10.15407/csc.2020.05.052

Published In

HotNets '17: Proceedings of the 16th ACM Workshop on Hot Topics in Networks
November 2017
206 pages
ISBN:9781450355698
DOI:10.1145/3152434


Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HotNets-XVI: The 16th ACM Workshop on Hot Topics in Networks
November 30 - December 1, 2017
Palo Alto, CA, USA

Acceptance Rates

HotNets '17 paper acceptance rate: 28 of 124 submissions (23%).
Overall acceptance rate: 110 of 460 submissions (24%).
