DOI: 10.1145/3545008.3545018

DeepCAT: A Cost-Efficient Online Configuration Auto-Tuning Approach for Big Data Frameworks

Published: 13 January 2023

Abstract

To support different application scenarios, big data frameworks usually provide a large number of performance-related configuration parameters. Online auto-tuning of these parameters with deep reinforcement learning (DRL) has shown advantages over search-based and machine learning-based approaches. Unfortunately, conventional DRL-based methods still spend considerable time in the online tuning phase, especially for big data applications. Therefore, in this paper, we propose DeepCAT, a cost-efficient DRL-based approach to online configuration auto-tuning for big data frameworks. To reduce the total online tuning cost: 1) DeepCAT uses the TD3 algorithm instead of DDPG to alleviate value overestimation; 2) DeepCAT modifies conventional experience replay to fully utilize rare but valuable transitions via a novel reward-driven prioritized experience replay mechanism; 3) DeepCAT designs a Twin-Q Optimizer that estimates the execution time of each action without the costly configuration evaluation and optimizes the sub-optimal actions to achieve a low-cost exploration-exploitation trade-off. Experimental results on a local 3-node Spark cluster with HiBench benchmark applications show that DeepCAT improves the best execution time by an average factor of 1.45× over CDBTune and 1.65× over OtterTune, while consuming up to 50.08% and 53.39% less total tuning time, respectively.
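
To illustrate point 1, the following is a minimal sketch of the clipped double-Q target that the standard TD3 formulation uses to curb value overestimation, set beside a single-critic target of the kind DDPG uses. It is not code from the paper: the function names, the scalar inputs, and the default discount factor are assumptions made for illustration, and a real agent would obtain the Q-values from target critic networks.

```python
def single_q_target(reward: float, next_q: float,
                    gamma: float = 0.99, done: bool = False) -> float:
    """DDPG-style target: bootstrap from one target critic's estimate."""
    return reward + (0.0 if done else gamma * next_q)


def clipped_double_q_target(reward: float, next_q1: float, next_q2: float,
                            gamma: float = 0.99, done: bool = False) -> float:
    """TD3-style target: bootstrap from the smaller of two target critics.

    Taking the minimum keeps one critic's optimistic error from
    propagating into the learning target.
    """
    return reward + (0.0 if done else gamma * min(next_q1, next_q2))


if __name__ == "__main__":
    # Hypothetical values: critic 1 is more optimistic than critic 2.
    reward, q1, q2 = 1.0, 10.0, 8.0
    print(single_q_target(reward, q1, gamma=0.5))
    print(clipped_double_q_target(reward, q1, q2, gamma=0.5))
```

With these hypothetical inputs, the clipped target bootstraps from the lower estimate (8.0 rather than 10.0), so it can never exceed the single-critic target built from the more optimistic critic.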

Cited By

  • (2024) FaaSConf: QoS-aware Hybrid Resources Configuration for Serverless Workflows. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 957-969. DOI: 10.1145/3691620.3695477. Online publication date: 27-Oct-2024
  • (2024) CTuner: Automatic NoSQL Database Tuning with Causal Reinforcement Learning. Proceedings of the 15th Asia-Pacific Symposium on Internetware, 269-278. DOI: 10.1145/3671016.3674809. Online publication date: 24-Jul-2024
  • (2024) DeepCAT+: A Low-Cost and Transferrable Online Configuration Auto-Tuning Approach for Big Data Frameworks. IEEE Transactions on Parallel and Distributed Systems 35:11, 2114-2131. DOI: 10.1109/TPDS.2024.3459889. Online publication date: 1-Nov-2024

Published In

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing
August 2022
976 pages
ISBN: 9781450397339
DOI: 10.1145/3545008

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big Data Framework
  2. Deep Reinforcement Learning
  3. Online Configuration Tuning
  4. Performance Optimization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP '22: 51st International Conference on Parallel Processing
August 29 - September 1, 2022
Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Article Metrics

  • Downloads (last 12 months): 73
  • Downloads (last 6 weeks): 7
Reflects downloads up to 01 Nov 2024
