DOI: 10.1145/3361525.3361545
research-article

Self-adaptive Executors for Big Data Processing

Published: 09 December 2019

Abstract

The rapid growth in the size and importance of data-intensive applications has driven a demand for additional performance, which has considerably increased the complexity of computer architecture. In response, systems offer predetermined behaviors based on heuristics and expose a large number of configuration parameters so that operators can adjust them to their particular infrastructure. In practice, this leads to substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on workload runtime and present a simple static solution that reduces the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors, which continuously monitor the underlying system resources and detect contention. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see 34% and 54% reductions in runtime, respectively.
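The abstract does not spell out the adaptation policy, so the following is only a minimal sketch of the general idea of resizing a thread pool in response to observed I/O contention. It assumes an additive-increase, multiplicative-decrease rule, which is a stand-in chosen for illustration and not the paper's algorithm. The class `AdaptivePoolSizer`, the helper `simulate`, the 0.30 I/O-wait threshold, and the pool bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePoolSizer:
    """Hypothetical controller that picks a thread pool size from I/O-wait samples."""

    min_threads: int = 1
    max_threads: int = 32
    size: int = 32
    # Assumed threshold: fraction of time spent waiting on I/O above
    # which the storage device is treated as contended.
    iowait_threshold: float = 0.30

    def update(self, iowait_fraction: float) -> int:
        """Return the new pool size for the latest I/O-wait sample (0.0 to 1.0)."""
        if not 0.0 <= iowait_fraction <= 1.0:
            raise ValueError("iowait_fraction must be between 0 and 1")
        if iowait_fraction > self.iowait_threshold:
            # Contention detected: halve the pool, never below the minimum.
            self.size = max(self.min_threads, self.size // 2)
        else:
            # No contention: add one thread, never above the maximum.
            self.size = min(self.max_threads, self.size + 1)
        return self.size


def simulate(samples, sizer=None):
    """Feed a sequence of I/O-wait samples to a sizer and collect the chosen sizes."""
    if sizer is None:
        sizer = AdaptivePoolSizer()
    return [sizer.update(sample) for sample in samples]
```

In a real executor the samples would come from system monitoring and the returned size would be applied to the worker pool. Both are left out here so the policy itself stays easy to inspect.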

Published In

Middleware '19: Proceedings of the 20th International Middleware Conference
December 2019
342 pages
ISBN:9781450370097
DOI:10.1145/3361525
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Apache Spark
  2. Big Data
  3. Self-Adaptive Executors

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Middleware '19
Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 27 Jan 2025

Cited By

  • (2024) An Enhanced Physical-Locality Deduplication System for Space Efficiency. Journal of Computer Science and Technology 39:6, pp. 1361-1379. DOI: 10.1007/s11390-023-2646-7. Online publication date: 1-Nov-2024
  • (2022) ShadowSync. Proceedings of the 23rd ACM/IFIP International Middleware Conference, pp. 281-294. DOI: 10.1145/3528535.3565251. Online publication date: 7-Nov-2022
  • (2022) A Theoretical Approach to Determine the Optimal Size of a Thread Pool for Real-Time Systems. 2022 IEEE Real-Time Systems Symposium (RTSS), pp. 66-78. DOI: 10.1109/RTSS55097.2022.00016. Online publication date: Dec-2022
  • (2021) Courier: Real-Time Optimal Batch Size Prediction for Latency SLOs in BigDL. Proceedings of the ACM/SPEC International Conference on Performance Engineering, pp. 133-144. DOI: 10.1145/3427921.3450233. Online publication date: 9-Apr-2021
  • (2021) WIRE: Resource-efficient Scaling with Online Prediction for DAG-based Workflows. 2021 IEEE International Conference on Cluster Computing (CLUSTER), pp. 35-46. DOI: 10.1109/Cluster48925.2021.00025. Online publication date: Sep-2021
  • (2020) Improving the Restore Performance via Physical-Locality Middleware for Backup Systems. Proceedings of the 21st International Middleware Conference, pp. 341-355. DOI: 10.1145/3423211.3425691. Online publication date: 7-Dec-2020
