Self-adaptive Executors for Big Data Processing

Authors:

Sobhan Omranian Khorasani,

Jan S. Rellermeyer,

Dick Epema

Middleware '19: Proceedings of the 20th International Middleware Conference

Pages 176 - 188

https://doi.org/10.1145/3361525.3361545

Published: 09 December 2019

Abstract

The demand for additional performance, driven by the rapid growth in the size and importance of data-intensive applications, has considerably increased the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and expose a large number of configuration parameters that operators can adjust to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and present a simple static solution that reduces the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors, which continuously monitor the underlying system resources and detect contention. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see a 34% and 54% reduction in runtime, respectively.
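
The idea of an executor resizing its thread pool in response to observed I/O contention can be illustrated with a small sketch. This is not the algorithm from the paper, whose details the abstract does not give. It is a minimal feedback rule under stated assumptions: the class name `AdaptivePoolSizer`, the `io_wait_threshold` cut-off, and the halve-on-contention, grow-by-one policy are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePoolSizer:
    """Hypothetical controller: shrink the pool under I/O contention, grow it otherwise."""

    min_threads: int = 1
    max_threads: int = 32
    io_wait_threshold: float = 0.3  # assumed cut-off: fraction of time spent waiting on I/O
    size: int = 32                  # current thread pool size

    def update(self, io_wait_fraction: float) -> int:
        """Consume one monitoring sample and return the new pool size."""
        if io_wait_fraction > self.io_wait_threshold:
            # Contention detected: back off multiplicatively.
            self.size = max(self.min_threads, self.size // 2)
        else:
            # No contention observed: probe upward one thread at a time.
            self.size = min(self.max_threads, self.size + 1)
        return self.size


def simulate(samples, sizer=None):
    """Feed a sequence of I/O-wait samples to a sizer and record each resulting size."""
    sizer = sizer or AdaptivePoolSizer()
    return [sizer.update(sample) for sample in samples]


if __name__ == "__main__":
    # Two contended intervals followed by two quiet ones.
    print(simulate([0.5, 0.5, 0.1, 0.1]))
```

In a real executor the sample would come from system monitoring (for example, an I/O-wait or disk-utilization metric) and the returned size would be applied to the actual thread pool. Both of those steps are left out here because they depend on the runtime being tuned.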

Index Terms

Self-adaptive Executors for Big Data Processing
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Multithreading
    2. Extra-functional properties
      1. Software performance

Published In

Middleware '19: Proceedings of the 20th International Middleware Conference

December 2019

342 pages

ISBN:9781450370097

DOI:10.1145/3361525

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

Middleware '19

Sponsor:

ACM

Middleware '19: 20th International Middleware Conference

December 9 - 13, 2019

Davis, CA, USA

Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%
