DOI: 10.1145/3173162.3173187
research-article

Datasize-Aware High Dimensional Configurations Auto-Tuning of In-Memory Cluster Computing

Published: 19 March 2018

Abstract

In-Memory cluster Computing (IMC) frameworks (e.g., Spark) have become increasingly important because they typically achieve more than 10× speedups over traditional On-Disk cluster Computing (ODC) frameworks for iterative and interactive applications. Like ODC frameworks, IMC frameworks typically run the same programs repeatedly on a given cluster, with a similar input dataset size each time. Building a performance model for an IMC program is challenging for two reasons: 1) the performance of IMC programs is more sensitive to the size of the input dataset, which is known to be difficult to incorporate into a performance model because of its complex effects on performance; and 2) the number of performance-critical configuration parameters in IMC is much larger than in ODC (more than 40 vs. around 10), and this high dimensionality requires more sophisticated models to achieve high accuracy. To address this challenge, we propose DAC, a datasize-aware auto-tuning approach that efficiently identifies the high-dimensional configuration with which a given IMC program achieves optimal performance on a given cluster. DAC is a significant advance over the state of the art because its performance model for a given IMC program takes both the input dataset size and 41 configuration parameters as inputs, which is unprecedented in previous work. Two key techniques make this possible: 1) Hierarchical Modeling (HM), which combines a number of individual sub-models in a hierarchical manner; and 2) a Genetic Algorithm (GA), which searches for the optimal configuration. We evaluate DAC with six typical Spark programs, each with five different input dataset sizes. The results show that DAC improves the performance of these programs over the default configurations by a factor of 30.4× on average and up to 89×. The geometric mean speedups of DAC over the configurations produced by default settings, experts, and RFHOC are 15.4×, 2.3×, and 1.5×, respectively.
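
The search step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the three parameters, their ranges, and the `predicted_runtime` function are hypothetical stand-ins (the paper's model is a hierarchical combination of sub-models over 41 parameters, which is not reproduced here). The sketch only shows the general shape of a genetic algorithm that takes the input dataset size as a fixed input and searches the configuration space against a performance model.

```python
import random

# Hypothetical search space: names and ranges are illustrative only,
# not the 41 Spark parameters used in the paper.
SPACE = {
    "executor_memory_gb": (1, 16),
    "executor_cores": (1, 8),
    "shuffle_partitions": (8, 512),
}

def predicted_runtime(datasize_gb, cfg):
    """Toy stand-in for a trained performance model (an assumption, not DAC's model)."""
    mem, cores, parts = (cfg[k] for k in SPACE)
    spill = max(0.0, datasize_gb / mem - 1.0)             # penalty when data exceeds memory
    compute = datasize_gb * 10.0 / cores                  # more cores, less compute time
    overhead = 0.01 * parts + 50.0 * datasize_gb / parts  # too many or too few partitions
    return compute + 20.0 * spill + overhead

def random_config(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover: each parameter is inherited from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}

def mutate(cfg, rng, rate=0.2):
    # Resample each parameter within its range with probability `rate`.
    out = dict(cfg)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[k] = rng.randint(lo, hi)
    return out

def ga_search(datasize_gb, generations=30, pop_size=20, elite=4, seed=0):
    """Return the configuration with the lowest predicted runtime for this datasize."""
    rng = random.Random(seed)
    pop = [random_config(rng) for _ in range(pop_size)]

    def fitness(cfg):
        return predicted_runtime(datasize_gb, cfg)

    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:elite]  # elitism: the best configurations survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        pop = parents + children
    return min(pop, key=fitness)

if __name__ == "__main__":
    for size in (1.0, 10.0, 50.0):
        best = ga_search(size)
        print(size, best, round(predicted_runtime(size, best), 2))
```

Because the dataset size is an argument of the model rather than a constant baked into it, the same search can be rerun for each input size without retraining, which is the sense in which such a tuner is datasize-aware.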


      Published In

      ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
      March 2018
      827 pages
      ISBN: 9781450349116
      DOI: 10.1145/3173162

      ACM SIGPLAN Notices, Volume 53, Issue 2 (ASPLOS '18)
      February 2018
      809 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/3296957

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. big data
      2. in-memory computing
      3. performance tuning

      Qualifiers

      • Research-article

      Funding Sources

      • Key Technique Research on Haiyun Data System of NICT, CAS
      • Outstanding Technical Talent Program of CAS
      • Shenzhen Technology Research Project
      • The National Key R&D Program of China
      • NSFC
      • The Major Scientific and Technological Project of Guangdong Province

      Conference

      ASPLOS '18

      Acceptance Rates

      ASPLOS '18 Paper Acceptance Rate 56 of 319 submissions, 18%;
      Overall Acceptance Rate 535 of 2,713 submissions, 20%
