DOI: 10.1145/2872362.2872386
research-article
Open access

Taurus: A Holistic Language Runtime System for Coordinating Distributed Managed-Language Applications

Published: 25 March 2016

Abstract

Many distributed workloads in today's data centers are written in managed languages such as Java or Ruby. Examples include big data frameworks such as Hadoop, data stores such as Cassandra or applications such as the SOLR search engine. These workloads typically run across many independent language runtime systems on different nodes. This setup represents a source of inefficiency, as these language runtime systems are unaware of each other. For example, they may perform Garbage Collection at times that are locally reasonable but not in a distributed setting.
We address these problems by introducing the concept of a Holistic Runtime System that makes runtime-level decisions for the entire distributed application rather than locally. We then present Taurus, a Holistic Runtime System prototype. Taurus is a JVM drop-in replacement, requires almost no configuration and can run unmodified off-the-shelf Java applications. Taurus enforces user-defined coordination policies and provides a DSL for writing these policies.
By applying Taurus to Garbage Collection, we demonstrate the potential of such a system and use it to explore coordination strategies for the runtime systems of real-world distributed applications, to improve application performance and address tail-latencies in latency-sensitive workloads.
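
The coordination problem described in the abstract can be illustrated with a small sketch. The Python code below is not Taurus's policy DSL, which this page does not show. It is a hypothetical "staggered" policy in which a coordinator lets at most a fixed number of nodes collect at the same time and prefers the nodes with the fullest heaps. The names NodeState, pick_nodes_to_collect, threshold, and max_concurrent are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeState:
    """Hypothetical per-node snapshot reported to a coordinator."""
    name: str
    heap_used_fraction: float  # 0.0 (empty) to 1.0 (full)
    collecting: bool = False   # True while the node is in a GC pause


def pick_nodes_to_collect(
    nodes: List[NodeState],
    threshold: float = 0.7,
    max_concurrent: int = 1,
) -> List[str]:
    """Return the names of nodes that may start a collection now.

    Illustrative policy only: nodes whose heap usage is at or above
    `threshold` are candidates, the fullest go first, and no more than
    `max_concurrent` nodes are ever collecting at the same time.
    """
    busy = sum(1 for n in nodes if n.collecting)
    budget = max(0, max_concurrent - busy)
    candidates = sorted(
        (n for n in nodes
         if not n.collecting and n.heap_used_fraction >= threshold),
        key=lambda n: (-n.heap_used_fraction, n.name),
    )
    return [n.name for n in candidates[:budget]]


if __name__ == "__main__":
    cluster = [
        NodeState("a", 0.9),
        NodeState("b", 0.8),
        NodeState("c", 0.5),
    ]
    print(pick_nodes_to_collect(cluster))
```

A purely local decision would let both "a" and "b" collect as soon as each crosses its own threshold. The global view is what allows the coordinator to hold one of them back so that their pauses do not overlap.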


Published In

ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
March 2016
824 pages
ISBN:9781450340915
DOI:10.1145/2872362
General Chair: Tom Conte
Program Chair: Yuanyuan Zhou
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DSL
  2. JVM
  3. cassandra
  4. garbage collection
  5. garbage collection coordination
  6. holistic runtime systems
  7. managed languages
  8. raft
  9. runtime systems
  10. spark

Qualifiers

  • Research-article

Funding Sources

  • DOE
  • DARPA

Conference

ASPLOS '16

Acceptance Rates

ASPLOS '16 Paper Acceptance Rate 53 of 232 submissions, 23%
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Bibliometrics & Citations

Article Metrics

  • Downloads (last 12 months): 175
  • Downloads (last 6 weeks): 16
Reflects downloads up to 23 Dec 2024

Cited By

  • (2024) Polar: A Managed Runtime with Hotness-Segregated Heap for Far Memory. Proceedings of the 15th ACM SIGOPS Asia-Pacific Workshop on Systems, pp. 15-22. DOI: 10.1145/3678015.3680490. Online publication date: 4-Sep-2024
  • (2023) RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design. Proceedings of the 29th Symposium on Operating Systems Principles, pp. 182-199. DOI: 10.1145/3600006.3613170. Online publication date: 23-Oct-2023
  • (2023) Extending and Programming the NVMe I/O Determinism Interface for Flash Arrays. ACM Transactions on Storage 19:1, pp. 1-33. DOI: 10.1145/3568427. Online publication date: 11-Jan-2023
  • (2023) Efficient Synchronization-Light Work Stealing. Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 39-49. DOI: 10.1145/3558481.3591099. Online publication date: 17-Jun-2023
  • (2022) Unified Holistic Memory Management Supporting Multiple Big Data Processing Frameworks over Hybrid Memories. ACM Transactions on Computer Systems 39:1-4, pp. 1-38. DOI: 10.1145/3511211. Online publication date: 5-Jul-2022
  • (2022) PokéMem: Taming Wild Memory Consumers in Apache Spark. 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 59-69. DOI: 10.1109/IPDPS53621.2022.00015. Online publication date: May-2022
  • (2022) Layered Contention Mitigation for Cloud Storage. 2022 IEEE 15th International Conference on Cloud Computing (CLOUD), pp. 167-178. DOI: 10.1109/CLOUD55607.2022.00036. Online publication date: Jul-2022
  • (2021) IODA. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, pp. 263-279. DOI: 10.1145/3477132.3483573. Online publication date: 26-Oct-2021
  • (2021) M3. Proceedings of the Sixteenth European Conference on Computer Systems, pp. 507-522. DOI: 10.1145/3447786.3456256. Online publication date: 21-Apr-2021
  • (2020) Semeru. Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation, pp. 261-280. DOI: 10.5555/3488766.3488781. Online publication date: 4-Nov-2020
