research-article
Open access

Safe-by-default Concurrency for Modern Programming Languages

Published: 03 September 2021
  • Abstract

    Modern “safe” programming languages follow a design principle that we call safety by default and performance by choice. By default, these languages enforce important programming abstractions, such as memory and type safety, but they also provide mechanisms that allow expert programmers to explicitly trade some safety guarantees for increased performance. However, these same languages have adopted the inverse design principle in their support for multithreading. By default, multithreaded programs violate important abstractions, such as program order and atomic access to individual memory locations, in order to admit compiler and hardware optimizations that would otherwise need to be restricted. Not only does this approach conflict with the design philosophy of safe languages, but very little is known about the practical performance cost of providing stronger default semantics.
    In this article, we propose a safe-by-default and performance-by-choice multithreading semantics for safe languages, which we call volatile-by-default. Under this semantics, programs have sequential consistency (SC) by default, which is the natural “interleaving” semantics of threads. However, the volatile-by-default design also includes annotations that allow expert programmers to avoid the associated overheads in performance-critical code. We describe the design, implementation, optimization, and evaluation of the volatile-by-default semantics for two different safe languages: Java and Julia. First, we present VBD-HotSpot and VBDA-HotSpot, modifications of Oracle’s HotSpot JVM that enforce the volatile-by-default semantics on Intel x86-64 hardware and ARM-v8 hardware. Second, we present SC-Julia, a modification to the just-in-time compiler within the standard Julia implementation that provides best-effort enforcement of the volatile-by-default semantics on x86-64 hardware for the purpose of performance evaluation. We also detail two different implementation techniques: a baseline approach that simply reuses existing mechanisms in the compilers for handling atomic accesses, and a speculative approach that avoids the overhead of enforcing the volatile-by-default semantics until there is the possibility of an SC violation. Our results show that the cost of enforcing SC is significant but arguably still acceptable for some use cases today. Further, we demonstrate that compiler optimizations as well as programmer annotations can reduce the overhead considerably.
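As a rough illustration of what sequential consistency by default means for a programmer, the sketch below runs the classic store-buffering litmus test in plain Java. It is not taken from the article: the class name `StoreBufferingLitmus` and the method `countForbidden` are hypothetical, and the fields are marked `volatile` by hand to mimic what a volatile-by-default JVM would assume for every field. Under the standard Java memory model, volatile accesses are sequentially consistent, so the outcome in which both threads read 0 should never be observed; without `volatile`, that outcome would be permitted.

```java
public class StoreBufferingLitmus {
    // Assumption: marked volatile manually to imitate volatile-by-default.
    // A stock JVM treats unannotated fields as non-volatile.
    static volatile int x, y;

    // Results of each thread's read; published to the main thread by join().
    static int r1, r2;

    // Runs the litmus test repeatedly and counts how often both threads
    // read 0, which no interleaving of the four accesses can produce.
    static int countForbidden(int iterations) {
        int forbidden = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; });
            Thread t2 = new Thread(() -> { y = 1; r2 = x; });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
            if (r1 == 0 && r2 == 0) {
                forbidden++;
            }
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(2000));
    }
}
```

Removing the `volatile` modifier gives the default semantics of current Java, under which the compiler and the hardware may reorder each thread's store and load, so the both-zero outcome becomes legal even if it is rarely observed in practice.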

    Cited By

    • (2023) Putting Weak Memory in Order via a Promising Intermediate Representation. Proceedings of the ACM on Programming Languages 7, PLDI (1872-1895). DOI: 10.1145/3591297. Online publication date: 6-Jun-2023
    • (2022) The leaky semicolon: compositional semantic dependencies for relaxed-memory concurrency. Proceedings of the ACM on Programming Languages 6, POPL (1-30). DOI: 10.1145/3498716. Online publication date: 12-Jan-2022
    • (2022) The Safety and Performance of Prominent Programming Languages. International Journal of Software Engineering and Knowledge Engineering 32, 05 (713-744). DOI: 10.1142/S0218194022500231. Online publication date: 17-May-2022

    Published In

    ACM Transactions on Programming Languages and Systems, Volume 43, Issue 3
    September 2021
    239 pages
    ISSN: 0164-0925
    EISSN: 1558-4593
    DOI: 10.1145/3481687
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 03 September 2021
    Accepted: 01 April 2021
    Revised: 01 March 2021
    Received: 01 September 2020
    Published in TOPLAS Volume 43, Issue 3

    Author Tags

    1. Memory consistency models
    2. sequential consistency
    3. just-in-time compilers

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Science Foundation
