Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article
Open access

Disentanglement in nested-parallel programs

Published: 20 December 2019 Publication History

Abstract

Nested parallelism has proved to be a popular approach for programming the rapidly expanding range of multicore computers. It allows programmers to express parallelism at a high level and relies on a run-time system and a scheduler to deliver efficiency and scalability. As a result, many programming languages and extensions that support nested parallelism have been developed, including in C/C++, Java, Haskell, and ML. Yet, writing efficient and scalable nested parallel programs remains challenging, primarily due to difficult concurrency bugs arising from destructive updates or effects. For decades, researchers have argued that functional programming can simplify writing parallel programs by allowing more control over effects but functional programs continue to underperform in comparison to parallel programs written in lower-level languages. The fundamental difficulty with functional languages is that they have high demand for memory, and this demand only grows with parallelism.
In this paper, we identify a memory property, called disentanglement, of nested-parallel programs, and propose memory management techniques for improved efficiency and scalability. Disentanglement allows for (destructive) effects as long as concurrently executing threads do not gain knowledge of the memory objects allocated by each other. We formally define disentanglement by considering an ML-like higher-order language with mutable references and presenting a dynamic semantics for it that enables reasoning about computation graphs of nested parallel programs. Based on this graph semantics, we formalize a classic correctness property---determinacy race freedom---and prove that it implies disentanglement. This establishes that disentanglement applies to a relatively broad class of parallel programs. We then propose memory management techniques for nested-parallel programs that take advantage of disentanglement for improved efficiency and scalability. We show that these techniques are practical by extending the MLton compiler for Standard ML to support this form of nested parallelism. Our empirical evaluation shows that our techniques are efficient and scale well.

Supplementary Material

Auxiliary Archive (popl20main-p169-p-aux.zip)
This companion appendix includes a proof of the main theorem.
WEBM File (a47-westrick.webm)

References

[1]
Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, and Mike Rainey. 2018a. Performance Challenges in Modular Parallel Programs. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP ’18). 381–382.
[2]
Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, and Mike Rainey. 2019. Provably and Practically Efficient Granularity Control. In Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (PPoPP ’19). 214–228.
[3]
Umut A. Acar, Guy Blelloch, Matthew Fluet, Stefan K. Muller, and Ram Raghunathan. 2015a. Coupling Memory and Computation for Locality Management. In Summit on Advances in Programming Languages (SNAPL).
[4]
Umut A. Acar, Guy E. Blelloch, and Robert D. Blumofe. 2002. The Data Locality of Work Stealing. Theory of Computing Systems 35, 3 (2002), 321–347.
[5]
Umut A. Acar, Arthur Charguéraud, Adrien Guatto, Mike Rainey, and Filip Sieczkowski. 2018b. Heartbeat Scheduling: Provable Efficiency for Nested Parallelism. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). 769–782.
[6]
Umut A. Acar, Arthur Charguéraud, and Mike Rainey. 2013. Scheduling Parallel Programs by Work Stealing with Private Deques. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP ’13).
[7]
Umut A. Acar, Arthur Chargueraud, and Mike Rainey. 2015b. A Work-Efficient Algorithm for Parallel Unordered Depth-First Search. In ACM/IEEE Conference on High Performance Computing (SC). ACM, New York, NY, USA, Article 67, 12 pages.
[8]
Umut A. Acar, Arthur Charguéraud, and Mike Rainey. 2016. Oracle-guided scheduling for controlling granularity in implicitly parallel languages. Journal of Functional Programming (JFP) 26 (2016), e23.
[9]
Sarita V. Adve. 2010. Data races are evil with no exceptions: technical perspective. Commun. ACM 53, 11 (2010), 84.
[10]
T. R. Allen and D. A. Padua. 1987. Debugging Fortran on a Shared Memory Machine. In Proceedings of the 1987 International Conference on Parallel Processing. 721–727.
[11]
B. Alpern, L. Carter, and E. Feig. 1990. Uniform memory hierarchies. In Proceedings 31st Annual Symposium on Foundations of Computer Science. 600–608 vol.2.
[12]
Todd A. Anderson. 2010. Optimizations in a private nursery-based garbage collector. In Proceedings of the 9th International Symposium on Memory Management, ISMM 2010, Toronto, Ontario, Canada, June 5-6, 2010. 21–30.
[13]
Andrew W. Appel. 1989. Simple Generational Garbage Collection and Fast Allocation. Software Prac. Experience 19, 2 (1989), 171–183. http://www.cs.princeton.edu/fac/~appel/papers/143.ps
[14]
Andrew W. Appel and Zhong Shao. 1996. Empirical and analytic study of stack versus heap cost for languages with closures. Journal of Functional Programming 6, 1 (Jan. 1996), 47–74. ftp://daffy.cs.yale.edu/pub/papers/shao/stack.ps
[15]
Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton. 2001. Thread Scheduling for Multiprogrammed Multiprocessors. Theory of Computing Systems 34, 2 (2001), 115–144.
[16]
Arvind, Rishiyur S. Nikhil, and Keshav K. Pingali. 1989. I-structures: Data Structures for Parallel Computing. ACM Trans. Program. Lang. Syst. 11, 4 (Oct. 1989), 598–632.
[17]
Sven Auhagen, Lars Bergstrom, Matthew Fluet, and John H. Reppy. 2011. Garbage collection for multicore NUMA machines. In Proceedings of the 2011 ACM SIGPLAN workshop on Memory Systems Performance and Correctness (MSPC). 51–57.
[18]
Stephanie Balzer, Bernardo Toninho, and Frank Pfenning. 2019. Manifest Deadlock-Freedom for Shared Session Types. In Programming Languages and Systems - 28th European Symposium on Programming, ESOP 2019, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, Prague, Czech Republic, April 6-11, 2019, Proceedings. 611–639.
[19]
M. Bauer, S. Treichler, E. Slaughter, and A. Aiken. 2012. Legion: Expressing locality and independence with logical regions. In SC ’12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. 1–11.
[20]
Ales Bizjak, Daniel Gratzer, Robbert Krebbers, and Lars Birkedal. 2019. Iron: managing obligations in higher-order concurrent separation logic. PACMPL 3, POPL (2019), 65:1–65:30.
[21]
Guy E. Blelloch. 1996. Programming Parallel Algorithms. Commun. ACM 39, 3 (1996), 85–97.
[22]
Guy E. Blelloch, Rezaul A. Chowdhury, Phillip B. Gibbons, Vijaya Ramachandran, Shimin Chen, and Michael Kozuch. 2008. Provably good multicore cache performance for divide-and-conquer algorithms. In In the Proceedings of the 19th ACM-SIAM Symposium on Discrete Algorithms. 501–510.
[23]
Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Julian Shun. 2012. Internally deterministic parallel algorithms can be fast. In PPoPP ’12. 181–192.
[24]
Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Harsha Vardhan Simhadri. 2011. Scheduling irregular parallel computations on hierarchical caches. In Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 355–366.
[25]
Guy E. Blelloch and Phillip B. Gibbons. 2004. Effectively sharing a cache among threads. In SPAA.
[26]
Guy E. Blelloch, Phillip B. Gibbons, Yossi Matias, and Girija J. Narlikar. 1997. Space-efficient Scheduling of Parallelism with Synchronization Variables. In Proceedings of the Ninth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA ’97). 12–23.
[27]
Guy E. Blelloch, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha, and Siddhartha Chatterjee. 1994. Implementation of a Portable Nested Data-Parallel Language. J. Parallel Distrib. Comput. 21, 1 (1994), 4–14.
[28]
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. 1995. Cilk: An Efficient Multithreaded Runtime System. In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. Santa Barbara, California, 207–216.
[29]
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. 1996. Cilk: An Efficient Multithreaded Runtime System. J. Parallel and Distrib. Comput. 37, 1 (1996), 55 – 69.
[30]
Robert D. Blumofe and Charles E. Leiserson. 1998. Space-Efficient Scheduling of Multithreaded Computations. SIAM J. Comput. 27, 1 (1998), 202–229.
[31]
Robert D. Blumofe and Charles E. Leiserson. 1999. Scheduling multithreaded computations by work stealing. J. ACM 46 (Sept. 1999), 720–748. Issue 5.
[32]
Robert L. Bocchino, Stephen Heumann, Nima Honarmand, Sarita V. Adve, Vikram S. Adve, Adam Welc, and Tatiana Shpeisman. 2011. Safe nondeterminism in a deterministic-by-default parallel language. In ACM POPL.
[33]
Robert L. Bocchino, Jr., Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung, and Mohsen Vakilian. 2009. A type and effect system for deterministic parallel Java. In Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications (OOPSLA ’09). 97–116.
[34]
Robert L Bocchino, Jr., Vikram S. Adve, Sarita V. Adve, and Marc Snir. 2009. Parallel programming must be deterministic by default. In First USENIX Conference on Hot Topics in Parallelism.
[35]
Hans-Juergen Boehm. 2011. How to Miscompile Programs with "Benign" Data Races. In 3rd USENIX Workshop on Hot Topics in Parallelism, HotPar’11, Berkeley, CA, USA, May 26-27, 2011.
[36]
Manuel M. T. Chakravarty, Roman Leshchinskiy, Simon L. Peyton Jones, Gabriele Keller, and Simon Marlow. 2007. Data parallel Haskell: a status report. In Proceedings of the POPL 2007 Workshop on Declarative Aspects of Multicore Programming, DAMP 2007, Nice, France, January 16, 2007. 10–18.
[37]
Philippe Charles, Christian Grothoff, Vijay Saraswat, Christopher Donawa, Allan Kielstra, Kemal Ebcioglu, Christoph von Praun, and Vivek Sarkar. 2005. X10: an object-oriented approach to non-uniform cluster computing. In Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA ’05). ACM, 519–538.
[38]
C. J. Cheney. 1970. A Non-Recursive List Compacting Algorithm. Commun. ACM 13, 11 (Nov. 1970), 677–8.
[39]
Guang-Ien Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, and Andrew F. Stark. 1998. Detecting data races in Cilk programs that use locks. In Proceedings of the 10th ACM Symposium on Parallel Algorithms and Architectures (SPAA ’98).
[40]
Intel Corp. 2017. Knights landing (KNL): 2nd Generation Intel Xeon Phi processor. In Intel Xeon Processor E7 v4 Family Specification. https://ark.intel.com/products/series/93797/Intel- Xeon- Processor- E7- v4- Family .
[41]
Damien Doligez and Georges Gonthier. 1994. Portable, Unobtrusive Garbage Collection for Multiprocessor Systems. In Conference Record of the Twenty-first Annual ACM Symposium on Principles of Programming Languages (ACM SIGPLAN Notices). ACM Press, Portland, OR. ftp://ftp.inria.fr/INRIA/Projects/para/doligez/DoligezGonthier94.ps.gz
[42]
Damien Doligez and Xavier Leroy. 1993. A Concurrent Generational Garbage Collector for a Multi-Threaded Implementation of ML. In Conference Record of the Twentieth Annual ACM Symposium on Principles of Programming Languages (ACM SIGPLAN Notices). ACM Press, 113–123. file://ftp.inria.fr/INRIA/Projects/cristal/Xavier.Leroy/publications/concurrentgc.ps.gz
[43]
Tamar Domani, Elliot K. Kolodner, Ethan Lewis, Erez Petrank, and Dafna Sheinwald. 2002. Thread-Local Heaps for Java. In ISMM’02 Proceedings of the Third International Symposium on Memory Management (ACM SIGPLAN Notices), David Detlefs (Ed.). ACM Press, Berlin, 76–87. http://www.cs.technion.ac.il/~erez/publications.html
[44]
Martin Elsman. 2001. A Stack Machine for Region Based Programs, See [ SPACE 2001 ]. http://www.diku.dk/topps/space2001/ program.html#MartinElsman
[45]
Perry A. Emrath, Sanjoy Ghosh, and David A. Padua. 1991. Event Synchronization Analysis for Debugging Parallel Programs. In Supercomputing ’91. 580–588.
[46]
Perry A. Emrath and Davis A. Padua. 1988. Automatic Detection of Nondeterminacy in Parallel Programs. In Proceedings of the Workshop on Parallel and Distributed Debugging. 89–99.
[47]
Kayvon Fatahalian, Daniel Reiter Horn, Timothy J. Knight, Larkhoon Leem, Mike Houston, Ji Young Park, Mattan Erez, Manman Ren, Alex Aiken, William J. Dally, and Pat Hanrahan. 2006. Sequoia: Programming the Memory Hierarchy. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC ’06). ACM, New York, NY, USA, Article 83.
[48]
Mingdong Feng and Charles E. Leiserson. 1997. Efficient Detection of Determinacy Races in Cilk Programs. In Proceedings of the Ninth Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA). 1–11.
[49]
Mingdong Feng and Charles E. Leiserson. 1999. Efficient Detection of Determinacy Races in Cilk Programs. Theory of Computing Systems 32, 3 (1999), 301–326.
[50]
Cormac Flanagan and Stephen N. Freund. 2009. FastTrack: efficient and precise dynamic race detection. SIGPLAN Not. 44, 6 (June 2009), 121–133.
[51]
Cormac Flanagan, Stephen N. Freund, Marina Lifshin, and Shaz Qadeer. 2008. Types for atomicity: Static checking and inference for Java. ACM Trans. Program. Lang. Syst. 30, 4 (2008), 20:1–20:53.
[52]
Matthew Fluet, Greg Morrisett, and Amal J. Ahmed. 2006. Linear Regions Are All You Need. In Proceedings of the 15th Annual European Symposium on Programming (ESOP).
[53]
Matthew Fluet, Mike Rainey, and John Reppy. 2008. A scheduling framework for general-purpose parallel languages. In ACM SIGPLAN International Conference on Functional Programming (ICFP).
[54]
Matthew Fluet, Mike Rainey, John Reppy, and Adam Shaw. 2011. Implicitly threaded parallelism in Manticore. Journal of Functional Programming 20, 5-6 (2011), 1–40.
[55]
Matthew Fluet, Mike Rainey, John Reppy, Adam Shaw, and Yingqi Xiao. 2007. Manticore: A Heterogeneous Parallel Language. In Proceedings of the 2007 Workshop on Declarative Aspects of Multicore Programming (DAMP ’07). 37–44.
[56]
Matteo Frigo, Pablo Halpern, Charles E. Leiserson, and Stephen Lewin-Berlin. 2009. Reducers and Other Cilk++ Hyperobjects. In 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures. 79–90.
[57]
Matteo Frigo, Charles E. Leiserson, and Keith H. Randall. 1998. The Implementation of the Cilk-5 Multithreaded Language. In PLDI. 212–223.
[58]
David K. Gifford and John M. Lucassen. 1986. Integrating Functional and Imperative Programming. In Proceedings of the ACM Symposium on Lisp and Functional Programming (LFP). ACM Press, 22–38.
[59]
Marcelo J. R. Gonçalves. 1995. Cache Performance of Programs with Intensive Heap Allocation and Generational Garbage Collection. Ph.D. Dissertation. Department of Computer Science, Princeton University.
[60]
Marcelo J. R. Gonçalves and Andrew W. Appel. 1995. Cache Performance of Fast-Allocating Programs. In Record of the 1995 Conference on Functional Programming and Computer Architecture.
[61]
Dan Grossman, Greg Morrisett, Trevor Jim, Michael Hicks, Yanling Wang, and James Cheney. 2002. Region-Based Memory Management in Cyclone. In Proceedings of SIGPLAN 2002 Conference on Programming Languages Design and Implementation (ACM SIGPLAN Notices). ACM Press, Berlin, 282–293.
[62]
Adrien Guatto, Sam Westrick, Ram Raghunathan, Umut A. Acar, and Matthew Fluet. 2018. Hierarchical memory management for mutable state. In Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2018, Vienna, Austria, February 24-28, 2018. 81–93.
[63]
Robert H. Halstead, Jr. 1984. Implementation of Multilisp: Lisp on a Multiprocessor. In Proceedings of the 1984 ACM Symposium on LISP and functional programming (LFP ’84). ACM, 9–17.
[64]
Kevin Hammond. 2011. Why Parallel Functional Programming Matters: Panel Statement. In Reliable Software Technologies -Ada-Europe 2011 - 16th Ada-Europe International Conference on Reliable Software Technologies, Edinburgh, UK, June 20-24, 2011. Proceedings. 201–205.
[65]
David R. Hanson. 1990. Fast Allocation and Deallocation of Memory Based on Object Lifetimes. Software Prac. Experience 20, 1 (Jan. 1990), 5–12.
[66]
Shams Mahmood Imam and Vivek Sarkar. 2014. Habanero-Java library: a Java 8 framework for multicore programming. In 2014 International Conference on Principles and Practices of Programming on the Java Platform Virtual Machines, Languages and Tools, PPPJ ’14. 75–86.
[67]
Intel. 2011. Intel Threading Building Blocks. https://www.threadingbuildingblocks.org/ .
[68]
Intel Corporation 2009a. Intel Cilk++ SDK Programmer’s Guide. Intel Corporation. Document Number: 322581-001US.
[69]
Intel Corporation 2009b. Intel(R) Threading Building Blocks. Intel Corporation. Available from http://www. threadingbuildingblocks.org/documentation.php .
[70]
Richard Jones, Antony Hosking, and Eliot Moss. 2011. The garbage collection handbook: the art of automatic memory management. Chapman & Hall/CRC.
[71]
Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2018a. RustBelt: securing the foundations of the rust programming language. PACMPL 2, POPL (2018), 66:1–66:34.
[72]
Ralf Jung, Robbert Krebbers, Jacques-Henri Jourdan, Ales Bizjak, Lars Birkedal, and Derek Dreyer. 2018b. Iris from the ground up: A modular foundation for higher-order concurrent separation logic. J. Funct. Program. 28 (2018), e20.
[73]
Gabriele Keller, Manuel M.T. Chakravarty, Roman Leshchinskiy, Simon Peyton Jones, and Ben Lippmeier. 2010. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming (ICFP ’10). 261–272.
[74]
A. Krishnamurthy, D. E. Culler, A. Dusseau, S. C. Goldstein, S. Lumetta, T. von Eicken, and K. Yelick. 1993. Parallel Programming in Split-C. In Proceedings of the 1993 ACM/IEEE Conference on Supercomputing (Supercomputing ’93). ACM, New York, NY, USA, 262–273.
[75]
Milind Kulkarni, Keshav Pingali, Bruce Walter, Ganesh Ramanarayanan, Kavita Bala, and L. Paul Chew. 2007. Optimistic Parallelism Requires Abstractions. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’07). 211–222.
[76]
Lindsey Kuper and Ryan R Newton. 2013. LVars: lattice-based data structures for deterministic parallelism. In Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing. ACM, 71–84.
[77]
Lindsey Kuper, Aaron Todd, Sam Tobin-Hochstadt, and Ryan R. Newton. 2014a. Taming the Parallel Effect Zoo: Extensible Deterministic Parallelism with LVish. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 2–14.
[78]
Lindsey Kuper, Aaron Turon, Neelakantan R. Krishnaswami, and Ryan R. Newton. 2014b. Freeze After Writing: Quasideterministic Parallel Programming with LVars. In Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’14). ACM, New York, NY, USA, 257–270.
[79]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media?. In WWW ’10. ACM, 591–600.
[80]
John Launchbury and Simon L. Peyton Jones. 1994. Lazy Functional State Threads. In Proceedings of the ACM SIGPLAN’94 Conference on Programming Language Design and Implementation (PLDI), Orlando, Florida, USA, June 20-24, 1994. 24–35.
[81]
Matthew Le and Matthew Fluet. 2015. Partial Aborts for Transactions via First-class Continuations. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP 2015). 230–242.
[82]
Doug Lea. 2000. A Java fork/join framework. In Proceedings of the ACM 2000 conference on Java Grande (JAVA ’00). 36–43.
[83]
Daan Leijen, Wolfram Schulte, and Sebastian Burckhardt. 2009. The design of a task parallel library. In Proceedings of the 24th ACM SIGPLAN conference on Object Oriented Programming Systems Languages and Applications (OOPSLA ’09). 227–242.
[84]
Peng Li, Simon Marlow, Simon L. Peyton Jones, and Andrew P. Tolmach. 2007. Lightweight concurrency primitives for GHC. In Proceedings of the ACM SIGPLAN Workshop on Haskell, Haskell 2007, Freiburg, Germany, September 30, 2007. 107–118.
[85]
Henry Lieberman and Carl E. Hewitt. 1981. A Real-Time Garbage Collector Based on the Lifetimes of Objects. AI Memo 569a. MIT. ftp://publications.ai.mit.edu/ai- publications/pdf/AIM- 569a.pdf
[86]
J. M. Lucassen and D. K. Gifford. 1988. Polymorphic Effect Systems. In Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’88). ACM, New York, NY, USA, 47–57.
[87]
Simon Marlow. 2011. Parallel and Concurrent Programming in Haskell. In Central European Functional Programming School - 4th Summer School, CEFP 2011, Budapest, Hungary, June 14-24, 2011, Revised Selected Papers. 339–401.
[88]
John Mellor-Crummey. 1991. On-the-fly Detection of Data Races for Programs with Nested Fork-Join Parallelism. In Proceedings of Supercomputing’91. 24–33.
[89]
Stefan K. Muller and Umut A. Acar. 2016. Latency-Hiding Work Stealing: Scheduling Interacting Parallel Computations with Work Stealing. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2016, Asilomar State Beach/Pacific Grove, CA, USA, July 11-13, 2016. 71–82.
[90]
Stefan K. Muller, Umut A. Acar, and Robert Harper. 2017. Responsive Parallel Computation: Bridging Competitive and Cooperative Threading. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA, 677–692.
[91]
Stefan K. Muller, Umut A. Acar, and Robert Harper. 2018. Types and Cost Models for Responsive Parallelism (Draft). In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming (ICFP ’18).
[92]
Girija J. Narlikar. 1999. Scheduling threads for low space requirement and good locality. In 11th Annual ACM Symposium on Parallel Algorithms and Architectures. 83–95.
[93]
Robert H. B. Netzer and Barton P. Miller. 1992. What are Race Conditions? ACM Letters on Programming Languages and Systems 1, 1 (March 1992), 74–88.
[94]
Robert W. Numrich and John Reid. 1998. Co-array Fortran for Parallel Programming. SIGPLAN Fortran Forum 17, 2 (Aug. 1998), 1–31.
[95]
Atsushi Ohori, Kenjiro Taura, and Katsuhiro Ueno. 2018. Making SML# a General-purpose High-performance Language. Unpublished Manuscript.
[96]
Sungwoo Park, Frank Pfenning, and Sebastian Thrun. 2008. A Probabilistic Language Based on Sampling Functions. ACM Trans. Program. Lang. Syst. 31, 1, Article 4 (Dec. 2008), 46 pages.
[97]
Simon L. Peyton Jones, Roman Leshchinskiy, Gabriele Keller, and Manuel M. T. Chakravarty. 2008. Harnessing the Multicores: Nested Data Parallelism in Haskell. In FSTTCS. 383–414.
[98]
Simon L. Peyton Jones and Philip Wadler. 1993. Imperative Functional Programming. In Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’93). 71–84.
[99]
Keshav Pingali, Donald Nguyen, Milind Kulkarni, Martin Burtscher, Muhammad Amber Hassaan, Rashid Kaleem, TsungHsien Lee, Andrew Lenharth, Roman Manevich, Mario Méndez-Lojo, Dimitrios Prountzos, and Xin Sui. 2011. The tao of parallelism in algorithms. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. 12–25.
[100]
Ram Raghunathan, Stefan K. Muller, Umut A. Acar, and Guy Blelloch. 2016. Hierarchical Memory Management for Parallel Programs. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming (ICFP 2016). ACM, New York, NY, USA, 392–406.
[101]
Raghavan Raman, Jisheng Zhao, Vivek Sarkar, Martin T. Vechev, and Eran Yahav. 2012. Scalable and precise dynamic datarace detection for structured parallelism. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, Beijing, China - June 11 - 16, 2012. 531–542.
[102]
John C. Reynolds. 1978. Syntactic Control of Interference. In Proceedings of the 5th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL ’78). ACM, New York, NY, USA, 39–46.
[103]
John C. Reynolds. 2002. Separation Logic: A Logic for Shared Mutable Data Structures. In 17th IEEE Symposium on Logic in Computer Science (LICS 2002), 22-25 July 2002, Copenhagen, Denmark, Proceedings. 55–74.
[104]
Dan Robinson. 2017. HPE shows The Machine — with 160TB of shared memory. Data Center Dynamics (May 2017).
[105]
Douglas T. Ross. 1967. The AED free storage package. Commun. ACM 10, 8 (Aug. 1967), 481–492.
[106]
Rust Team. 2019. Rust Language. https://www.rust- lang.org/
[107]
Jacob T. Schwartz. 1975. Optimization of very high level languages (parts I and II). Computer Languages 2–3, 1 (1975), 161–194,197–218.
[108]
Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan. 2012. Brief Announcement: The Problem Based Benchmark Suite. In Proceedings of the Twenty-fourth Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA ’12). 68–70.
[109]
KC Sivaramakrishnan and Stephen Dolan. 2017. A deep dive into Multicore OCaml garbage collector. http://kcsrk.info/ multicore/gc/2017/07/06/multicore- ocaml- gc/ Unpublished manuscript.
[110]
K. C. Sivaramakrishnan, Lukasz Ziarek, and Suresh Jagannathan. 2014. MultiMLton: A multicore-aware runtime for standard ML. Journal of Functional Programming FirstView (6 2014), 1–62.
[111]
A. Sodani. 2015. Knights landing (KNL): 2nd Generation Intel Xeon Phi processor. In 2015 IEEE Hot Chips 27 Symposium (HCS). 1–24.
[112]
SPACE 2001. Proceedings of the Second workshop on Semantics, Program Analysis and Computing Environments for Memory Management (SPACE’01). London. http://www.diku.dk/topps/space2001/
[113]
Daniel Spoonhower. 2009. Scheduling Deterministic Parallel Programs. Ph.D. Dissertation. Carnegie Mellon University. https://www.cs.cmu.edu/~rwh/theses/spoonhower.pdf
[114]
Guy L. Steele, Jr. 1994. Building Interpreters by Composing Monads. In Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’94). ACM, New York, NY, USA, 472–492.
[115]
Guy L. Steele Jr. 1990. Making Asynchronous Parallelism Safe for the World. In Proceedings of the Seventeenth Annual ACM Symposium on Principles of Programming Languages (POPL). ACM Press, 218–231.
[116]
Robert Endre Tarjan. 1975. Efficiency of a Good But Not Linear Set Union Algorithm. J. ACM 22, 2 (April 1975), 215–225.
[117]
Tachio Terauchi and Alex Aiken. 2008. Witnessing Side Effects. ACM Trans. Program. Lang. Syst. 30, 3, Article 15 (May 2008), 42 pages.
[118]
Mads Tofte and Jean-Pierre Talpin. 1997. Region-Based Memory Management. Information and Computation (Feb. 1997). http://www.diku.dk/research- groups/topps/activities/kit2/infocomp97.ps
[119]
Aaron Turon, Derek Dreyer, and Lars Birkedal. 2013. Unifying refinement and hoare-style reasoning in a logic for higherorder concurrency. In ACM SIGPLAN International Conference on Functional Programming, ICFP’13, Boston, MA, USA -September 25 - 27, 2013. 377–390.
[120]
David M. Ungar. 1984. Generation Scavenging: A Non-Disruptive High Performance Storage Reclamation Algorithm. ACM SIGPLAN Notices 19, 5 (April 1984), 157–167. Also published as ACM Software Engineering Notes 9, 3 (May 1984) — Proceedings of the ACM/SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, 157–167, April 1984.
[121]
Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, and I-Ting Angelina Lee. 2016. Provably Good and Practically Efficient Parallel Race Detection for Fork-Join Programs. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2016, Asilomar State Beach/Pacific Grove, CA, USA, July 11-13, 2016. 83–94.
[122]
Viktor Vafeiadis and Matthew J. Parkinson. 2007. A Marriage of Rely/Guarantee and Separation Logic. In CONCUR 2007 - Concurrency Theory, 18th International Conference, CONCUR 2007, Lisbon, Portugal, September 3-8, 2007, Proceedings. 256–271.
[123]
David Walker. 2001. On Linear Types and Regions, See [ SPACE 2001 ]. http://www.diku.dk/topps/space2001/program.html# DavidWalker
[124]
Kathy Yelick, Luigi Semenzato, Geoff Pike, Carleton Miyamoto, Ben Liblit, Arvind Krishnamurthy, Paul Hilfinger, Susan Graham, David Gay, Phil Colella, and Alex Aiken. 1998. Titanium: a high-performance Java dialect. Concurrency: Practice and Experience 10, 11ÃćÂĂÂŘ13 (1998), 825–836.
[125]
Lukasz Ziarek, K. C. Sivaramakrishnan, and Suresh Jagannathan. 2011. Composable asynchronous events. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2011, San Jose, CA, USA, June 4-8, 2011. 628–639.

Cited By

View all
  • (2024)Explicit Effects and Effect Constraints in ReMLProceedings of the ACM on Programming Languages10.1145/36329218:POPL(2370-2394)Online publication date: 5-Jan-2024
  • (2024)DisLog: A Separation Logic for DisentanglementProceedings of the ACM on Programming Languages10.1145/36328538:POPL(302-331)Online publication date: 5-Jan-2024
  • (2023)Parallelism in a Region Inference ContextProceedings of the ACM on Programming Languages10.1145/35912567:PLDI(884-906)Online publication date: 6-Jun-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 4, Issue POPL
January 2020
1984 pages
EISSN:2475-1421
DOI:10.1145/3377388
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2019
Published in PACMPL Volume 4, Issue POPL

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. data race
  2. disentanglement
  3. functional programming
  4. memory management
  5. parallel computing

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)241
  • Downloads (Last 6 weeks)35
Reflects downloads up to 12 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Explicit Effects and Effect Constraints in ReMLProceedings of the ACM on Programming Languages10.1145/36329218:POPL(2370-2394)Online publication date: 5-Jan-2024
  • (2024)DisLog: A Separation Logic for DisentanglementProceedings of the ACM on Programming Languages10.1145/36328538:POPL(302-331)Online publication date: 5-Jan-2024
  • (2023)Parallelism in a Region Inference ContextProceedings of the ACM on Programming Languages10.1145/35912567:PLDI(884-906)Online publication date: 6-Jun-2023
  • (2023)Evaluating Functional Memory-Managed Parallel Languages for HPC using the NAS Parallel Benchmarks2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW59300.2023.00072(413-422)Online publication date: May-2023
  • (2022)Concurrent and parallel garbage collection for lightweight threads on multicore processorsProceedings of the 2022 ACM SIGPLAN International Symposium on Memory Management10.1145/3520263.3534652(29-42)Online publication date: 14-Jun-2022
  • (2021)Efficient tree-traversals: reconciling parallelism and dense data representationsProceedings of the ACM on Programming Languages10.1145/34735965:ICFP(1-29)Online publication date: 19-Aug-2021
  • (2021)The Case for an Interwoven Parallel Hardware/Software Stack2021 SC Workshops Supplementary Proceedings (SCWS)10.1109/SCWS55283.2021.00017(50-59)Online publication date: Nov-2021
  • (2020)Retrofitting parallelism onto OCamlProceedings of the ACM on Programming Languages10.1145/34089954:ICFP(1-30)Online publication date: 3-Aug-2020
  • (2020)The history of Standard MLProceedings of the ACM on Programming Languages10.1145/33863364:HOPL(1-100)Online publication date: 12-Jun-2020

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media