Efficient Race Detection for Reducer Hyperobjects

Published: 09 August 2018

Abstract

A multithreaded Cilk program that is ostensibly deterministic may nevertheless behave nondeterministically due to programming errors in the code. For a Cilk program that uses reducers—a general reduction mechanism supported in various Cilk dialects—such programming errors are especially challenging to debug, because the errors can expose the nondeterminism in how the Cilk runtime system manages reducers.
We identify two unique types of races that arise from incorrect use of reducers in a Cilk program, and we present two algorithms to catch these races. The first algorithm, called the Peer-Set algorithm, detects view-read races, which occur when the program attempts to retrieve a value out of a reducer when the read may result in a nondeterministic value, such as before all previously spawned subcomputations that might update the reducer have necessarily returned. The second algorithm, called the SP+ algorithm, detects determinacy races—instances where a write to a memory location occurs logically in parallel with another access to that location—even when the raced-on memory locations relate to reducers. Both algorithms are provably correct, asymptotically efficient, and can be implemented efficiently in practice. We have implemented both algorithms in our prototype race detector, Rader. When running Peer-Set, Rader incurs a geometric-mean multiplicative overhead of 2.56 over running the benchmark without instrumentation. When running SP+, Rader incurs a geometric-mean multiplicative overhead of 16.94.
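The view-read race described above can be illustrated with a toy model. The sketch below is not the Cilk runtime, Rader, or the Peer-Set algorithm. `SumView` and `run` are hypothetical names, and a boolean flag stands in for the scheduler's decision to steal a continuation. It assumes the usual reducer behavior in which a stolen continuation starts from a fresh identity view and views are combined at the sync.

```python
from dataclasses import dataclass


@dataclass
class SumView:
    """One view of a hypothetical sum reducer; 0 is the identity."""
    value: int = 0


def run(stolen: bool, read_after_sync: bool) -> int:
    """Model `spawn child(); reducer += 1; [sync;] read reducer`.

    `stolen` models whether a thief took the continuation of the spawn,
    a scheduling choice that the program cannot control.
    """
    leftmost = SumView()
    leftmost.value += 10          # update made by the spawned child

    # Assumption: on a steal the continuation starts from a fresh identity
    # view; otherwise it keeps using the same view as the child.
    current = SumView() if stolen else leftmost
    current.value += 1            # update made by the continuation

    if read_after_sync:
        if stolen:
            leftmost.value += current.value   # views are reduced at the sync
        current = leftmost
    return current.value


if __name__ == "__main__":
    # Reading before the sync depends on the schedule: a view-read race.
    print(run(stolen=False, read_after_sync=False))  # 11
    print(run(stolen=True, read_after_sync=False))   # 1
    # Reading after the sync gives the same value under both schedules.
    print(run(stolen=False, read_after_sync=True))   # 11
    print(run(stolen=True, read_after_sync=True))    # 11
```

In this model the premature read returns 11 or 1 depending only on whether a steal occurred, while the read after the sync is schedule-independent. That is the kind of nondeterminism a view-read race exposes.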

Information

Published In

ACM Transactions on Parallel Computing, Volume 4, Issue 4
Special Issue on SPAA 2015
December 2017
122 pages
ISSN: 2329-4949
EISSN: 2329-4957
DOI: 10.1145/3177741

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 August 2018
Accepted: 01 March 2018
Revised: 01 January 2018
Received: 01 April 2016
Published in TOPC Volume 4, Issue 4

Author Tags

  1. Cilk
  2. determinacy race
  3. nondeterminism
  4. reducers
  5. view-read race

Qualifiers

  • Research-article
  • Research
  • Refereed
