Article, DOI: 10.1145/1273647.1273658

Healing data races on-the-fly

Published: 09 July 2007

Abstract

Testing concurrent software is extremely difficult. Despite all the progress in testing and verification technology, concurrency bugs, the most common of which are deadlocks and races, still make it to the field. This paper describes a set of techniques, implemented in a tool called ConTest, that allow concurrent programs to self-heal at run-time. Concurrency bugs have a property that is very desirable for healing: some interleavings produce correct results, while in others the bugs manifest. Healing concurrency problems is therefore about limiting the possible interleavings, or changing their probability, so that bugs are seen less often. When healing concurrent programs, if limiting the interleavings does not result in a deadlock, we are sure that the result of the healed program could also have been produced by the original program, and therefore no new functional bug has been introduced. In this initial work, which deals with different types of data races, we suggest three types of healing mechanisms: (1) changing the probability of interleavings by introducing sleep or yield statements or by changing thread priorities, (2) removing interleavings using synchronisation commands such as locking and unlocking certain mutexes or waits and notifies, and (3) removing the result of a "bad interleaving" by replacing the value of variables with the one that "should" have been taken. We also classify races according to the relevant healing strategies to apply.
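
As an illustration of healing mechanism (2), the following minimal Java sketch wraps a racy read-modify-write in an injected lock. It is not ConTest's API or the authors' implementation: the class `HealingSketch`, the field `healingLock`, and the methods `racyIncrement`, `healedIncrement`, and `run` are hypothetical names chosen for this example, and the lock is added by hand rather than by instrumentation.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not ConTest code: heals a lost-update race by
// serialising the racy access with an injected lock (mechanism 2).
class HealingSketch {
    // Shared state updated without synchronisation in the "original" program.
    static int counter = 0;

    // Assumed healing lock; a real tool would inject it via instrumentation.
    static final ReentrantLock healingLock = new ReentrantLock();

    // Original racy operation: the read and the write can interleave with
    // another thread's read and write, so updates can be lost.
    public static void racyIncrement() {
        counter = counter + 1;
    }

    // Healed operation: the same code, but only one thread at a time may run it.
    public static void healedIncrement() {
        healingLock.lock();
        try {
            racyIncrement();
        } finally {
            healingLock.unlock();
        }
    }

    // Runs `threads` workers that each increment `perThread` times and
    // returns the final counter value.
    public static int run(int threads, int perThread, boolean heal) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (heal) {
                        healedIncrement();
                    } else {
                        racyIncrement();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        // With healing, the result is always threads * perThread.
        System.out.println("healed:   " + run(4, 10_000, true));
        // Without healing, the result may be lower because updates can be lost.
        System.out.println("unhealed: " + run(4, 10_000, false));
    }
}
```

Every result of the healed run is one the original program could also have produced under a favourable interleaving, which matches the abstract's argument that limiting interleavings introduces no new functional bug as long as the added lock does not cause a deadlock.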

Cited By

  • (2024) Polynima: Practical Hybrid Recompilation for Multithreaded Binaries. Proceedings of the Nineteenth European Conference on Computer Systems, pp. 1126-1141. DOI: 10.1145/3627703.3650065. Online publication date: 22-Apr-2024
  • (2023) Hippodrome: Data Race Repair Using Static Analysis Summaries. ACM Transactions on Software Engineering and Methodology 32:2, pp. 1-33. DOI: 10.1145/3546942. Online publication date: 31-Mar-2023
  • (2022) On-the-Fly Repairing of Atomicity Violations in ARINC 653 Software. Applied Sciences 12:4, article 2014. DOI: 10.3390/app12042014. Online publication date: 15-Feb-2022

Published In

PADTAD '07: Proceedings of the 2007 ACM workshop on Parallel and distributed systems: testing and debugging
July 2007
72 pages
ISBN:9781595937483
DOI:10.1145/1273647
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concurrency
  2. self-healing
  3. testing

Qualifiers

  • Article

Conference

ISSTA07
