Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Analyzing multicore dumps to facilitate concurrency bug reproduction

Published: 13 March 2010 Publication History

Abstract

Debugging concurrent programs is difficult. This is primarily because the inherent non-determinism that arises because of scheduler interleavings makes it hard to easily reproduce bugs that may manifest only under certain interleavings. The problem is exacerbated in multi-core environments where there are multiple schedulers, one for each core. In this paper, we propose a reproduction technique for concurrent programs that execute on multi-core platforms. Our technique performs a lightweight analysis of a failing execution that occurs in a multi-core environment, and uses the result of the analysis to enable reproduction of the bug in a single-core system, under the control of a deterministic scheduler.
More specifically, our approach automatically identifies the execution point in the re-execution that corresponds to the failure point. It does so by analyzing the failure core dump and leveraging a technique called execution indexing that identifies a related point in the re-execution. By generating a core dump at this point, and comparing the differences betwen the two dumps, we are able to guide a search algorithm to efficiently generate a failure inducing schedule. Our experiments show that our technique is highly effective and has reasonable overhead.

References

[1]
A. R. Alameldeen and D. A. Wood. Addressing Workload Variability in Architectural Simulations. In IEEE Micro, 23(6):94--98, 2003.
[2]
G. Altekar and I. Stoica. ODR: Output-Deterministic Replay for Multicore Debugging. In SOSP, pages 193--206, 2009.
[3]
A. Ayers, R. Schooler, C. Metcalf, A. Agarwal, J. Rhee, and E. Witchel. Traceback: First Fault Diagnosis by Reconstruction of Distributed Control Flow. In PLDI, pages 201--212, 2005.
[4]
S. Bhansali, W.-K. Chen, S. de Jong, A. Edwards, R. Murray, M. Drinic, D. Mihocka, and J. Chau. Framework for Instruction-Level Tracing and Analysis of Program Executions. In VEE, pages 154--163, 2006.
[5]
H. J. Boehm and M. Weiser. Garbage Collection in an Uncooperative Environment. In Software Practice and Experience, 18(9):807--820, 1988.
[6]
M. D. Bond and K. S. McKinley. Probabilistic Calling Context. In OOPSLA, pages 97--112, 2007.
[7]
J.-D. Choi and H. Srinivasan. Deterministic Replay of Java Multi-threaded Applications. In SIGMETRICS, pages 48--59, 1998.
[8]
G. W. Dunlap, D. G. Lucchetti, M. A. Fetterman, and P. M. Chen. Execution Replay of Multiprocessor Virtual Machines. In VEE, pages 121--130, 2008.
[9]
J. Ferrante, K. J. Ottenstein, and J. D. Warren. The Program Dependence Graph and its Use in Optimization. ACM Transactions on Programming Languages and Systems, 9(3):319--349, 1987.
[10]
B. C. M. Fung, K. Wang, R. Chen, and P. S. Yu. Privacy-preserving Data Publishing: A Survey on Recent Developments. In ACM Computing Surveys, 2009.
[11]
Z. Guo, X. Wang, J. Tang, X. Liu, Z. Xu, M. Wu, M. F. Kaashoek, and Z. Zhang. R2: An Application-Level Kernel for Record and Replay. In OSDI, pages 193--208, 2008.
[12]
D. R. Hower and M. D. Hill. Rerun: Exploiting Episodes for Lightweight Memory Race Recording. In ISCA, pages 265--276, 2008.
[13]
P. Joshi, C. S. Park, K. Sen, and M. Naik. A Randomized Dynamic Program Analysis Technique for Detecting Real Deadlocks. In PLDI, pages 110--120, 2009.
[14]
S. T. King, G. W. Dunlap, and P. M. Chen. Debugging Operating Systems with Time-Traveling Virtual Machines. In USENIX, pages 1--15, 2005.
[15]
B. Korel and J. Laski. Dynamic Program Slicing. In Information Processing Letters, 29(3):155--163, 1988.
[16]
P. Montesinos, M. Hicks, S. T. King, and J. Torrellas. Capo: A Software-Hardware Interface for Practical Deterministic Multiprocessor Replay. In ASPLOS, pages 73--84, 2009.
[17]
M. Musuvathi and S. Qadeer. Iterative Context Bounding for Systematic Testing of Multithreaded Programs. In PLDI, pages 446--455, 2007.
[18]
S. Narayanasamy, C. Pereira, and B. Calder. Recording Shared Memory Dependencies Using Strata. In ASPLOS, pages 229--240, 2006.
[19]
N. Nethercote and J. Seward. Valgrind: A Framework for Heavy-weight Dynamic Binary Instrumentation. In PLDI, pages 89--100, 2007.
[20]
R. H. B. Netzer and M. H. Weaver. Optimal Tracing and Incremental Reexecution for Debugging Long-Running Programs. In PLDI, pages 313--325, 1994.
[21]
D. Z. Pan and M. A. Linton. Supporting Reverse Execution for Parallel Programs. In SIGPLAN and SIGOPS Workshop on Parallel and Distributed Debugging, pages 124--129, 1988.
[22]
S. Park, S. Lu, and Y. Zhou. Ctrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. In ASPLOS, pages 25--36, 2009.
[23]
S. Park, W. Xiong, Z. Yin, R. Kaushik, K. Lee, S. Lu, and Y. Zhou. Do You Have to Reproduce the Bug at the First Replay Attempt? -- pres: Probabilistic Replay with Execution Sketching on Multiprocessors. In SOSP, pages 177--192, 2009.
[24]
M. Ronsse, K. D. Bosschere, M. Christiaens, J. C. d. Kergommeaux, and D. Kranzlmüller. Record/Replay for Nondeterministic Program Executions. In Communcation of the ACM, 46(9):62--67, 2003.
[25]
Y. Saito. Jockey: A User-Space Library for Record-Replay Debugging. In Automated Analysis--Driven Debugging, pages 69--76, 2005.
[26]
, S. Sarkar, P. Sewell, F.Z. Nardelli, S. Owens, T. Ridge, T. Braibant, M. Myreen, and J. Aglave The Semantics of x86-CC Multiprocessor Machine Code In POPL, pages 379--391, 2009.
[27]
K. Sen. Race Directed Random Testing of Concurrent Programs. In PLDI, pages 11--21, 2008.
[28]
S. M. Srinivasan, S. Kandula, C. R. Andrews, and Y. Zhou. Flashback: A Lightweight Extension For Rollback and Deterministic Replay for Software Debugging. In USENIX, pages 29--44, 2004.
[29]
B. Xin, N. Sumner, and X. Zhang. Efficient Program Execution Indexing. In PLDI, pages 238--249, 2008.
[30]
X. Zhang, R. Gupta, and Y. Zhang. Cost and Precision Tradeoffs of Dynamic Data Slicing Algorithms. ACM Transactions on Programming Languages and Systems, 27(4):631--661, 2005.

Cited By

View all
  • (2023)Multithreading Technology Based on Golang Implementation2023 3rd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS)10.1109/ACCTCS58815.2023.00116(603-608)Online publication date: Feb-2023
  • (2019)ReCDroidProceedings of the 41st International Conference on Software Engineering10.1109/ICSE.2019.00030(128-139)Online publication date: 25-May-2019
  • (2017)Hybridizing and Relaxing Dependence Tracking for Efficient Parallel Runtime SupportACM Transactions on Parallel Computing10.1145/31081384:2(1-42)Online publication date: 30-Aug-2017
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 45, Issue 3
ASPLOS '10
March 2010
399 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/1735971
Issue’s Table of Contents
  • cover image ACM Conferences
    ASPLOS XV: Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems
    March 2010
    422 pages
    ISBN:9781605588391
    DOI:10.1145/1736020
    • General Chair:
    • James C. Hoe,
    • Program Chair:
    • Vikram S. Adve
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 March 2010
Published in SIGPLAN Volume 45, Issue 3

Check for updates

Author Tags

  1. concurrency bugs
  2. execution indexing
  3. multi-core
  4. reproduction

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)0
Reflects downloads up to 03 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2023)Multithreading Technology Based on Golang Implementation2023 3rd Asia-Pacific Conference on Communications Technology and Computer Science (ACCTCS)10.1109/ACCTCS58815.2023.00116(603-608)Online publication date: Feb-2023
  • (2019)ReCDroidProceedings of the 41st International Conference on Software Engineering10.1109/ICSE.2019.00030(128-139)Online publication date: 25-May-2019
  • (2017)Hybridizing and Relaxing Dependence Tracking for Efficient Parallel Runtime SupportACM Transactions on Parallel Computing10.1145/31081384:2(1-42)Online publication date: 30-Aug-2017
  • (2017)10 Years of research on debugging concurrent and multicore softwareSoftware Quality Journal10.1007/s11219-015-9301-725:1(49-82)Online publication date: 1-Mar-2017
  • (2024)How Well Industry-Level Cause Bisection Works in Real-World: A Study on Linux KernelCompanion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering10.1145/3663529.3663828(62-73)Online publication date: 10-Jul-2024
  • (2024)MissConf: LLM-Enhanced Reproduction of Configuration-Triggered BugsProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3647635(484-495)Online publication date: 14-Apr-2024
  • (2022)Enriching Compiler Testing with Real Program from Bug ReportProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556894(1-12)Online publication date: 10-Oct-2022
  • (2020)Assessing and restoring reproducibility of Jupyter notebooksProceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering10.1145/3324884.3416585(138-149)Online publication date: 21-Dec-2020
  • (2020)RSX: Reproduction Scenario Extraction Technique for Business Application Workloads in DBMS2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)10.1109/ISSREW51248.2020.00043(91-96)Online publication date: Oct-2020
  • (2019)ReCDroidProceedings of the 41st International Conference on Software Engineering10.1109/ICSE.2019.00030(128-139)Online publication date: 25-May-2019
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media