Supporting reverse execution for parallel programs

Published: 01 November 1988

Abstract

Parallel programs are difficult to debug because they run for a long time and two executions may yield different results. Reverse execution is a simple and powerful concept that solves both of these problems. We are designing a tool for debugging parallel programs, called Recap, that provides the illusion of reverse execution using checkpoints and event recording and playback. During normal execution, Recap logs the results of system calls and shared memory reads, as well as the times at which asynchronous events (signals) occur. Recap periodically checkpoints the state of a process by forking and suspending a new process. To reverse execute to a certain point in time, Recap continues the nearest checkpoint process forward in a self-contained environment, simulating all events using the log. We are implementing Recap as part of a larger environment for parallel program development.
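The checkpoint-and-replay idea in the abstract can be illustrated with a toy sketch. This is not Recap's implementation. It assumes a single-threaded toy program, uses deep copies of an in-memory dictionary as a stand-in for the forked, suspended checkpoint processes, and uses a random number as a stand-in for the result of a system call or shared memory read. Signals are not modeled. All names (Recorder, Replayer, step, run_recorded, reverse_to) are hypothetical and chosen only for illustration.

```python
import copy
import random


class Recorder:
    """Normal execution: perform each nondeterministic call and log its result."""

    def __init__(self):
        self.log = []  # one entry per nondeterministic event, in order

    def call(self, fn):
        result = fn()
        self.log.append(result)
        return result


class Replayer:
    """Replay: return logged results instead of performing the real calls."""

    def __init__(self, log, start):
        self.log = log
        self.pos = start  # index of the next log entry to consume

    def call(self, fn):
        result = self.log[self.pos]  # fn is deliberately not invoked
        self.pos += 1
        return result


def step(state, env):
    """One step of the toy program: fold one nondeterministic input into state."""
    value = env.call(lambda: random.randint(0, 99))  # stand-in for a system call
    state["total"] += value
    state["steps"] += 1


def run_recorded(num_steps, checkpoint_every):
    """Run forward, logging events and taking periodic checkpoints."""
    state = {"total": 0, "steps": 0}
    recorder = Recorder()
    # Assumption: a deep copy stands in for a forked, suspended process.
    checkpoints = {0: copy.deepcopy(state)}
    history = [copy.deepcopy(state)]  # kept only so the demo can verify replay
    for i in range(1, num_steps + 1):
        step(state, recorder)
        history.append(copy.deepcopy(state))
        if i % checkpoint_every == 0:
            checkpoints[i] = copy.deepcopy(state)
    return recorder.log, checkpoints, history


def reverse_to(target_step, log, checkpoints):
    """Rebuild the state at target_step from the nearest earlier checkpoint."""
    start = max(s for s in checkpoints if s <= target_step)
    state = copy.deepcopy(checkpoints[start])
    # Each step consumes exactly one log entry, so step index == log position.
    replayer = Replayer(log, start)
    for _ in range(target_step - start):
        step(state, replayer)
    return state
```

For example, after `log, checkpoints, history = run_recorded(10, 4)`, calling `reverse_to(6, log, checkpoints)` restarts from the checkpoint taken at step 4, replays two logged events, and yields a state equal to `history[6]`. The replayed run never touches the random number generator, which is the sense in which it runs in a self-contained environment.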

Published In

ACM SIGPLAN Notices  Volume 24, Issue 1
Special issue: Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on parallel and distributed debugging
Jan. 1989
280 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/69215
  • ACM Conferences
    PADD '88: Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
    November 1988
    282 pages
    ISBN:0897912969
    DOI:10.1145/68210
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Article Metrics

  • Downloads (Last 12 months): 75
  • Downloads (Last 6 weeks): 20
Reflects downloads up to 09 Nov 2024

Cited By

  • (2011) Replay debugging of non‐deterministic executions in the Kernel‐based Virtual Machine. Software: Practice and Experience, 43:11, pp. 1261-1281. DOI: 10.1002/spe.1094. Online publication date: 27-May-2011
  • (2010) Robust non-intrusive record-replay with processor extraction. Proceedings of the 8th Workshop on Parallel and Distributed Systems: Testing, Analysis, and Debugging, pp. 9-19. DOI: 10.1145/1866210.1866211. Online publication date: 13-Jul-2010
  • (2010) PinPlay. Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, pp. 2-11. DOI: 10.1145/1772954.1772958. Online publication date: 24-Apr-2010
  • (2005) Minimizing the log size for execution replay of shared-memory programs. Parallel Processing: CONPAR 94 — VAPP VI, pp. 76-87. DOI: 10.1007/3-540-58430-7_8. Online publication date: 3-Jun-2005
  • (2005) Trace size vs parallelism in trace-and-replay debugging of shared-memory programs. Languages and Compilers for Parallel Computing, pp. 617-632. DOI: 10.1007/3-540-57659-2_35. Online publication date: 31-May-2005
  • (2003) ROS: The Rollback-One-Step Method to Minimize the Waiting Time during Debugging Long-Running Parallel Programs. High Performance Computing for Computational Science — VECPAR 2002, pp. 664-678. DOI: 10.1007/3-540-36569-9_45. Online publication date: 15-Apr-2003
  • (2002) Shortcut Replay. Proceedings of the 7th Asian Computing Science Conference on Advances in Computing Science: Internet Computing and Modeling, Grid Computing, Peer-to-Peer Computing, and Cluster, pp. 34-46. DOI: 10.5555/646068.676955. Online publication date: 4-Dec-2002
  • (2002) ROS. Proceedings of the 5th international conference on High performance computing for computational science, pp. 664-678. DOI: 10.5555/1766851.1766904. Online publication date: 26-Jun-2002
  • (2002) Shortcut Replay: A Replay Technique for Debugging Long-Running Parallel Programs. Advances in Computing Science — ASIAN 2002, pp. 34-46. DOI: 10.1007/3-540-36184-7_5. Online publication date: 8-Nov-2002
  • (2002) Debugging in Distributed Systems. Encyclopedia of Software Engineering. DOI: 10.1002/0471028959.sof085. Online publication date: 15-Jan-2002
