Checkpointing and Its Applications

Published: 27 June 1995

Abstract

The paper describes our experience with the implementation and applications of the Unix checkpointing library libckp, and identifies two concepts that have proven to be the key to making checkpointing a powerful tool. First, including all persistent state, i.e., user files, as part of the process state that can be checkpointed and recovered provides a truly transparent and consistent rollback. Second, excluding part of the persistent state from the process state allows user programs to process future inputs from a desirable state, which leads to interesting new applications of checkpointing. We use real-life examples to demonstrate the use of libckp for bypassing premature software exits, for fast initialization, and for memory rejuvenation.
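
The two concepts in the abstract can be illustrated with a toy sketch. It is not libckp's interface, which this page does not show; `ToyCheckpoint`, `tracked_files`, `demo`, and the file names are all hypothetical, and a real checkpointing library saves the whole process image rather than a Python dictionary. The sketch only assumes that a checkpoint records in-memory state together with the contents of selected user files, and that any file left out of that selection survives a rollback.

```python
import copy
import os
import tempfile


class ToyCheckpoint:
    """Toy snapshot of in-memory state plus selected user files (not libckp's API)."""

    def __init__(self, state, tracked_files):
        # Volatile state: deep-copied so later mutations do not leak into the snapshot.
        self.state = copy.deepcopy(state)
        # Persistent state: contents of each tracked file at checkpoint time
        # (None records that the file did not exist yet).
        self.files = {}
        for path in tracked_files:
            if os.path.exists(path):
                with open(path, "rb") as f:
                    self.files[path] = f.read()
            else:
                self.files[path] = None

    def rollback(self):
        """Restore tracked files and return a fresh copy of the saved memory state."""
        for path, data in self.files.items():
            if data is None:
                if os.path.exists(path):
                    os.remove(path)
            else:
                with open(path, "wb") as f:
                    f.write(data)
        return copy.deepcopy(self.state)


def demo():
    with tempfile.TemporaryDirectory() as d:
        output = os.path.join(d, "output.txt")  # included in the checkpoint
        hints = os.path.join(d, "hints.txt")    # deliberately excluded
        with open(output, "w") as f:
            f.write("header\n")

        state = {"step": 0}
        ckpt = ToyCheckpoint(state, tracked_files=[output])

        # Work done after the checkpoint touches memory and both files.
        state["step"] = 5
        with open(output, "a") as f:
            f.write("partial result\n")
        with open(hints, "w") as f:
            f.write("avoid input 5\n")

        # Rollback undoes the memory change and the tracked file change,
        # but leaves the excluded file as it was written.
        state = ckpt.rollback()
        with open(output) as f:
            output_text = f.read()
        with open(hints) as f:
            hints_text = f.read()
        return state, output_text, hints_text
```

In this sketch, `output.txt` plays the role of persistent state included in the process state, so the rollback is consistent: the file and the in-memory counter return to the same point. `hints.txt` plays the role of excluded persistent state: it keeps information written after the checkpoint, which the restarted computation could consult when processing later inputs.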

Published In

FTCS '95: Proceedings of the Twenty-Fifth International Symposium on Fault-Tolerant Computing
June 1995

Publisher

IEEE Computer Society

United States

Author Tags

  1. Unix
  2. Unix checkpointing library
  3. bypassed premature software exits
  4. fast initialization
  5. future input processing
  6. libckp
  7. memory rejuvenation
  8. operating systems (computers)
  9. persistent state
  10. process state
  11. recovery
  12. rollback
  13. software fault tolerance
  14. software libraries
  15. system recovery
  16. user files
  17. user programs
