research-article
Open access

Application semantic driven assertions toward fault tolerant computing

Published: 01 June 2006

Abstract

    Based on the semantics of an application's processing logic, we identify the most critical and sensitive parts of the application, derive a set of conditions (assertions) among its diagnostic checkpoint variables, and enhance the processing logic so that it can detect operational or environmental faults at run time. This paper examines how a single-version algorithm can establish software-based fault tolerance by designing thoughtful execution-time checks into a computing application. The algorithm relies on assertions and diagnostic checkpoints that are derived from the application's semantics. Errors are detected through these checkpoints and through periodic execution of the application on known test data, with the observed result verified against the known result. Electrical transients or small particles striking the circuit often cause random errors or faults in data and program flow. The manuscript describes an algorithm that detects and recovers from such transient or operational failures for a specific problem, using only one version of a software program running on a single machine. The work is not intended to correct bit errors using conventional error-correcting codes, and it does not aim to tolerate software design bugs. Instead, it detects faults by computing run-time signatures and validating them.
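
    The ideas summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: the application (an insertion sort), the names `checked_sort`, `self_test`, `run_with_recovery` and `AssertionFault`, the known test vector, and recovery by simple re-execution are all assumptions chosen for illustration. The sketch shows assertions derived from what sorting means (order, length, and checksum are preserved), a periodic run on known test data, and a retry when a check fails.

```python
from typing import Callable, List


class AssertionFault(Exception):
    """A semantic assertion on checkpoint variables did not hold (hypothetical name)."""


def checked_sort(data: List[int]) -> List[int]:
    """Insertion sort with checkpoints derived from the semantics of sorting."""
    out = list(data)
    for i in range(1, len(out)):
        key = out[i]
        j = i - 1
        while j >= 0 and out[j] > key:
            out[j + 1] = out[j]
            j -= 1
        out[j + 1] = key
        # Checkpoint inside the loop: the last two items of the processed
        # prefix must be in order once the key has been inserted.
        if out[i - 1] > out[i]:
            raise AssertionFault("prefix not ordered at step %d" % i)
    # Final checkpoints: length and checksum are preserved, output is ordered.
    if len(out) != len(data) or sum(out) != sum(data):
        raise AssertionFault("length or checksum changed")
    if any(out[k] > out[k + 1] for k in range(len(out) - 1)):
        raise AssertionFault("output not ordered")
    return out


# Known test data and its known result (assumed values for illustration).
KNOWN_INPUT = [5, 2, 9, 1]
KNOWN_OUTPUT = [1, 2, 5, 9]


def self_test(sort_fn: Callable[[List[int]], List[int]]) -> None:
    """Run the application on known data and compare with the known result."""
    if sort_fn(list(KNOWN_INPUT)) != KNOWN_OUTPUT:
        raise AssertionFault("known-answer test failed")


def run_with_recovery(sort_fn: Callable[[List[int]], List[int]],
                      data: List[int], max_attempts: int = 3) -> List[int]:
    """Re-execute on a detected fault, assuming the fault is transient."""
    for _ in range(max_attempts):
        try:
            self_test(sort_fn)
            return sort_fn(data)
        except AssertionFault:
            continue  # assumed transient: try again with the same single version
    raise RuntimeError("fault persisted after %d attempts" % max_attempts)
```

    A call such as `run_with_recovery(checked_sort, [4, 4, 0])` runs the known-answer test first and then the checked computation. Re-execution only helps with transient faults; a permanent fault or a design bug would fail every attempt, which matches the scope stated in the abstract.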


    Published In

    Ubiquity, Volume 2006, Issue June
    June 2006
    68 pages
    EISSN: 1530-2180
    DOI: 10.1145/1147991

    Publisher

    Association for Computing Machinery

    New York, NY, United States
