research-article
Open access

Application semantic driven assertions toward fault tolerant computing

Published: 01 June 2006

Abstract

Using the semantics of an application's processing logic, we identify the most critical and sensitive parts of the application, derive a set of conditions (assertions) among its diagnostic checkpoint variables, and enhance the processing logic so that it can detect operational or environmental faults at run time. This paper examines how a single-version algorithm can establish software-based fault tolerance by designing thoughtful execution-time checks into a computing application. The algorithm relies on assertions, placed at diagnostic checkpoints, that are derived from the application's semantics. The work is not intended to correct bit errors using conventional error-correction codes. Instead, errors are detected through the checkpoints and through periodic execution of the application on known test data, with the observed result verified against the known result. Electrical transients or small particles striking the circuit often cause random errors or faults in data and program flow. The manuscript describes an algorithm that detects and recovers from transient or operational failures in software for a specific problem, using only one version of the program running on one machine. The approach does not aim to tolerate software design bugs. It uses run-time signatures, and their validation, to detect faults.
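The general idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm, only an example under stated assumptions: the application (summing non-negative integers), the invariants, the known test data, and every name (`checked_total`, `self_test`, `run_with_recovery`, `FaultDetected`) are hypothetical choices made for illustration. It shows the three ingredients the abstract names: assertions at checkpoints derived from application semantics, a periodic run on known test data, and recovery by re-execution of the single version.

```python
from typing import Callable, Sequence


class FaultDetected(Exception):
    """Raised when a checkpoint assertion or the known-data self-test fails."""


def checked_total(amounts: Sequence[int]) -> int:
    """Sum non-negative integers, validating semantic invariants as it goes."""
    if any(a < 0 for a in amounts):
        raise ValueError("amounts must be non-negative")
    largest = max(amounts, default=0)
    total = 0
    for count, amount in enumerate(amounts, start=1):
        previous = total
        total += amount
        # Checkpoint assertions (assumed invariants for this toy application):
        # the running total never shrinks and never exceeds count * largest.
        if not (previous <= total <= count * largest):
            raise FaultDetected(f"invariant violated at item {count}")
    return total


# Known test data and its known result (hypothetical values).
KNOWN_INPUT = (3, 1, 4, 1, 5)
KNOWN_RESULT = 14


def self_test(compute: Callable[[Sequence[int]], int]) -> None:
    """Run the application on known data and compare with the known result."""
    if compute(KNOWN_INPUT) != KNOWN_RESULT:
        raise FaultDetected("self-test result does not match known result")


def run_with_recovery(
    compute: Callable[[Sequence[int]], int],
    data: Sequence[int],
    retries: int = 3,
) -> int:
    """Treat a detected fault as transient and re-execute the same version."""
    for _ in range(retries):
        try:
            self_test(compute)
            return compute(data)
        except FaultDetected:
            continue  # retry on the same machine with the same code
    raise FaultDetected(f"fault persisted after {retries} attempts")
```

A call such as `run_with_recovery(checked_total, [10, 20, 30])` returns the sum when no fault is detected. A fault that persists across all retries is reported rather than masked, which matches the stated scope: transient faults, not design bugs.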



Published In

Ubiquity, Volume 2006, Issue June
June 2006
68 pages
EISSN: 1530-2180
DOI: 10.1145/1147991

Publisher

Association for Computing Machinery

New York, NY, United States
