DOI: 10.5555/645610.661371

Hardware-Software Co-Reliability in Field Reconfigurable Multi-Processor-Memory Systems

Published: 15 April 2002

Abstract

Advances in field-reconfigurable technology have made it possible to design and implement highly flexible parallel multi-processor-memory systems. Reliability is an important measure of such systems because the degradation of a single module can unacceptably impair the operation of the whole system. System reliability is determined mainly by the hardware (HW) configurations requested by the software (SW) and by the process of field reconfiguration and repair, which uses unused processors and memory modules as spares. This is referred to as HW/SW Co-reliability. System configurations are categorized by parallel processor size and by processor/memory intensity, both of which affect HW/SW Co-reliability, and the characteristics of each category are discussed. A model for HW/SW Co-reliability of field-reconfigurable multi-processor-memory systems, based on combinatorial analysis, is then proposed and validated by extensive parametric simulations, supporting the design and implementation of highly reliable systems of this kind.
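
The abstract does not give the paper's model, so the following is only a minimal sketch of what a combinatorial treatment of spares can look like, not the authors' formulation. It assumes, purely for illustration, that modules fail independently, that all processors share one reliability value and all memory modules another, and that a configuration survives as long as enough working modules of each kind remain, with unused modules acting as spares. The names `k_out_of_n` and `co_reliability` are hypothetical.

```python
from math import comb


def k_out_of_n(k: int, n: int, r: float) -> float:
    """Probability that at least k of n independent modules work.

    Assumption (not from the paper): every module has the same
    reliability r and modules fail independently.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    if not 0.0 <= r <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    # Sum the binomial probabilities of exactly i working modules, i = k..n.
    return sum(comb(n, i) * r**i * (1 - r) ** (n - i) for i in range(k, n + 1))


def co_reliability(req_proc: int, total_proc: int, r_proc: float,
                   req_mem: int, total_mem: int, r_mem: float) -> float:
    """Illustrative reliability of a SW-requested HW configuration.

    The software requests req_proc processors and req_mem memory modules;
    the remaining modules serve as spares. Processor and memory failures
    are assumed independent, so the two terms multiply.
    """
    return (k_out_of_n(req_proc, total_proc, r_proc)
            * k_out_of_n(req_mem, total_mem, r_mem))


if __name__ == "__main__":
    # A small, processor-intensive request leaves many spares ...
    print(co_reliability(4, 16, 0.95, 2, 8, 0.98))
    # ... while a request for every module leaves none.
    print(co_reliability(16, 16, 0.95, 8, 8, 0.98))
```

Under these assumptions, a request that uses every module has no spares and its reliability collapses to the product of all module reliabilities, while a smaller request on the same hardware is more reliable. This is one way to see why configuration size and processor/memory intensity would affect the result.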

Published In

IPDPS '02: Proceedings of the 16th International Parallel and Distributed Processing Symposium
April 2002
ISBN: 0769515738

Publisher

IEEE Computer Society

United States
