DOI: 10.1145/1146909.1146926
Article

Prototyping a fault-tolerant multiprocessor SoC with run-time fault recovery

Published: 24 July 2006

Abstract

Modern integrated circuits (ICs) are becoming increasingly complex, and this complexity makes high-performance ICs difficult to design, manufacture, and integrate. The advent of multiprocessor systems-on-chip (SoCs) makes it even more challenging for programmers to utilize the full potential of the on-chip computation resources. Meanwhile, the complexity of chip design creates new reliability challenges. As a result, chip designers and users cannot fully exploit the tremendous silicon resources on the chip. This research proposes a prototype composed of a fault-tolerant multiprocessor SoC and a coupled single program, multiple data (SPMD) programming framework. We use a SystemC-based modeling and simulation environment to design and analyze this prototype. Our analysis shows that the prototype acts as a reliable computing platform constructed from potentially unreliable chip resources, thus protecting previous investment in hardware and software designs. Moreover, the promising application-driven simulation results shed light on the potential of a scalable and reliable multiprocessing computing platform for a wide range of mission-critical applications.

    Published In

    DAC '06: Proceedings of the 43rd annual Design Automation Conference
    July 2006
    1166 pages
    ISBN:1595933816
    DOI:10.1145/1146909
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fault-tolerance
    2. multiprocessor system
    3. network-on-chip
    4. retargetable simulation
    5. run-time verification
    6. system-on-chip

    Qualifiers

    • Article

    Conference

    DAC06: The 43rd Annual Design Automation Conference 2006
    July 24 - 28, 2006
    San Francisco, CA, USA

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

    Cited By

    • (2018) A Hierarchical and Distributed Fault Tolerant Proposal for NoC-Based MPSoCs. IEEE Transactions on Emerging Topics in Computing, 6:4 (524-537). DOI: 10.1109/TETC.2016.2593640. Online publication date: 1-Oct-2018
    • (2016) Distributed Sensor Network-on-Chip for Performance Optimization of Soft-Error-Tolerant Multiprocessor System-on-Chip. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 24:4 (1546-1559). DOI: 10.1109/TVLSI.2015.2452910. Online publication date: Apr-2016
    • (2016) A layered approach for fault tolerant NoC-based MPSoCs (Special session: Dependable MPSoCs). 2016 17th Latin-American Test Symposium (LATS) (189-194). DOI: 10.1109/LATW.2016.7483367. Online publication date: Apr-2016
    • (2014) On-chip sensor networks for soft-error tolerant real-time multiprocessor systems-on-chip. ACM Journal on Emerging Technologies in Computing Systems, 10:2 (1-20). DOI: 10.1145/2564928. Online publication date: 6-Mar-2014
    • (2014) Runtime fault recovery protocol for NoC-based MPSoCs. Fifteenth International Symposium on Quality Electronic Design (132-139). DOI: 10.1109/ISQED.2014.6783316. Online publication date: Mar-2014
    • (2013) Framework for simulation of heterogeneous MpSoC for design space exploration. VLSI Design, 2013 (11-11). DOI: 10.1155/2013/936181. Online publication date: 1-Jan-2013
    • (2012) An efficient soft error protection scheme for MPSoC and FPGA-based verification. Anti-counterfeiting, Security, and Identification (1-5). DOI: 10.1109/ICASID.2012.6325306. Online publication date: Aug-2012
    • (2011) A Hardware-Software Collaborated Method for Soft-Error Tolerant MPSoC. Proceedings of the 2011 IEEE Computer Society Annual Symposium on VLSI (260-265). DOI: 10.1109/ISVLSI.2011.48. Online publication date: 4-Jul-2011
    • (2011) Matrix control-flow algorithm-based fault tolerance. Proceedings of the 2011 IEEE 17th International On-Line Testing Symposium (37-42). DOI: 10.1109/IOLTS.2011.5993808. Online publication date: 13-Jul-2011
    • (2010) Compiler directed network-on-chip reliability enhancement for chip multiprocessors. ACM SIGPLAN Notices, 45:4 (85-94). DOI: 10.1145/1755951.1755902. Online publication date: 13-Apr-2010
