research-article
Open access

Exploiting Existing Comparators for Fine-Grained Low-Cost Error Detection

Published: 27 October 2014
  • Abstract

    Fault tolerance has become a fundamental concern in computer design, alongside performance and power. Although several error detection schemes have been proposed to identify a faulty core in a system, these proposals can waste the whole core, including its many error-free structures, once an error is detected. Moreover, many fault-tolerant designs require additional hardware for replicating data or for comparing the replicated data. In this study, we provide a low-cost, fine-grained error detection scheme that exploits the comparators and data replication already present in several pipeline structures, such as the issue queue (IQ), the rename logic, and the translation lookaside buffer (TLB). We reduce the vulnerability of the source register tags in the IQ by 60%, of the instruction TLB by 64%, of the data TLB by 45%, and of the register tags in the rename logic by 20%.
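
    The general idea of reusing a comparator for error detection can be illustrated with a toy software model. The sketch below is not the paper's design: the tag width, the `QueueEntry` class, the `replica` field, and the `self_check` method are all hypothetical names chosen for illustration. It assumes only that a structure already holds a second copy of a tag and already has an equality comparator that is sometimes idle.

```python
from dataclasses import dataclass

TAG_BITS = 7  # assumed tag width, chosen only for this illustration


def comparator(a: int, b: int) -> bool:
    """Model an equality comparator over TAG_BITS-wide tags."""
    mask = (1 << TAG_BITS) - 1
    return (a & mask) == (b & mask)


def flip_bit(value: int, bit: int) -> int:
    """Model a single-bit upset by inverting one bit of a stored value."""
    return value ^ (1 << bit)


@dataclass
class QueueEntry:
    """Hypothetical queue entry that keeps a tag plus an existing replica."""
    src_tag: int  # tag matched against broadcast tags in normal operation
    replica: int  # assumed second copy of the same tag held elsewhere

    def wakeup(self, broadcast_tag: int) -> bool:
        # Normal use of the comparator: does the broadcast tag match?
        return comparator(self.src_tag, broadcast_tag)

    def self_check(self) -> bool:
        # Reuse of the same comparator when it would otherwise be idle:
        # a mismatch between the tag and its replica signals an error.
        return not comparator(self.src_tag, self.replica)


entry = QueueEntry(src_tag=0b0101101, replica=0b0101101)
clean = entry.self_check()                   # False while both copies agree
entry.src_tag = flip_bit(entry.src_tag, 3)   # inject a single-bit fault
faulty = entry.self_check()                  # True once the copies differ
```

    In this model the check adds no new comparison hardware; it only changes what the existing comparator is fed. Which comparators are idle, and where the replicated data lives, depends on the actual microarchitecture and is not captured here.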


    Cited By

    • (2024) Two-Dimensional Protection Code for Virtual Page Information in Translation Lookaside Buffers. Electronics 13:7 (1320). DOI: 10.3390/electronics13071320. Online publication date: 1-Apr-2024


      Published In

      ACM Transactions on Architecture and Code Optimization  Volume 11, Issue 3
      October 2014
      298 pages
      ISSN:1544-3566
      EISSN:1544-3973
      DOI:10.1145/2658949
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 27 October 2014
      Accepted: 01 July 2014
      Revised: 01 June 2014
      Received: 01 September 2013
      Published in TACO Volume 11, Issue 3


      Author Tags

      1. CAM logic
      2. Reliability

      Qualifiers

      • Research-article
      • Research
      • Refereed
