DOI: 10.1145/2516821.2516842
research-article

Static probabilistic worst case execution time estimation for architectures with faulty instruction caches

Published: 16 October 2013

Abstract

Semiconductor technology evolution suggests that permanent failure rates will increase dramatically with scaling, in particular for SRAM cells. While well-known approaches such as error-correcting codes exist to recover from failures and provide fault-free chips, they will no longer be affordable in the future due to their non-scalable cost. Consequently, other approaches such as fine-grain disabling and reconfiguration of hardware elements (e.g., individual functional units or cache blocks) will become economically necessary. This fine-grain disabling will degrade performance compared to fault-free execution.
To the best of our knowledge, all static worst-case execution time (WCET) estimation methods assume fault-free architectures. Their results are no longer safe when fine-grain disabling of hardware components degrades performance. In this paper we provide the first method that statically calculates a probabilistic WCET bound in the presence of permanent faults in instruction caches. Given a program, a cache configuration, and a probability of cell failure, the proposed method derives a probabilistic WCET bound. Because it relies on static analysis, the method is guaranteed to identify the longest program path; its probabilistic nature stems only from the presence of faults. The method is computationally tractable because it does not require an exhaustive enumeration of the possible locations of faulty cache blocks. Experimental results show that it provides WCET estimates very close to, but never below, those of a method that derives probabilistic WCETs by enumerating all possible locations of faulty cache blocks. The proposed method not only quantifies the impact of permanent faults on WCET estimates but can also be used in architectural exploration frameworks to select the most appropriate fault management mechanisms.
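
To give a feel for how a cell failure probability and a cache configuration can be turned into probabilities over faulty cache blocks, here is a minimal Python sketch. It is not the analysis from the paper: it assumes that cell failures are independent and that a block is disabled as soon as any of its cells is faulty, and the function names and example parameters (`p_cell`, `bits_per_block`, `ways`) are hypothetical.

```python
from math import comb
from typing import List


def block_failure_probability(p_cell: float, bits_per_block: int) -> float:
    """Probability that a cache block has at least one faulty cell.

    Assumption (not from the source): cells fail independently, and a
    single faulty cell is enough to disable the whole block.
    """
    return 1.0 - (1.0 - p_cell) ** bits_per_block


def faulty_ways_distribution(p_block: float, ways: int) -> List[float]:
    """Binomial distribution of the number of faulty blocks in one cache set.

    Entry k is the probability that exactly k of the `ways` blocks of a
    set are faulty, assuming blocks fail independently.
    """
    return [
        comb(ways, k) * p_block ** k * (1.0 - p_block) ** (ways - k)
        for k in range(ways + 1)
    ]


if __name__ == "__main__":
    # Hypothetical configuration: 4-way cache, 32-byte blocks (256 data bits).
    p_block = block_failure_probability(p_cell=1e-4, bits_per_block=256)
    for k, prob in enumerate(faulty_ways_distribution(p_block, ways=4)):
        print(f"P({k} faulty ways in a set) = {prob:.3e}")
```

A per-set distribution of this kind avoids listing every possible placement of faulty blocks across the cache, which is the kind of exhaustive enumeration the abstract says the proposed method does not need.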

Published In

RTNS '13: Proceedings of the 21st International Conference on Real-Time Networks and Systems
October 2013
298 pages
ISBN: 9781450320580
DOI: 10.1145/2516821

Sponsors

  • CNRS: Centre National de la Recherche Scientifique
  • INRIA: Institut National de Recherche en Informatique et en Automatique

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

RTNS 2013
Sponsor:
  • CNRS
  • INRIA

Acceptance Rates

RTNS '13 Paper Acceptance Rate 29 of 62 submissions, 47%;
Overall Acceptance Rate 119 of 255 submissions, 47%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 9
  • Downloads (last 6 weeks): 2
Reflects downloads up to 23 Dec 2024

Cited By

  • (2022) Mixed-Criticality Scheduling Upon Permitted Failure Probability and Dynamic Priority. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:1 (62-75). DOI: 10.1109/TCAD.2021.3053232. Online publication date: Jan-2022
  • (2019) Probabilistic timing analysis of time-randomised caches with fault detection mechanisms. IET Computers & Digital Techniques. DOI: 10.1049/iet-cdt.2018.5043. Online publication date: 7-Jan-2019
  • (2017) Aging Assessment and Design Enhancement of Randomized Cache Memories. IEEE Transactions on Device and Materials Reliability, 17:1 (32-41). DOI: 10.1109/TDMR.2017.2654548. Online publication date: Mar-2017
  • (2017) Static probabilistic timing analysis with a permanent fault detection mechanism. 2017 12th IEEE International Symposium on Industrial Embedded Systems (SIES) (1-10). DOI: 10.1109/SIES.2017.7993373. Online publication date: Jun-2017
  • (2017) On the Criticality of Probabilistic Worst-Case Execution Time Models. Dependable Software Engineering. Theories, Tools, and Applications (59-74). DOI: 10.1007/978-3-319-69483-2_4. Online publication date: 17-Oct-2017
  • (2016) Schedulability analysis of dependent probabilistic real-time tasks. Proceedings of the 24th International Conference on Real-Time Networks and Systems (99-107). DOI: 10.1145/2997465.2997499. Online publication date: 19-Oct-2016
  • (2016) Static probabilistic timing analysis in presence of faults. 2016 11th IEEE Symposium on Industrial Embedded Systems (SIES) (1-10). DOI: 10.1109/SIES.2016.7509422. Online publication date: May-2016
  • (2016) Resilient random modulo cache memories for probabilistically-analyzable real-time systems. 2016 IEEE 22nd International Symposium on On-Line Testing and Robust System Design (IOLTS) (27-32). DOI: 10.1109/IOLTS.2016.7604666. Online publication date: Jul-2016
  • (2016) Effects of online fault detection mechanisms on Probabilistic Timing Analysis. 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT) (41-46). DOI: 10.1109/DFT.2016.7684067. Online publication date: Sep-2016
  • (2015) Iterative robust multiprocessor scheduling. Proceedings of the 23rd International Conference on Real Time and Networks Systems (23-32). DOI: 10.1145/2834848.2834857. Online publication date: 4-Nov-2015
