Abstract
In this paper, we present an analytical methodology to measure the vulnerability of the memory components of a microprocessor-based computing system. It is based on the lifetime and residence of data and instructions. The proposed approach considers only the software layer of the system, which makes it usable at an early design stage, when the hardware architecture is not yet fully defined. To account for the hardware memory hierarchy (i.e., RAM, caches, and register files) at the software level, we have developed a memory subsystem emulator that can be easily configured to support different features. The methodology enables a fast, easy, and low-cost cache-aware Design Space Exploration (DSE) that accurately evaluates the vulnerability of the RAM and the caches. A first set of experiments, run on MiBench benchmarks, confirms that such a DSE can accurately evaluate the effects of faults in both the RAM and the caches. In addition, we validate the proposed approach on a real industrial test case, a Flight Management System for avionic applications. The results show that the proposed methodology gives precise results compared to a classical fault injection tool and that it scales well with the complexity of the application.
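To make the lifetime idea concrete, the following is a minimal sketch of one common way to estimate memory vulnerability from an access trace. It is not the tool described in the paper. It assumes that a fault in a memory word matters only if the next access to that word is a read, so the exposed time of a word is the sum of the intervals that end in a read. The trace format, the function names `vulnerable_cycles` and `memory_vulnerability`, and the normalization by `num_words * total_cycles` are all assumptions made for illustration.

```python
from collections import defaultdict


def vulnerable_cycles(accesses):
    """Sum the intervals that end in a read for one memory word.

    `accesses` is a list of (cycle, kind) pairs sorted by cycle, where
    kind is "W" (write) or "R" (read). Assumption: the word only becomes
    live at its first access, so time before that access is not counted.
    """
    total = 0
    prev_cycle = None
    for cycle, kind in accesses:
        # A fault that occurred since the previous access is consumed
        # only if this access reads the (possibly corrupted) value.
        if kind == "R" and prev_cycle is not None:
            total += cycle - prev_cycle
        prev_cycle = cycle
    return total


def memory_vulnerability(trace, num_words, total_cycles):
    """Fraction of (word, cycle) pairs in which a fault would be read.

    `trace` is an iterable of (cycle, address, kind) tuples (hypothetical
    format); `num_words` is the size of the memory under study.
    """
    per_word = defaultdict(list)
    for cycle, address, kind in trace:
        per_word[address].append((cycle, kind))
    exposed = sum(vulnerable_cycles(sorted(a)) for a in per_word.values())
    return exposed / (num_words * total_cycles)


# Hypothetical trace: word 0x10 is written then read twice over;
# word 0x14 is only overwritten, so faults in it are always masked.
trace = [
    (0, 0x10, "W"), (4, 0x10, "R"), (6, 0x10, "W"), (10, 0x10, "R"),
    (2, 0x14, "W"), (9, 0x14, "W"),
]
factor = memory_vulnerability(trace, num_words=2, total_cycles=10)
```

In this toy trace, word 0x10 is exposed for 8 of the 10 cycles and word 0x14 for none, so `factor` is 8 / 20 = 0.4. Modeling caches or register files would require tracking residence in each level as well, which this sketch deliberately leaves out.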












Responsible Editor: S. Hamdioui
Kooli, M., Di Natale, G. & Bosio, A. Memory-Aware Design Space Exploration for Reliability Evaluation in Computing Systems. J Electron Test 35, 145–162 (2019). https://doi.org/10.1007/s10836-019-05785-0
DOI: https://doi.org/10.1007/s10836-019-05785-0