Abstract
For systematic fault injection (FI), we deterministically re-execute a program, introduce faults, and observe the program outcome to assess its resilience in the presence of transient hardware faults. For this, simulation-assisted ISA-level FI provides a good trade-off between result quality and the time required to execute the FI campaign. However, it requires, for each architecture, a specialized ISA simulator with tracing, injection, and error-observation capabilities. This dependency not only raises the bar for exploring ISA-level hardening mechanisms; such a simulator can also deviate from the behavior of the actual hardware, especially when an error propagates through the system and triggers semantic edge cases.
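To make the campaign procedure concrete, the following minimal sketch runs an exhaustive single-bit-flip campaign on a toy four-register machine. The machine, its two instructions (`li`, `add`), the example program, and the outcome labels are all hypothetical and chosen only for illustration; they are not part of SailFAIL or FAIL*.

```python
from collections import Counter

WIDTH = 8                      # toy register width in bits (assumption)
MASK = (1 << WIDTH) - 1

# Hypothetical straight-line program: r0 = 5 + 7, then r1 is overwritten.
PROGRAM = [
    ("li", 1, 5),    # r1 <- 5
    ("li", 2, 7),    # r2 <- 7
    ("add", 0, 1),   # r0 <- r0 + r1
    ("add", 0, 2),   # r0 <- r0 + r2
    ("li", 1, 0),    # r1 <- 0
]

def run(program, fault=None):
    """Deterministically execute the program and return r0.

    fault is an optional (step, reg, bit) triple: the given register
    bit is flipped right before the instruction at that step executes.
    """
    regs = [0, 0, 0, 0]
    for step, (op, dst, src) in enumerate(program):
        if fault is not None and fault[0] == step:
            regs[fault[1]] ^= 1 << fault[2]
        if op == "li":
            regs[dst] = src & MASK
        elif op == "add":
            regs[dst] = (regs[dst] + regs[src]) & MASK
    return regs[0]

GOLDEN = run(PROGRAM)          # fault-free reference outcome

def classify(fault):
    """Compare a faulty run against the golden run."""
    return "benign" if run(PROGRAM, fault) == GOLDEN else "sdc"

def campaign():
    """Inject every (step, register, bit) fault once and count outcomes."""
    outcomes = Counter()
    for step in range(len(PROGRAM)):
        for reg in range(4):
            for bit in range(WIDTH):
                outcomes[classify((step, reg, bit))] += 1
    return outcomes

if __name__ == "__main__":
    print(GOLDEN, dict(campaign()))
```

Even this toy campaign needs one full re-execution per fault-space coordinate, which is why pruning the fault space matters for realistic programs.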
With SailFAIL, we propose a model-driven approach to derive FI platforms from Sail models, which formally describe the ISA semantics. Based on two existing Sail models (RISC-V, CHERI RISC-V) and one newly introduced model (AVR), we use the Sail toolchain to derive emulators that we combine with the FAIL* framework into multiple new FI platforms. Furthermore, we extend Sail to automatically introduce bit-wise dynamic register tracing into the emulator. The harvested bit-wise access information lets us improve the well-known def-use pruning technique, which further reduces the number of necessary injections by up to 19%.
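The effect of bit-wise access information on def-use pruning can be sketched as follows. Def-use pruning needs one pilot injection per bit and per interval that ends in a read; intervals that end in a write are known to be benign. The trace format (`Access`), the 8-bit register width, and the example trace below are assumptions made for illustration only; this is not SailFAIL's implementation, and the numbers it produces say nothing about the reduction reported above.

```python
from typing import NamedTuple

WIDTH = 8                      # toy register width in bits (assumption)
FULL = (1 << WIDTH) - 1

class Access(NamedTuple):
    time: int                  # dynamic instruction count
    kind: str                  # "R" (read) or "W" (write)
    reg: str                   # register name
    mask: int                  # bits actually touched by the access

def pilots(trace, bitwise):
    """Return the def-use equivalence classes that need a pilot injection.

    Each class is (reg, bit, start, end): all faults in that bit between
    start and end behave like one injection right before the read at end.
    With bitwise=False every access is widened to the full register,
    which models tracing without per-bit information.
    """
    last_access = {}           # (reg, bit) -> time of the previous access
    classes = []
    for acc in sorted(trace, key=lambda a: a.time):
        mask = acc.mask if bitwise else FULL
        for bit in range(WIDTH):
            if not (mask >> bit) & 1:
                continue       # bit untouched: its interval stays open
            start = last_access.get((acc.reg, bit), 0)
            if acc.kind == "R":
                classes.append((acc.reg, bit, start, acc.time))
            # An interval ending in a write is benign: no pilot needed.
            last_access[(acc.reg, bit)] = acc.time
    return classes

# Hypothetical trace: r1 is defined, partially read twice, then redefined.
TRACE = [
    Access(1, "W", "r1", 0xFF),
    Access(3, "R", "r1", 0x0F),   # only the low nibble is consumed
    Access(5, "R", "r1", 0x01),   # only bit 0 is tested
    Access(6, "W", "r1", 0xFF),
]

if __name__ == "__main__":
    print(len(pilots(TRACE, bitwise=False)), len(pilots(TRACE, bitwise=True)))
```

In this sketch, bits that are written but never read before the next write produce no pilot at all once the masks are known, whereas register-granular tracing must treat every read as consuming all bits.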
Notes
- 1.
- 2. FAIL* splits up an access to a 32-bit register into four 1-byte accesses.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dietrich, C., Bargholz, M., Loeck, Y., Budoj, M., Nedaskowskij, L., Lohmann, D. (2022). SailFAIL: Model-Derived Simulation-Assisted ISA-Level Fault-Injection Platforms. In: Trapp, M., Saglietti, F., Spisländer, M., Bitsch, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2022. Lecture Notes in Computer Science, vol 13414. Springer, Cham. https://doi.org/10.1007/978-3-031-14835-4_14
DOI: https://doi.org/10.1007/978-3-031-14835-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14834-7
Online ISBN: 978-3-031-14835-4
eBook Packages: Computer Science, Computer Science (R0)