Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3295500.3356177acmconferencesArticle/Chapter ViewAbstractPublication PagesscConference Proceedingsconference-collections
research-article

BinFI: an efficient fault injector for safety-critical machine learning systems

Published: 17 November 2019 Publication History

Abstract

As machine learning (ML) becomes pervasive in high performance computing, ML has found its way into safety-critical domains (e.g., autonomous vehicles). Thus the reliability of ML has grown in importance. Specifically, failures of ML systems can have catastrophic consequences, and can occur due to soft errors, which are increasing in frequency due to system scaling. Therefore, we need to evaluate ML systems in the presence of soft errors.
In this work, we propose BinFI, an efficient fault injector (FI) for finding the safety-critical bits in ML applications. We find the widely-used ML computations are often monotonic. Thus we can approximate the error propagation behavior of a ML application as a monotonic function. BinFI uses a binary-search like FI technique to pinpoint the safety-critical bits (also measure the overall resilience). BinFI identifies 99.56% of safety-critical bits (with 99.63% precision) in the systems, which significantly outperforms random FI, with much lower costs.

References

[1]
Autonomous and ADAS test cars produce over 11 TB of data per day. https://www.tuxera.com/blog/autonomous-and-adas-test-cars-produce-over-11-tb-of-data-per-day/
[2]
Autonomous Car - A New Driver for Resilient Computing and Design-for-Test. https://nepp.nasa.gov/workshops/etw2016/talks/15WED/20160615-0930-Autonomous_Saxena-Nirmal-Saxena-Rec2016Jun16-nasaNEPP.pdf
[3]
Autumn model in Udacity challenge. https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/autumn
[4]
Cifar dataset. https://www.cs.toronto.edu/~kriz/cifar.html
[5]
comma.ai's steering model. https://github.com/commaai/research
[6]
Driving dataset. https://github.com/SullyChen/driving-datasets
[7]
Epoch model in Udacity challenge. https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/cg23
[8]
Functional Safety Methodologies for Automotive Applications. https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/solutions/automotive-functional-safety-wp.pdf
[9]
Mnist dataset. http://yann.lecun.com/exdb/mnist/
[10]
NVIDIA DRIVE AGX. https://www.nvidia.com/en-us/self-driving-cars/drive-platform/hardware/
[11]
On-road tests for Nvidia Dave system. https://devblogs.nvidia.com/deep-learning-self-driving-cars/
[12]
Rambo. https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/rambo
[13]
Survival dataset. https://archive.ics.uci.edu/ml/datasets/Haberman's+Survival
[14]
Tensorflow Popularity. https://towardsdatascience.com/deep-learning-framework-power-scores-2018-23607ddf297a
[15]
Training AI for Self-Driving Vehicles: the Challenge of Scale. https://devblogs.nvidia.com/training-self-driving-vehicles-challenge-scale/
[16]
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16). 265--283.
[17]
Rizwan A Ashraf, Roberto Gioiosa, Gokcen Kestor, Ronald F DeMara, Chen-Yong Cher, and Pradip Bose. 2015. Understanding the propagation of transient errors in HPC applications. In SC'15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 1--12.
[18]
Subho S Banerjee, Saurabh Jha, James Cyriac, Zbigniew T Kalbarczyk, and Ravishankar K Iyer. 2018. Hands Off the Wheel in Autonomous Vehicles?: A Systems Perspective on over a Million Miles of Field Data. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 586--597.
[19]
Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al. 2016. End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316 (2016).
[20]
Chun-Kai Chang, Sangkug Lym, Nicholas Kelly, Michael B Sullivan, and Mattan Erez. 2018. Evaluating and accelerating high-fidelity error injection for HPC. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis. IEEE Press, 45.
[21]
G Cong, G Domeniconi, J Shapiro, F Zhou, and BY Chen. 2018. Accelerating Deep Neural Network Training for Action Recognition on a Cluster of GPUs. Technical Report. Lawrence Livermore National Lab.(LLNL), Livermore, CA (United States).
[22]
Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. 2014. Training deep neural networks with low precision multiplications. arXiv preprint arXiv:1412.7024 (2014).
[23]
Nathan DeBardeleben, James Laros, John T Daly, Stephen L Scott, Christian Engelmann, and Bill Harrod. 2009. High-end computing resilience: Analysis of issues facing the HEC community and path-forward for research and development. Whitepaper, Dec (2009).
[24]
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. (2009).
[25]
Fernando Fernandes dos Santos, Caio Lunardi, Daniel Oliveira, Fabiano Libano, and Paolo Rech. 2019. Reliability Evaluation of Mixed-Precision Architectures. In 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 238--249.
[26]
Andre Esteva, Brett Kuprel, Roberto A Novoa, Justin Ko, Susan M Swetter, Helen M Blau, and Sebastian Thrun. 2017. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542, 7639 (2017), 115.
[27]
Bo Fang, Karthik Pattabiraman, Matei Ripeanu, and Sudhanva Gurumurthi. 2014. Gpu-qin: A methodology for evaluating the error resilience of gpgpu applications. In 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). IEEE, 221--230.
[28]
Michael S Gashler and Stephen C Ashmore. 2014. Training deep fourier neural networks to fit time-series data. In International Conference on Intelligent Computing. Springer, 48--55.
[29]
Giorgis Georgakoudis, Ignacio Laguna, Dimitrios S Nikolopoulos, and Martin Schulz. 2017. Refine: Realistic fault injection via compiler-based instrumentation for accuracy, portability and speed. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 29.
[30]
Jason George, Bo Marr, Bilge ES Akgul, and Krishna V Palem. 2006. Probabilistic arithmetic and energy efficient embedded signal processing. In Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems. ACM, 158--168.
[31]
Jianmin Guo, Yu Jiang, Yue Zhao, Quan Chen, and Jiaguang Sun. 2018. DLFuzz: differential fuzzing testing of deep learning systems. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 739--743.
[32]
Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. 2015. Deep learning with limited numerical precision. In International Conference on Machine Learning. 1737--1746.
[33]
Siva Kumar Sastry Hari, Sarita V Adve, Helia Naeimi, and Pradeep Ramachandran. 2012. Relyzer: Exploiting application-level fault equivalence to analyze application resiliency to transient faults. In ACM SIGPLAN Notices, Vol. 47. ACM, 123--134.
[34]
Simon Haykin. 1994. Neural networks. Vol. 2. Prentice hall New York.
[35]
Kim Hazelwood, Sarah Bird, David Brooks, Soumith Chintala, Utku Diril, Dmytro Dzhulgakov, Mohamed Fawzy, Bill Jia, Yangqing Jia, Aditya Kalro, et al. 2018. Applied machine learning at Facebook: a datacenter infrastructure perspective. In 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 620--629.
[36]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.
[37]
Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).
[38]
Sanghyun Hong, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, and Tudor Dumitras. 2019. Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks. arXiv preprint arXiv:1906.01017 (2019).
[39]
Sebastian Houben, Johannes Stallkamp, Jan Salmen, Marc Schlipsing, and Christian Igel. 2013. Detection of traffic signs in real-world images: The German Traffic Sign Detection Benchmark. In The 2013 international joint conference on neural networks (IJCNN). IEEE, 1--8.
[40]
Jie S Hu, Feihui Li, Vijay Degalahal, Mahmut Kandemir, Narayanan Vijaykrishnan, and Mary J Irwin. 2005. Compiler-directed instruction duplication for soft error detection. In Design, Automation and Test in Europe. IEEE, 1056--1057.
[41]
Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015).
[42]
Saurabh Jha, Subho S Banerjee, James Cyriac, Zbigniew T Kalbarczyk, and Ravishankar K Iyer. 2018. Avfi: Fault injection for autonomous vehicles. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). IEEE, 55--56.
[43]
Saurabh Jha, Timothy Tsai, Subho Banerjee, Siva Kumar Sastry Hari, Michael Sullivan, Steve Keckler, Zbigniew Kalbarczyk, and Ravishankar Iyer. 2019. ML-based Fault Injection for Autonomous Vehicles: A Case for Bayesian Fault Injection. In 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.
[44]
Kyle D Julian, Jessica Lopez, Jeffrey S Brush, Michael P Owen, and Mykel J Kochenderfer. 2016. Policy compression for aircraft collision avoidance systems. In 2016 IEEE/AIAA 35th Digital Avionics Systems Conference (DASC). IEEE, 1--10.
[45]
Zvi M Kedem, Vincent J Mooney, Kirthi Krishna Muntimadugu, and Krishna V Palem. 2011. An approach to energy-error tradeoffs in approximate ripple carry adders. In Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design. IEEE Press, 211--216.
[46]
Philipp Klaus Krause and Ilia Polian. 2011. Adaptive voltage over-scaling for resilient applications. In 2011 Design, Automation & Test in Europe. IEEE, 1--6.
[47]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105.
[48]
Yann LeCun, Bernhard E Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne E Hubbard, and Lawrence D Jackel. 1990. Handwritten digit recognition with a back-propagation network. In Advances in neural information processing systems. 396--404.
[49]
Guanpeng Li, Siva Kumar Sastry Hari, Michael Sullivan, Timothy Tsai, Karthik Pattabiraman, Joel Emer, and Stephen W Keckler. 2017. Understanding error propagation in deep learning neural network (dnn) accelerators and applications. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 8.
[50]
Guanpeng Li, Karthik Pattabiraman, and Nathan DeBardeleben. 2018. TensorFI: A Configurable Fault Injector for TensorFlow Applications. In 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). IEEE, 313--320.
[51]
Guanpeng Li, Karthik Pattabiraman, Siva Kumar Sastry Hari, Michael Sullivan, and Timothy Tsai. 2018. Modeling soft-error propagation in programs. In 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 27--38.
[52]
Wenchao Li, Susmit Jha, and Sanjit A Seshia. 2013. Generating control logic for optimized soft error resilience. In Proceedings of the 9th Workshop on Silicon Errors in Logic-System Effects (SELSE'13), Palo Alto, CA, USA. Citeseer.
[53]
Robert E Lyons and Wouter Vanderkulk. 1962. The use of triple-modular redundancy to improve computer reliability. IBM journal of research and development 6, 2 (1962), 200--209.
[54]
Lei Ma, Fuyuan Zhang, Jiyuan Sun, Minhui Xue, Bo Li, Felix Juefei-Xu, Chao Xie, Li Li, Yang Liu, Jianjun Zhao, et al. 2018. Deepmutation: Mutation testing of deep learning systems. In 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 100--111.
[55]
Andrew L Maas, Awni Y Hannun, and Andrew Y Ng. 2013. Rectifier nonlinearities improve neural network acoustic models. In Proc. icml, Vol. 30. 3.
[56]
Marisol Monterrubio-Velasco, José Carlos Carrasco-Jimenez, Octavio Castillo-Reyes, Fernando Cucchietti, and Josep De la Puente. 2018. A Machine Learning Approach for Parameter Screening in Earthquake Simulation. In 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). IEEE, 348--355.
[57]
Vinod Nair and Geoffrey E Hinton. 2010. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10). 807--814.
[58]
Nahmsuk Oh, Philip P Shirvani, and Edward J McCluskey. 2002. Control-flow checking by software signatures. IEEE transactions on Reliability 51, 1 (2002), 111--122.
[59]
Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. 2017. Deepxplore: Automated whitebox testing of deep learning systems. In proceedings of the 26th Symposium on Operating Systems Principles. ACM, 1--18.
[60]
Pranav Rajpurkar, Awni Y Hannun, Masoumeh Haghpanahi, Codie Bourn, and Andrew Y Ng. 2017. Cardiologist-level arrhythmia detection with convolutional neural networks. arXiv preprint arXiv:1707.01836 (2017).
[61]
Prajit Ramachandran, Barret Zoph, and Quoc V Le. 2017. Searching for activation functions. arXiv preprint arXiv:1710.05941 (2017).
[62]
Brandon Reagen, Udit Gupta, Lillian Pentecost, Paul Whatmough, Sae Kyu Lee, Niamh Mulholland, David Brooks, and Gu-Yeon Wei. 2018. Ares: A framework for quantifying the resilience of deep neural networks. In 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC). IEEE, 1--6.
[63]
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition. 779--788.
[64]
Joseph Redmon and Ali Farhadi. 2017. YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7263--7271.
[65]
Daniel A Reed and Jack Dongarra. 2015. Exascale computing and big data. Commun. ACM 58, 7 (2015), 56--68.
[66]
Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems. 91--99.
[67]
Abu Hasnat Mohammad Rubaiyat, Yongming Qin, and Homa Alemzadeh. 2018. Experimental resilience assessment of an open-source driving agent. arXiv preprint arXiv:1807.06172 (2018).
[68]
Behrooz Sangchoolie, Karthik Pattabiraman, and Johan Karlsson. 2017. One bit is (not) enough: An empirical study of the impact of single and multiple bit-flip errors. In 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). IEEE, 97--108.
[69]
Siva Kumar Sastry Hari, Radha Venkatagiri, Sarita V Adve, and Helia Naeimi. 2014. GangES: Gang error simulation for hardware resiliency evaluation. ACM SIGARCH Computer Architecture News 42, 3 (2014), 61--72.
[70]
Bianca Schroeder and Garth A Gibson. 2007. Understanding failures in petascale computers. In Journal of Physics: Conference Series, Vol. 78. IOP Publishing, 012022.
[71]
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. nature 529, 7587 (2016), 484.
[72]
Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
[73]
Marc Snir, Robert W Wisniewski, Jacob A Abraham, Sarita V Adve, Saurabh Bagchi, Pavan Balaji, Jim Belak, Pradip Bose, Franck Cappello, Bill Carlson, et al. 2014. Addressing failures in exascale computing. The International Journal of High Performance Computing Applications 28, 2 (2014), 129--173.
[74]
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929--1958.
[75]
Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, and Alexander A Alemi. 2017. Inception-v4, inception-resnet and the impact of residual connections on learning. In Thirty-First AAAI Conference on Artificial Intelligence.
[76]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. 2015. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1--9.
[77]
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2818--2826.
[78]
Yuchi Tian, Kexin Pei, Suman Jana, and Baishakhi Ray. 2018. Deeptest: Automated testing of deep-neural-network-driven autonomous cars. In Proceedings of the 40th international conference on software engineering. ACM, 303--314.
[79]
Jiesheng Wei, Anna Thomas, Guanpeng Li, and Karthik Pattabiraman. 2014. Quantifying the accuracy of high-level fault injection techniques for hardware faults. In 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. IEEE, 375--382.
[80]
Zhaohan Xiong, Martin K Stiles, and Jichao Zhao. 2017. Robust ECG signal classification for detection of atrial fibrillation using a novel neural network. In 2017 Computing in Cardiology (CinC). IEEE, 1--4.
[81]
Hong-Jun Yoon, Arvind Ramanathan, and Georgia Tourassi. 2016. Multi-task deep neural networks for automated extraction of primary site and laterality information from cancer pathology reports. In INNS Conference on Big Data. Springer, 195--204.
[82]
Ming Zhang, Subhasish Mitra, TM Mak, Norbert Seifert, Nicholas J Wang, Quan Shi, Kee Sup Kim, Naresh R Shanbhag, and Sanjay J Patel. 2006. Sequential element design with built-in soft error resilience. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 14, 12 (2006), 1368--1378.

Cited By

View all
  • (2024)ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error DetectionACM Transactions on Architecture and Code Optimization10.1145/367490921:3(1-26)Online publication date: 28-Jun-2024
  • (2024)FKeras: A Sensitivity Analysis Tool for Edge Neural NetworksACM Journal on Autonomous Transportation Systems10.1145/3665334Online publication date: 18-May-2024
  • (2024)Assessing the Impact of Compiler Optimizations on GPUs ReliabilityACM Transactions on Architecture and Code Optimization10.1145/363824921:2(1-22)Online publication date: 12-Jan-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2019
1921 pages
ISBN:9781450362290
DOI:10.1145/3295500
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 November 2019

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. error resilience
  2. fault injection
  3. machine learning

Qualifiers

  • Research-article

Funding Sources

  • Natural Sciences and Engineering Research Council of Canada (NSERC)

Conference

SC '19
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)161
  • Downloads (Last 6 weeks)11
Reflects downloads up to 28 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error DetectionACM Transactions on Architecture and Code Optimization10.1145/367490921:3(1-26)Online publication date: 28-Jun-2024
  • (2024)FKeras: A Sensitivity Analysis Tool for Edge Neural NetworksACM Journal on Autonomous Transportation Systems10.1145/3665334Online publication date: 18-May-2024
  • (2024)Assessing the Impact of Compiler Optimizations on GPUs ReliabilityACM Transactions on Architecture and Code Optimization10.1145/363824921:2(1-22)Online publication date: 12-Jan-2024
  • (2024)MRFI: An Open-Source Multiresolution Fault Injection Framework for Neural Network ProcessingIEEE Transactions on Very Large Scale Integration (VLSI) Systems10.1109/TVLSI.2024.338440432:7(1325-1335)Online publication date: 1-Jul-2024
  • (2024)Artificial Neural Networks for Space and Safety-Critical Applications: Reliability Issues and Potential SolutionsIEEE Transactions on Nuclear Science10.1109/TNS.2024.334995671:4(377-404)Online publication date: May-2024
  • (2024)Silent Data Corruption in Robot Operating System: A Case for End-to-End System-Level Fault Analysis Using Autonomous UAVsIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2023.333229343:4(1037-1050)Online publication date: May-2024
  • (2024)Hardened-TC: A Low-cost Reliability Solution for CNNs Run by Modern GPUs2024 IEEE 37th International System-on-Chip Conference (SOCC)10.1109/SOCC62300.2024.10737768(1-6)Online publication date: 16-Sep-2024
  • (2024)EISFINN: On the Role of Efficient Importance Sampling in Fault Injection Campaigns for Neural Network Robustness Analysis2024 IEEE 30th International Symposium on On-Line Testing and Robust System Design (IOLTS)10.1109/IOLTS60994.2024.10616075(1-3)Online publication date: 3-Jul-2024
  • (2024)SBanTEM: A Novel Methodology for Sparse Band Tensors as Soft-Error Mitigation in Sparse Convolutional Neural Networks2024 IEEE 30th International Symposium on On-Line Testing and Robust System Design (IOLTS)10.1109/IOLTS60994.2024.10616070(1-3)Online publication date: 3-Jul-2024
  • (2024)Approximate Fault-Tolerant Neural Network Systems2024 IEEE European Test Symposium (ETS)10.1109/ETS61313.2024.10567290(1-10)Online publication date: 20-May-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media