Software Fault Tolerance for Cyber-Physical Systems via Full System Restart

Published: 03 August 2020

Abstract

This article addresses the reliability of complex embedded control systems in safety-critical environments. We propose a novel approach to designing a controller that (i) guarantees the safety of nonlinear physical systems, (ii) enables safe system restart during runtime, and (iii) allows the use of complex, unverified controllers (e.g., neural networks) that drive the physical systems toward complex specifications. We use an abstraction-based controller synthesis approach to design a formally verified controller that provides application- and system-level fault tolerance along with a safety guarantee. Moreover, our approach is implementable using a commercial off-the-shelf (COTS) processing unit. To demonstrate the efficacy of our solution and to verify the safety of the system under various types of faults injected into applications and into the underlying real-time operating system (RTOS), we implemented the proposed controller for an inverted pendulum and a three-degrees-of-freedom (3-DOF) helicopter.
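
To make the idea in the abstract concrete, the following is a minimal sketch of a restart-aware safety switch, not the authors' synthesized controller. It assumes a hypothetical scalar plant x[k+1] = A*x[k] + B*u[k], a hand-written fallback in place of an abstraction-based verified controller, and a restart modeled as a fixed number of steps with zero input. Every name and constant here (A, B, U_MAX, X_MAX, RESTART_STEPS, choose_input, faulty_controller) is an illustrative assumption.

```python
# Hypothetical scalar plant x[k+1] = A*x[k] + B*u[k]; every constant is an assumption.
A, B = 1.05, 0.1        # A > 1: the state drifts away from 0 when left uncontrolled
U_MAX = 1.0             # actuator limit, |u| <= U_MAX
X_MAX = 1.0             # the safe set is |x| <= X_MAX
RESTART_STEPS = 3       # steps with zero input while the platform reboots


def clamp(u):
    """Saturate an input to the actuator limits."""
    return max(-U_MAX, min(U_MAX, u))


def step(x, u):
    """Advance the plant by one step under a saturated input."""
    return A * x + B * clamp(u)


def safety_controller(x):
    """Trusted fallback: push the state toward the origin at full authority."""
    if x > 0:
        return -U_MAX
    if x < 0:
        return U_MAX
    return 0.0


def survives_restart(x):
    """True if |x| is still within X_MAX after RESTART_STEPS steps of zero input."""
    return abs(x) * A ** RESTART_STEPS <= X_MAX


def choose_input(x, u_complex):
    """Accept the unverified input only if the next state could ride out a restart."""
    if survives_restart(step(x, u_complex)):
        return clamp(u_complex)
    return safety_controller(x)


def faulty_controller(x):
    """Stand-in for a misbehaving complex controller: always pushes outward."""
    return U_MAX


def simulate(x0, n_steps, restart_at=None, protected=True):
    """Return the state trajectory; a restart applies zero input for RESTART_STEPS steps."""
    xs = [x0]
    x = x0
    for k in range(n_steps):
        rebooting = restart_at is not None and restart_at <= k < restart_at + RESTART_STEPS
        if rebooting:
            u = 0.0                                   # no controller runs during the reboot
        elif protected:
            u = choose_input(x, faulty_controller(x))
        else:
            u = faulty_controller(x)
        x = step(x, u)
        xs.append(x)
    return xs


if __name__ == "__main__":
    guarded = simulate(0.5, 30, restart_at=4)
    unguarded = simulate(0.5, 30, protected=False)
    print("guarded max |x|:", max(abs(x) for x in guarded))
    print("unguarded max |x|:", max(abs(x) for x in unguarded))
```

In this toy setting the switch keeps the state inside the safe set across one restart even though the complex controller is faulty, while the same controller run unguarded leaves the safe set. The sketch does not cover a second restart that begins before the state has returned to the restart-tolerant region, and it says nothing about how the article computes its safe sets.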

Published In

ACM Transactions on Cyber-Physical Systems  Volume 4, Issue 4
Special Issue on Self-Awareness in Resource Constrained CPS and Regular Papers
October 2020
293 pages
ISSN:2378-962X
EISSN:2378-9638
DOI:10.1145/3407233
Editor: Tei-Wei Kuo
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 August 2020
Accepted: 01 June 2020
Revised: 01 March 2020
Received: 01 December 2018
Published in TCPS Volume 4, Issue 4

Author Tags

  1. Cyber-physical systems
  2. abstraction-based control
  3. fault-tolerance
  4. full system restart
  5. nonlinear systems

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 198
  • Downloads (Last 6 weeks): 28
Reflects downloads up to 09 Nov 2024

Citations

Cited By

  • (2024) Integrating Graceful Degradation and Recovery through Requirement-driven Adaptation. Proceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems, 122-132. DOI: 10.1145/3643915.3644090. Online publication date: 15-Apr-2024
  • (2024) Experimentation and Implementation of the BFT++ Cyber-Attack Resilience Mechanism for Cyber-Physical Systems. ACM Transactions on Cyber-Physical Systems 8:3, 1-25. DOI: 10.1145/3639570. Online publication date: 19-Jan-2024
  • (2023) Towards safe AI. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, 15340-15349. DOI: 10.1609/aaai.v37i12.26789. Online publication date: 7-Feb-2023
  • (2023) Autonomous Exploration Using Ground Robots with Safety Guarantees. 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 9745-9750. DOI: 10.1109/IROS55552.2023.10341929. Online publication date: 1-Oct-2023
  • (2023) Algebraically explainable controllers: decision trees and support vector machines join forces. International Journal on Software Tools for Technology Transfer (STTT) 25:3, 249-266. DOI: 10.1007/s10009-023-00716-z. Online publication date: 1-Jun-2023
  • (2023) Bounded DBM-based clock state construction for timed automata in Uppaal. International Journal on Software Tools for Technology Transfer (STTT) 25:1, 19-47. DOI: 10.1007/s10009-022-00667-x. Online publication date: 1-Feb-2023
  • (2022) Secure Reboots for Real-Time Cyber-Physical Systems. Proceedings of the 4th Workshop on CPS & IoT Security and Privacy, 27-33. DOI: 10.1145/3560826.3563384. Online publication date: 7-Nov-2022
  • (2021) Work-in-Progress: Enabling Secure Boot for Real-Time Restart-Based Cyber-Physical Systems. 2021 IEEE Real-Time Systems Symposium (RTSS), 524-527. DOI: 10.1109/RTSS52674.2021.00056. Online publication date: Dec-2021
  • (2020) Reboot-Oriented IoT: Life Cycle Management in Trusted Execution Environment for Disposable IoT devices. Proceedings of the 36th Annual Computer Security Applications Conference, 428-441. DOI: 10.1145/3427228.3427293. Online publication date: 7-Dec-2020
  • Hybrid Modular Redundancy: Exploring Modular Redundancy Approaches in RISC-V Multi-Core Computing Clusters for Reliable Processing in Space. ACM Transactions on Cyber-Physical Systems. DOI: 10.1145/3635161
