DOI: 10.1145/2465470.2465471
research-article

Fault-tolerant fault tolerance for component-based automation systems

Published: 17 June 2013

Abstract

To guarantee high availability, automation systems must be fault-tolerant. To this end, they must provide redundant solutions for the critical parts of the system. Classical fault tolerance patterns such as standby or N-modular redundancy provide system stability in the case of a fault. Fault tolerance is subsequently degraded or, depending on the number of deployed replicas, often even unavailable until the system has been repaired.
We introduce a combination of a component-based framework, redundancy patterns, and a runtime manager, which is able to provide fault tolerance, to detect host failures, and to trigger a reconfiguration of the system at runtime. This combined solution maintains system operation in case a fault occurs and automatically restores fault tolerance. The proposed solution is validated using a case study of an industrial distributed automation system. The validation shows how our solution quickly restores fault tolerance without the need for operator intervention or immediate hardware replacement while limiting the impact on other applications.
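
The abstract does not say how the runtime manager is implemented. As a rough sketch of the behavior it names (detect a host failure, then reconfigure so the redundancy level is restored on the remaining hosts), the Python below assumes heartbeat timeouts for detection and a least-loaded placement rule. The class `RuntimeManager`, its methods, and both of those mechanisms are hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class RuntimeManager:
    """Hypothetical manager: tracks host heartbeats and replica placement."""
    timeout: float        # seconds without a heartbeat before a host counts as failed
    replicas_wanted: int  # target redundancy level per component
    last_seen: dict = field(default_factory=dict)  # host -> time of last heartbeat
    placement: dict = field(default_factory=dict)  # component -> set of hosts with a replica

    def heartbeat(self, host, now):
        self.last_seen[host] = now

    def deploy(self, component, hosts):
        self.placement[component] = set(hosts)

    def failed_hosts(self, now):
        # Assumed detection rule: a host is failed once its heartbeat is overdue.
        return {h for h, t in self.last_seen.items() if now - t > self.timeout}

    def load(self, host):
        # Number of replicas currently placed on this host.
        return sum(host in hosts for hosts in self.placement.values())

    def reconfigure(self, now):
        """Drop failed hosts and re-create missing replicas on live hosts.

        Returns the list of (component, new_host) deployments it decided on.
        """
        failed = self.failed_hosts(now)
        for host in failed:
            del self.last_seen[host]
        moves = []
        for component in sorted(self.placement):
            hosts = self.placement[component]
            hosts -= failed  # in-place: updates the stored placement set
            while len(hosts) < self.replicas_wanted:
                # Live hosts not yet running this component, least loaded first.
                candidates = sorted(set(self.last_seen) - hosts,
                                    key=lambda h: (self.load(h), h))
                if not candidates:
                    break  # no spare host left: fault tolerance stays degraded
                hosts.add(candidates[0])
                moves.append((component, candidates[0]))
        return moves
```

With three hosts and a component replicated on two of them, losing one replica host leads `reconfigure` to place a new replica on the spare host, so the component is again two-fold redundant without any hardware being replaced.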

Published In

ISARCS '13: Proceedings of the 4th international ACM Sigsoft symposium on Architecting critical systems
June 2013
68 pages
ISBN: 9781450321235
DOI: 10.1145/2465470

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. component-based systems
  2. distributed systems
  3. fault tolerance
  4. modular automation architecture

Qualifiers

  • Research-article

Conference

Comparch '13

Acceptance Rates

ISARCS '13 Paper Acceptance Rate 7 of 12 submissions, 58%;
Overall Acceptance Rate 14 of 30 submissions, 47%
