DOI: 10.1145/1878961.1878970
research-article

Hardware/software optimization of error detection implementation for real-time embedded systems

Published: 24 October 2010

Abstract

This paper presents an approach to system-level optimization of error detection implementation in fault-tolerant real-time distributed embedded systems used for safety-critical applications. An application is modeled as a set of processes that communicate by messages. Processes are mapped onto computation nodes connected to the communication infrastructure. To provide resilience against transient faults, efficient error detection and recovery techniques have to be employed. Our main focus in this paper is the efficient implementation of the error detection mechanisms. We have developed techniques to optimize the hardware/software implementation of error detection so as to minimize the global worst-case schedule length while meeting the imposed hardware cost constraints and tolerating multiple transient faults. We present two design optimization algorithms that can find feasible solutions given a limited amount of resources: the first assumes that, when implemented in hardware, error detection is deployed on statically reconfigurable FPGAs, while the second considers the partial dynamic reconfiguration capabilities of the FPGAs.
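
To make the optimization problem in the abstract concrete, the following is a minimal sketch of a heavily simplified version of it, and not the paper's algorithms. It assumes a single computation node that runs all processes sequentially, no messages, and a toy fault model in which tolerating k transient faults adds k re-executions of the slowest process to the schedule. The names `Impl`, `worst_case_length`, `best_assignment`, and all numbers in `EXAMPLE` are hypothetical. Each process has alternative error detection implementations (software-only or hardware-assisted), each with a worst-case execution time and an FPGA area cost, and an exhaustive search picks the combination with the shortest worst-case schedule that fits the hardware budget.

```python
from itertools import product
from typing import NamedTuple


class Impl(NamedTuple):
    """One candidate error detection implementation for a process (hypothetical model)."""
    name: str     # e.g. "SW" or "HW"
    wcet: int     # worst-case execution time, including detection overhead
    hw_cost: int  # FPGA area units consumed by this implementation


def worst_case_length(choice, k):
    """Toy schedule length: sequential execution plus k re-executions
    of the slowest process (an assumed, simplified fault model)."""
    return sum(i.wcet for i in choice) + k * max(i.wcet for i in choice)


def best_assignment(processes, hw_budget, k):
    """Exhaustively try every combination of implementations and return
    (length, {process: implementation name}) for the shortest feasible one,
    or None if no combination fits the hardware budget."""
    best = None
    for choice in product(*processes.values()):
        if sum(i.hw_cost for i in choice) > hw_budget:
            continue  # violates the hardware cost constraint
        length = worst_case_length(choice, k)
        if best is None or length < best[0]:
            best = (length, dict(zip(processes, (i.name for i in choice))))
    return best


# Illustrative data only; these numbers are not taken from the paper.
EXAMPLE = {
    "P1": [Impl("SW", 100, 0), Impl("HW", 60, 20)],
    "P2": [Impl("SW", 80, 0), Impl("HW", 50, 15)],
    "P3": [Impl("SW", 120, 0), Impl("HW", 70, 30)],
}

if __name__ == "__main__":
    print(best_assignment(EXAMPLE, hw_budget=45, k=1))
```

Exhaustive search grows exponentially with the number of processes, so it is only practical for toy inputs; a realistic instance with mapping, communication, and FPGA reconfiguration would need heuristics of the kind the abstract refers to.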

            Published In

            CODES/ISSS '10: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
            October 2010
            348 pages
            ISBN:9781605589053
            DOI:10.1145/1878961
            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Sponsors

            In-Cooperation

            • CEDA
            • IEEE CAS
            • IEEE CS

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Author Tags

            1. embedded systems
            2. error detection
            3. fault tolerance
            4. fpga
            5. hw/sw codesign
            6. reconfigurable systems
            7. system-level optimization

            Qualifiers

            • Research-article

            Conference

            ESWeek '10: Sixth Embedded Systems Week
            October 24 - 29, 2010
            Scottsdale, Arizona, USA

            Acceptance Rates

            Overall Acceptance Rate 280 of 864 submissions, 32%

            Article Metrics

            • Downloads (last 12 months): 6
            • Downloads (last 6 weeks): 0

            Reflects downloads up to 12 Sep 2024

            Cited By

            • (2020) Toward Efficient Design Space Exploration for Fault-Tolerant Multiprocessor Systems. IEEE Transactions on Evolutionary Computation, 24:1 (157-169). DOI: 10.1109/TEVC.2019.2912726. Online publication date: Feb-2020
            • (2020) Integrating Online Safety-related Memory Tests in Multicore Real-Time Systems. 2020 IEEE Real-Time Systems Symposium (RTSS) (296-307). DOI: 10.1109/RTSS49844.2020.00035. Online publication date: Dec-2020
            • (2019) Multi-objective redundancy hardening with optimal task mapping for independent tasks on multi-cores. Soft Computing. DOI: 10.1007/s00500-019-03937-0. Online publication date: 27-Mar-2019
            • (2018) A framework for reliability-aware embedded system design on multiprocessor platforms. Microprocessors & Microsystems, 38:6 (539-551). DOI: 10.1016/j.micpro.2014.02.007. Online publication date: 28-Dec-2018
            • (2017) A Majority-Based Reliability-Aware Task Mapping in High-Performance Homogenous NoC Architectures. ACM Transactions on Embedded Computing Systems, 17:1 (1-31). DOI: 10.1145/3131273. Online publication date: 6-Dec-2017
            • (2016) A Majority-Based Reliability-Aware Task-Mapping in High-Performance Homogenous NoC Architectures. 2016 Euromicro Conference on Digital System Design (DSD) (479-486). DOI: 10.1109/DSD.2016.28. Online publication date: Aug-2016
            • (2014) Embedded software reliability for unreliable hardware. Proceedings of the 14th International Conference on Embedded Software (1-1). DOI: 10.1145/2656045.2661649. Online publication date: 12-Oct-2014
            • (2013) Reliability-Driven System-Level Synthesis for Mixed-Critical Embedded Systems. IEEE Transactions on Computers, 62:12 (2489-2502). DOI: 10.1109/TC.2012.226. Online publication date: 1-Dec-2013
            • (2012) Co-design techniques for distributed real-time embedded systems with communication security constraints. Proceedings of the Conference on Design, Automation and Test in Europe (947-952). DOI: 10.5555/2492708.2492945. Online publication date: 12-Mar-2012
            • (2012) Towards fault-tolerant embedded systems with imperfect fault detection. Proceedings of the 49th Annual Design Automation Conference (188-196). DOI: 10.1145/2228360.2228398. Online publication date: 3-Jun-2012
