DOI: 10.1145/1878961.1879014
research-article

A task remapping technique for reliable multi-core embedded systems

Published: 24 October 2010

Abstract

With the continued scaling of semiconductor technology, circuit lifetime is decreasing, so processor failure has become an important issue in MPSoC design. A software solution for tolerating run-time processor failure is to migrate tasks from the failed processors to the live processors when a failure occurs. Previous work on run-time task migration usually aims to minimize the migration overhead, with or without a given latency constraint. For streaming applications, however, minimizing throughput degradation is more important than minimizing the migration overhead or the latency. Hence, we propose a task remapping technique that minimizes throughput degradation, assuming that the migration overhead can be safely amortized. The target multi-core system assumed in this paper consists of processor pools, each of which consists of homogeneous processors. The proposed technique is based on an intensive compile-time analysis of all possible failure scenarios. It involves the following steps: 1) determine the static mapping of tasks onto the live processors, aiming to minimize throughput degradation; 2) find an optimal processor-to-processor mapping that minimizes the task migration overhead; and 3) store the resulting task remapping information, which includes the task mapping and processor-to-processor mapping results. Since the task remapping information is pre-computed at compile time for all possible failure scenarios, it must be represented and stored efficiently. At run time, we simply remap the tasks following the compile-time decision. We examine the scalability of the proposed technique in terms of both the space overhead and the run-time overhead of the compile-time analysis as the number of failed processors varies. Through extensive experiments, we show that the proposed technique outperforms previous work with respect to application throughput.
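
The three compile-time steps and the run-time lookup can be illustrated with a minimal Python sketch for a single pool of homogeneous processors. This is not the authors' algorithm: the greedy load balancer stands in for the throughput-oriented static mapping, the brute-force search stands in for the optimal processor-to-processor mapping, migration cost is measured against the initial mapping only, and every name (`balance_tasks`, `assign_bins`, `build_remap_table`, `remap`, `state_size`) is hypothetical.

```python
from itertools import combinations, permutations


def balance_tasks(task_costs, n_bins):
    """Step 1 (stand-in): greedy longest-processing-time mapping of tasks
    onto n_bins logical processors, used here as a proxy for throughput."""
    bins = [[] for _ in range(n_bins)]
    loads = [0] * n_bins
    for task in sorted(task_costs, key=lambda t: (-task_costs[t], t)):
        i = loads.index(min(loads))  # least-loaded logical processor
        bins[i].append(task)
        loads[i] += task_costs[task]
    return bins


def assign_bins(bins, live, current, state_size):
    """Step 2 (stand-in): choose the logical-to-physical assignment that
    moves the least task state. Brute force, so only for small pools."""
    def cost(perm):
        return sum(state_size[t]
                   for proc, tasks in zip(perm, bins)
                   for t in tasks if current[t] != proc)
    best = min(permutations(live), key=cost)
    return {t: proc for proc, tasks in zip(best, bins) for t in tasks}


def build_remap_table(task_costs, state_size, initial, procs, max_failures):
    """Step 3: store one remapping per failure scenario (set of failed
    processors). Assumes max_failures < len(procs)."""
    table = {}
    for k in range(1, max_failures + 1):
        for failed in combinations(procs, k):
            live = [p for p in procs if p not in failed]
            bins = balance_tasks(task_costs, len(live))
            table[frozenset(failed)] = assign_bins(
                bins, live, initial, state_size)
    return table


def remap(table, failed):
    """Run-time step: a plain table lookup, no optimization."""
    return table[frozenset(failed)]


# Illustrative data only (not taken from the paper).
task_costs = {"a": 4, "b": 3, "c": 2, "d": 1}
state_size = {"a": 1, "b": 1, "c": 1, "d": 1}
initial = {"a": "P0", "b": "P1", "c": "P2", "d": "P2"}
table = build_remap_table(task_costs, state_size, initial,
                          ["P0", "P1", "P2"], max_failures=2)
print(remap(table, ["P1"]))
```

The table grows with the number of failure scenarios, which is why the abstract stresses efficient representation. For larger pools, the brute-force search in `assign_bins` would need to be replaced by an assignment solver.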

          Published In

          CODES/ISSS '10: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
          October 2010
          348 pages
          ISBN: 9781605589053
          DOI: 10.1145/1878961
          Sponsors

          In-Cooperation

          • CEDA
          • IEEE CAS
          • IEEE CS

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Author Tags

          1. multi-core embedded systems
          2. reliability
          3. static task mapping

          Qualifiers

          • Research-article

          Conference

          ESWeek '10
          ESWeek '10: Sixth Embedded Systems Week
          October 24 - 29, 2010
          Scottsdale, Arizona, USA

          Acceptance Rates

          Overall Acceptance Rate 280 of 864 submissions, 32%

          Cited By

          • (2023) A performance-centric ML-based multi-application mapping technique for regular Network-on-Chip. Memories - Materials, Devices, Circuits and Systems 4 (100059). DOI: 10.1016/j.memori.2023.100059. Online publication date: Jul-2023
          • (2021) Adaptive Scheduling for Time-Triggered Network-on-Chip-Based Multi-Core Architecture Using Genetic Algorithm. Electronics 11:1 (49). DOI: 10.3390/electronics11010049. Online publication date: 24-Dec-2021
          • (2021) Multi-objective biogeography-based optimization and reinforcement learning hybridization for network-on chip reliability improvement. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2021.11.005. Online publication date: Nov-2021
          • (2021) Mapping techniques in multicore processors: current and future trends. The Journal of Supercomputing. DOI: 10.1007/s11227-021-03650-6. Online publication date: 5-Feb-2021
          • (2020) Enhancing System Reliability Through Targeting Fault Propagation Scope. Soft Computing Methods for System Dependability (131-160). DOI: 10.4018/978-1-7998-1718-5.ch004. Online publication date: 2020
          • (2020) Nested genetic algorithm for highly reliable and efficient embedded system design. Design Automation for Embedded Systems. DOI: 10.1007/s10617-020-09234-6. Online publication date: 6-Mar-2020
          • (2019) Self-Optimizing and Self-Programming Computing Systems: A Combined Compiler, Complex Networks, and Machine Learning Approach. IEEE Transactions on Very Large Scale Integration (VLSI) Systems (1-12). DOI: 10.1109/TVLSI.2019.2897650. Online publication date: 2019
          • (2019) Resource Management for Improving Soft-Error and Lifetime Reliability of Real-Time MPSoCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38:12 (2215-2228). DOI: 10.1109/TCAD.2018.2883993. Online publication date: Dec-2019
          • (2019) Periodic Task Scheduling Algorithm for Homogeneous Multi-core Parallel Processing System. 2019 IEEE International Conference on Unmanned Systems (ICUS) (710-713). DOI: 10.1109/ICUS48101.2019.8995985. Online publication date: Oct-2019
          • (2019) Test platform for autopilot system embedded in a model of multi-core architecture using X-Plane flight simulator. 2019 IEEE/AIAA 38th Digital Avionics Systems Conference (DASC) (1-6). DOI: 10.1109/DASC43569.2019.9081788. Online publication date: Sep-2019
