research-article

A Lifetime Reliability-Constrained Runtime Mapping for Throughput Optimization in Many-Core Systems

Published: 01 September 2019

Abstract

Due to technology scaling, lifetime reliability is becoming one of the major design constraints in the performance optimization of future many-core systems. Given a lifetime reliability constraint, existing lifetime-constrained runtime mapping schemes often yield low throughput because they require all applications to be mapped to compact regions. In this paper, we propose a runtime application mapping scheme that exploits a borrowing strategy to improve the throughput of many-core systems under a lifetime constraint. First, we propose using different strategies for mapping communication-intensive applications and computation-intensive applications. The lifetime reliability constraint can be relaxed on a local time scale when the communication requirement is high. The throughput is improved because the communication distance of communication-intensive applications is optimized while the waiting time of computation-intensive applications is reduced. Then, we propose a method to effectively classify applications according to their communication-to-computation ratio. A dynamic threshold is determined according to the current locations of available cores. Finally, we propose an improved neighborhood allocation scheme to reduce the communication cost in the task mapping. The experimental results show that, compared to the state-of-the-art lifetime-constrained mapping, the proposed mapping scheme improves the throughput of many-core systems by 26% on average for synthetic task graphs and by 20% on average for realistic task graphs, while the lifetime reliability is maintained within the constraint.
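As a rough illustration of the classification step described in the abstract, the following Python sketch computes a communication-to-computation ratio for an application and compares it with a threshold that depends on where the free cores currently are. The abstract does not give the ratio's exact definition or the threshold rule, so everything here is an assumption made for illustration: the `Application` fields, the ratio as total communication volume over total computation cost, the 2D-mesh Manhattan distance, and in particular the hypothetical rule in `dynamic_threshold` (with its `base` and `scale` parameters) that lowers the threshold as free cores become more scattered.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import combinations


@dataclass
class Application:
    """Hypothetical task-graph summary; field names are illustrative only."""
    name: str
    compute_cycles: list[float]                 # per-task computation cost
    comm_volumes: dict[tuple[int, int], float]  # (src_task, dst_task) -> data volume


def comm_to_comp_ratio(app: Application) -> float:
    """Assumed definition: total communication volume / total computation cost."""
    total_compute = sum(app.compute_cycles)
    if total_compute == 0:
        raise ValueError("application has no computation")
    return sum(app.comm_volumes.values()) / total_compute


def mean_pairwise_distance(cores: list[tuple[int, int]]) -> float:
    """Average Manhattan distance between free cores on an assumed 2D mesh."""
    pairs = list(combinations(cores, 2))
    if not pairs:
        return 0.0
    total = sum(abs(x1 - x2) + abs(y1 - y2) for (x1, y1), (x2, y2) in pairs)
    return total / len(pairs)


def dynamic_threshold(free_cores: list[tuple[int, int]],
                      base: float = 1.0, scale: float = 0.1) -> float:
    """Hypothetical rule, not the paper's: the more scattered the free cores,
    the lower the threshold, so more applications count as communication-intensive."""
    return base / (1.0 + scale * mean_pairwise_distance(free_cores))


def classify(app: Application, free_cores: list[tuple[int, int]]) -> str:
    """Label an application by comparing its ratio with the current threshold."""
    if comm_to_comp_ratio(app) > dynamic_threshold(free_cores):
        return "communication-intensive"
    return "computation-intensive"


if __name__ == "__main__":
    free = [(0, 0), (0, 1), (3, 3)]
    chatty = Application("chatty", [10.0, 10.0], {(0, 1): 30.0})
    heavy = Application("heavy", [100.0, 100.0], {(0, 1): 20.0})
    for app in (chatty, heavy):
        print(app.name, classify(app, free))
```

In a mapper built along these lines, the label would then select the mapping strategy: a communication-intensive application would be placed to minimize communication distance, while a computation-intensive one would be placed to minimize waiting time.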

Cited By

  • (2025) Reinforcement learning for thermal and reliability management in manycore systems. Design Automation for Embedded Systems 29:1. DOI: 10.1007/s10617-024-09292-0. Online publication date: 1-Mar-2025
  • (2023) COP: A Combinational Optimization Power Budgeting Method for Manycore Systems in Dark Silicon. IEEE Transactions on Computers 72:5 (1356-1370). DOI: 10.1109/TC.2022.3211417. Online publication date: 1-May-2023

Published In

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Volume 38, Issue 9
Sept. 2019
200 pages

Publisher

IEEE Press
