DOI: 10.1145/1854273.1854283
research-article

Scalable thread scheduling and global power management for heterogeneous many-core architectures

Published: 11 September 2010

    Abstract

    Future many-core microprocessors are likely to be heterogeneous, by design or due to variability and defects. The latter type of heterogeneity is especially challenging due to its unpredictability. To minimize the performance and power impact of these hardware imperfections, the runtime thread scheduler and global power manager must be nimble enough to handle such random heterogeneity. With hundreds of cores expected on a single die in the future, these algorithms must provide high power-performance efficiency, yet remain scalable with low runtime overhead.
    This paper presents a range of scheduling and power management algorithms and performs a detailed evaluation of their effectiveness and scalability on heterogeneous many-core architectures with up to 256 cores. We also conduct a limit study on the potential benefits of coordinating scheduling and power management and demonstrate that coordination yields little benefit. We highlight the scalability limitations of previously proposed thread scheduling algorithms that were designed for small-scale chip multiprocessors and propose a Hierarchical Hungarian Scheduling Algorithm that dramatically reduces the scheduling overhead without loss of accuracy. Finally, we show that the high computational requirements of prior global power management algorithms based on linear programming make them infeasible for many-core chips, and that an algorithm that we call Steepest Drop achieves orders of magnitude lower execution time without sacrificing power-performance efficiency.
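
    The abstract names Steepest Drop but does not describe its steps, so what follows is only a minimal sketch of one plausible greedy scheme in that spirit, not the authors' algorithm. It assumes each core exposes a few discrete operating points (power, throughput), starts every core at its highest setting, and repeatedly steps down the core that saves the most power per unit of throughput lost until the chip-wide budget is met. The names `greedy_power_descent`, `cores`, and `budget`, and all numbers, are hypothetical.

```python
from typing import List, Tuple

# Hypothetical model: each core has a list of (power_watts, throughput)
# operating points, ordered from the highest-power setting to the lowest.
Levels = List[Tuple[float, float]]


def greedy_power_descent(cores: List[Levels], budget: float) -> List[int]:
    """Pick one level index per core, greedily lowering power until the
    total fits the budget (or no core can be lowered any further)."""
    choice = [0] * len(cores)  # every core starts at its highest setting
    total = sum(levels[0][0] for levels in cores)

    while total > budget:
        best_core, best_ratio = None, -1.0
        for i, levels in enumerate(cores):
            nxt = choice[i] + 1
            if nxt >= len(levels):
                continue  # this core is already at its lowest setting
            saved = levels[choice[i]][0] - levels[nxt][0]
            lost = levels[choice[i]][1] - levels[nxt][1]
            # Power saved per unit of throughput lost; guard against a
            # zero loss so a "free" step is always preferred.
            ratio = saved / max(lost, 1e-12)
            if ratio > best_ratio:
                best_core, best_ratio = i, ratio
        if best_core is None:
            break  # budget cannot be met even at the lowest settings
        i = best_core
        total -= cores[i][choice[i]][0] - cores[i][choice[i] + 1][0]
        choice[i] += 1
    return choice


# Two hypothetical cores with three operating points each.
cores = [
    [(10.0, 1.0), (6.0, 0.9), (3.0, 0.5)],
    [(10.0, 2.0), (6.0, 1.2), (3.0, 0.6)],
]
print(greedy_power_descent(cores, budget=16.0))  # prints [1, 0]
```

    Each step scans every core once, so the cost of this sketch grows with the number of cores times the number of steps taken, with no optimization solver involved. It is a heuristic: with a budget of 12.0 the example above settles on levels [2, 1], although [1, 1] would also fit the budget with higher total throughput.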

    Published In

    PACT '10: Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques
    September 2010
    596 pages
    ISBN: 9781450301787
    DOI: 10.1145/1854273
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. computational complexity
    2. global power management
    3. hard errors
    4. heterogeneous chip multiprocessors
    5. many-core architectures
    6. process variations
    7. scalability
    8. thread scheduling

    Conference

    PACT '10
    Sponsor:
    • IFIP WG 10.3
    • IEEE CS TCPP
    • SIGARCH
    • IEEE CS TCAA

    Acceptance Rates

    Overall Acceptance Rate 121 of 471 submissions, 26%

    Article Metrics

    • Downloads (last 12 months): 10
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 10 Aug 2024

    Cited By

    • (2023) SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.1145/3581784.3607055. Online publication date: 12-Nov-2023
    • (2023) F-LEMMA: Fast Learning-Based Energy Management for Multi-/Many-Core Processors. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42:2, pp. 616-629. DOI: 10.1109/TCAD.2022.3176219. Online publication date: Feb-2023
    • (2023) Reducing energy consumption using heterogeneous voltage frequency scaling of data-parallel applications for multicore systems. Journal of Parallel and Distributed Computing, 175, pp. 121-133. DOI: 10.1016/j.jpdc.2023.01.005. Online publication date: May-2023
    • (2023) A Multi-objective Evolution Strategy for Real-Time Task Placement on Heterogeneous Processors. Intelligent Systems Design and Applications, pp. 448-457. DOI: 10.1007/978-3-031-35501-1_45. Online publication date: 3-Jun-2023
    • (2022) Towards QoS-Based Embedded Machine Learning. Electronics, 11:19, article 3204. DOI: 10.3390/electronics11193204. Online publication date: 6-Oct-2022
    • (2022) TokenSmart: Distributed, Scalable Power Management in the Many-core Era. ACM Transactions on Architecture and Code Optimization, 20:1, pp. 1-26. DOI: 10.1145/3559762. Online publication date: 17-Nov-2022
    • (2022) Adaptive Power Shifting for Power-Constrained Heterogeneous Systems. IEEE Transactions on Computers, pp. 1-1. DOI: 10.1109/TC.2022.3174545. Online publication date: 2022
    • (2022) Heterogeneous Voltage Frequency Scaling of Data-Parallel Applications for Energy Saving on Homogeneous Multicore Platforms. Euro-Par 2021: Parallel Processing Workshops, pp. 141-153. DOI: 10.1007/978-3-031-06156-1_12. Online publication date: 9-Jun-2022
    • (2020) CuttleSys: Data-Driven Resource Management for Interactive Services on Reconfigurable Multicores. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 650-664. DOI: 10.1109/MICRO50266.2020.00060. Online publication date: Oct-2020
    • (2020) S4oC: A Self-Optimizing, Self-Adapting Secure System-on-Chip Design Framework to Tackle Unknown Threats - A Network Theoretic, Learning Approach. 2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-8. DOI: 10.1109/ISCAS45731.2020.9180687. Online publication date: Oct-2020
