research-article
Open access

Selecting Heterogeneous Cores for Diversity

Published: 16 December 2016

    Abstract

    Mobile devices with heterogeneous processors are becoming mainstream. With a heterogeneous processor, the runtime scheduler can pick the best CPU core for a given task based on program characteristics, performance requirements, and power limitations. For a heterogeneous processor to be effective, it must contain a diverse set of cores to match a range of runtime requirements and program behaviors. Selecting a diverse set of cores is, however, a non-trivial problem. Power and performance are dependent on both program features and the microarchitectural features of cores, and a selection of cores must satisfy the competing demands of different types of programs. We present a method of core selection that chooses cores at a range of power-performance points. Our algorithm is based on the observation that it is not necessary for a core to consistently have high performance or low power; one type of core can fulfill different roles for different types of programs. Given a power budget, cores selected with our method provide an average speedup of 6% on EEMBC mobile benchmarks and a 24% speedup on SPEC 2006 integer benchmarks over the state-of-the-art core selection method.
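The observation that one core type can fill different roles for different programs can be illustrated with a minimal sketch. This is not the paper's selection algorithm; it only shows a scheduler picking the fastest core that fits a power budget. The core names (`wide_issue`, `large_cache`), the program classes, and every performance and power number are hypothetical values invented for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical profile: program -> core type -> (relative performance, power in watts).
# All names and numbers are invented for illustration; none come from the paper.
PROFILE: Dict[str, Dict[str, Tuple[float, float]]] = {
    "compute_bound": {"wide_issue": (3.0, 1.8), "large_cache": (1.5, 0.8)},
    "memory_bound": {"wide_issue": (1.2, 1.0), "large_cache": (2.0, 1.4)},
}


def pick_core(program: str, power_budget: float) -> Optional[str]:
    """Return the fastest core whose power for this program fits the budget.

    Returns None when no core fits within the budget.
    """
    feasible = {
        core: perf
        for core, (perf, power) in PROFILE[program].items()
        if power <= power_budget
    }
    if not feasible:
        return None
    # Among the cores that fit the budget, choose the highest performance.
    return max(feasible, key=lambda core: feasible[core])


if __name__ == "__main__":
    for budget in (2.0, 1.0, 0.5):
        for program in PROFILE:
            print(budget, program, pick_core(program, budget))
```

With these made-up numbers, `wide_issue` is the high-performance choice for the compute-bound program under a generous budget and the low-power choice for the memory-bound program under a tight one, while `large_cache` plays the opposite roles. Neither core is consistently "big" or "little".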

    Cited By

    • (2019) Generative and multi-phase learning for computer systems optimization. Proceedings of the 46th International Symposium on Computer Architecture, 39-52. DOI: 10.1145/3307650.3326633. Online publication date: 22-Jun-2019
    • (2018) Energy-Efficient Actor Execution for SDF Application on Heterogeneous Architectures. 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), 486-493. DOI: 10.1109/PDP2018.2018.00083. Online publication date: Mar-2018
    • (2018) Navigating the Landscape for Real-Time Localization and Mapping for Robotics and Virtual and Augmented Reality. Proceedings of the IEEE 106:11, 2020-2039. DOI: 10.1109/JPROC.2018.2856739. Online publication date: Nov-2018

    Published In

    ACM Transactions on Architecture and Code Optimization, Volume 13, Issue 4
    December 2016
    648 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/3012405
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 December 2016
    Accepted: 01 October 2016
    Revised: 01 October 2016
    Received: 01 May 2016
    Published in TACO Volume 13, Issue 4

    Author Tags

    1. Heterogeneous
    2. core selection
    3. design space exploration
    4. diversity
    5. flexibility
    6. power-aware
    7. single-ISA

    Qualifiers

    • Research-article
    • Research
    • Refereed
