
A Survey of Techniques for Architecting and Managing Asymmetric Multicore Processors

Published: 08 February 2016

Abstract

To meet the needs of a diverse range of workloads, asymmetric multicore processors (AMPs) have been proposed, which feature cores with different microarchitectures or ISAs. However, given the diversity inherent in their design and application scenarios, several challenges must be addressed to architect AMPs effectively and to leverage their potential for optimizing both sequential and parallel performance. Several recent techniques address these challenges. In this article, we survey architectural and system-level techniques proposed for designing and managing AMPs. By classifying the techniques along several key characteristics, we underscore their similarities and differences. We clarify the terminology used in this research field and identify challenges that merit future investigation. We hope that, beyond synthesizing the existing work on AMPs, this survey will spark novel ideas for architecting future AMPs that can make a definite impact on the landscape of next-generation computing systems.

      Published In

      ACM Computing Surveys  Volume 48, Issue 3
      February 2016
      619 pages
      ISSN:0360-0300
      EISSN:1557-7341
      DOI:10.1145/2856149
      Editor: Sartaj Sahni
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 08 February 2016
      Accepted: 01 November 2015
      Revised: 01 August 2015
      Received: 01 April 2015
      Published in CSUR Volume 48, Issue 3

      Author Tags

      1. Review
      2. asymmetric multicore processor
      3. big/little system
      4. classification
      5. heterogeneous multicore architecture
      6. reconfigurable AMP

      Qualifiers

      • Survey
      • Research
      • Refereed

      Funding Sources

      • Office of Science
      • Advanced Scientific Computing Research
      • U.S. Department of Energy
