DOI: 10.1145/1250662.1250686

Core fusion: accommodating software diversity in chip multiprocessors

Published: 09 June 2007

Abstract

This paper presents core fusion, a reconfigurable chip multiprocessor (CMP) architecture where groups of fundamentally independent cores can dynamically morph into a larger CPU, or they can be used as distinct processing elements, as needed at run time by applications. Core fusion gracefully accommodates software diversity and incremental parallelization in CMPs. It provides a single execution model across all configurations, requires no additional programming effort or specialized compiler support, maintains ISA compatibility, and leverages mature micro-architecture technology.
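
The reconfiguration idea in the abstract can be pictured with a toy bookkeeping model. The sketch below is only an illustration and is not taken from the paper: the class name `ToyCMP`, the methods `fuse` and `split`, and the default core count are all hypothetical. It tracks nothing but which cores currently form one logical CPU and models none of the hardware.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCMP:
    """Toy model: a chip whose cores are grouped into logical CPUs.

    Every name and number here is an assumption made for illustration.
    """
    num_cores: int = 8  # assumed core count, not a figure from the abstract
    groups: list = field(init=False, default_factory=list)

    def __post_init__(self):
        # Start fully split: each core is its own logical CPU.
        self.groups = [[c] for c in range(self.num_cores)]

    def fuse(self, cores):
        """Merge the given cores into a single, larger logical CPU."""
        cores = sorted(set(cores))
        if any(not 0 <= c < self.num_cores for c in cores):
            raise ValueError("core index out of range")
        # Detach the cores from the groups that currently hold them,
        # drop any group left empty, then add the new fused group.
        remaining = [[c for c in g if c not in cores] for g in self.groups]
        self.groups = sorted([g for g in remaining if g] + [cores])

    def split(self, core):
        """Break the logical CPU containing `core` back into single cores."""
        new_groups = []
        for g in self.groups:
            if core in g:
                new_groups.extend([c] for c in g)
            else:
                new_groups.append(g)
        self.groups = sorted(new_groups)

    def logical_cpus(self):
        """Number of processing elements software would see in this model."""
        return len(self.groups)


if __name__ == "__main__":
    chip = ToyCMP(num_cores=4)
    chip.fuse([0, 1])   # two cores act as one wider CPU
    print(chip.groups, chip.logical_cpus())
    chip.split(0)       # back to independent cores
    print(chip.groups, chip.logical_cpus())
```

In this model a mostly sequential program would run on a fused group, while a parallel program would run on the same chip after `split`, with one thread per independent core.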

Published In

ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture
June 2007
542 pages
ISBN:9781595937063
DOI:10.1145/1250662
  • General Chair: Dean Tullsen
  • Program Chair: Brad Calder
  • ACM SIGARCH Computer Architecture News, Volume 35, Issue 2
    May 2007
    527 pages
    ISSN:0163-5964
    DOI:10.1145/1273440
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. chip multiprocessors
  2. reconfigurable architectures
  3. software diversity

Qualifiers

  • Article

Conference

SPAA07

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Article Metrics

  • Downloads (last 12 months): 53
  • Downloads (last 6 weeks): 3
Reflects downloads up to 06 Oct 2024

Cited By

  • (2022) Malcolm: Multi-agent Learning for Cooperative Load Management at Rack Scale. Proceedings of the ACM on Measurement and Analysis of Computing Systems 6:3 (1-25). DOI: 10.1145/3570611. Online publication date: 8-Dec-2022
  • (2022) AdaMICA. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6:3 (1-30). DOI: 10.1145/3550304. Online publication date: 7-Sep-2022
  • (2022) big.VLITTLE: On-Demand Data-Parallel Acceleration for Mobile Systems on Chip. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) (181-198). DOI: 10.1109/MICRO56248.2022.00025. Online publication date: Oct-2022
  • (2021) Software-Defined Vector Processing on Manycore Fabrics. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (392-406). DOI: 10.1145/3466752.3480099. Online publication date: 18-Oct-2021
  • (2021) DiAG: a dataflow-inspired architecture for general-purpose processors. Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (93-106). DOI: 10.1145/3445814.3446703. Online publication date: 19-Apr-2021
  • (2021) DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks. IEEE Access 9 (134457-134502). DOI: 10.1109/ACCESS.2021.3110993. Online publication date: 2021
  • (2020) Enhancing thread-level parallelism in asymmetric multicores using transparent instruction offloading. Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference (1-6). DOI: 10.5555/3437539.3437772. Online publication date: 20-Jul-2020
  • (2020) CuttleSys: Data-Driven Resource Management for Interactive Services on Reconfigurable Multicores. 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO) (650-664). DOI: 10.1109/MICRO50266.2020.00060. Online publication date: Oct-2020
  • (2019) ACCEPTOR. Proceedings of the 27th International Conference on Real-Time Networks and Systems (209-219). DOI: 10.1145/3356401.3356420. Online publication date: 6-Nov-2019
  • (2019) Hybrid-DBT: Hardware/Software Dynamic Binary Translation Targeting VLIW. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38:10 (1872-1885). DOI: 10.1109/TCAD.2018.2864288. Online publication date: Oct-2019
