Exploring the cache design space for large scale CMPs

Published: 01 November 2005

    Abstract

    With the advent of dual-core chips in the marketplace, small-scale CMP (chip multiprocessor) architectures are becoming commonplace. We expect a continuing trend of increasing the number of cores on a die to maximize the performance/power efficiency of a single chip. We believe an era of large-scale CMPs (LCMPs) with several tens to hundreds of cores is on the way, but as of now architects have little understanding of how best to build a cache hierarchy given such a large number of cores/threads to support. With this in mind, our initial goals are to prune the cache design space for LCMPs by characterizing basic server workload behavior in such an environment.

    In this paper, we describe the range of methodologies that we are developing to overcome the challenges of exploring the cache design space for LCMP platforms. We then focus on employing a trace-driven approach to characterizing one key server workload (OLTP) in both a homogeneous and a heterogeneous workload environment. We study the effect of increasing threads (from 1 to 128) on a three-level cache hierarchy with emphasis on second and third level caches. We study the effect of varying sizes at these cache levels and show the effects of threads contending for cache space, the effects of prefetching instruction addresses, and the effects of inclusion. We make initial observations and conclusions about the factors on which LCMP cache hierarchy design decisions should be based and discuss future work.
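
    To make the idea of a trace-driven study of threads contending for cache space concrete, here is a minimal sketch in Python. It is not the authors' simulator or methodology: the fully associative LRU model, the synthetic uniform-random trace (standing in for a real OLTP trace), and every name and parameter (`LRUCache`, `synthetic_trace`, `shared_cache_miss_rate`, the cache and working-set sizes) are assumptions chosen purely for illustration.

```python
from collections import OrderedDict
import random


class LRUCache:
    """Fully associative LRU cache that counts hits and misses (illustrative model)."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # keys are line addresses, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def access(self, line_addr):
        """Look up one cache line; return True on a hit, False on a miss."""
        if line_addr in self.lines:
            self.lines.move_to_end(line_addr)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.lines[line_addr] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def synthetic_trace(num_threads, working_set_lines, accesses_per_thread, seed=0):
    """Yield (thread_id, line_addr) pairs with threads interleaved round-robin.

    Assumption: each thread touches a private, disjoint working set uniformly
    at random. A real study would replay recorded workload traces instead.
    """
    rng = random.Random(seed)
    for _ in range(accesses_per_thread):
        for tid in range(num_threads):
            offset = rng.randrange(working_set_lines)
            yield tid, tid * working_set_lines + offset


def shared_cache_miss_rate(num_threads, cache_lines=1024,
                           working_set_lines=256, accesses_per_thread=5000):
    """Replay the synthetic trace through one shared cache and return its miss rate."""
    cache = LRUCache(cache_lines)
    for _tid, addr in synthetic_trace(num_threads, working_set_lines,
                                      accesses_per_thread):
        cache.access(addr)
    return cache.miss_rate()


if __name__ == "__main__":
    # Sweep the thread count while holding the shared cache size fixed.
    for n in (1, 2, 4, 8, 16):
        print(n, round(shared_cache_miss_rate(n), 3))
```

    With these assumed sizes, the combined working set fits in the shared cache up to four threads and exceeds it beyond that, so the miss rate in this toy model rises as threads are added. A fuller model would add set associativity, multiple cache levels, inclusion, and prefetching.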

    Published In

    ACM SIGARCH Computer Architecture News, Volume 33, Issue 4
    Special issue: dasCMP'05
    November 2005
    130 pages
    ISSN: 0163-5964
    DOI: 10.1145/1105734

    Publisher

    Association for Computing Machinery

    New York, NY, United States
