DOI: 10.1145/3642921.3642956

Modeling methodology for multi-die chip design based on gem5/SystemC co-simulation

Published: 06 March 2024

Abstract

The paper introduces a modeling methodology aimed at thoroughly exploring the design space of multi-die chip architecture tailored for High-Performance Computing (HPC). For accurate simulations, we leverage the capabilities of gem5’s Ruby for its robust CPU models and cache coherence protocols, providing a comprehensive representation of die architecture. Die-to-die interfaces are modeled using SystemC TLM, offering flexibility to integrate with other simulators. This enables co-simulation with varying abstraction levels, making it well-suited for the design analysis of multi-die chip architecture.
We present, to the best of our knowledge, the first attempt to integrate gem5’s Ruby memory system with SystemC TLM for the modeling of multi-die chip architecture. The benefits of this model are demonstrated through the instantiation of a multi-die design using modern Arm architectures with four compute dies and two memory dies. The multi-die chip’s functionality is validated by executing STREAM Triad with Linux, followed by a comparative performance analysis against a monolithic design.
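
For readers unfamiliar with the validation workload, the following is a minimal sketch of the STREAM Triad kernel and of the byte count STREAM attributes to it (two reads and one write per element). It is an illustration only: the function names `triad`, `triad_bytes`, and `measure_triad` are hypothetical, the array size and repeat count are arbitrary, and timings from a Python interpreter say nothing about the simulated hardware. The paper runs the actual benchmark under Linux inside the co-simulation.

```python
import time


def triad(a, b, c, scalar):
    """STREAM Triad: a[i] = b[i] + scalar * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, word_size=8):
    """Bytes counted for one Triad pass: two reads and one write per element."""
    return 3 * word_size * n


def measure_triad(n=100_000, scalar=3.0, repeats=5):
    """Return an illustrative Triad rate in MB/s, using the best of `repeats` runs."""
    a = [0.0] * n
    b = [2.0] * n
    c = [1.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, scalar)
        best = min(best, time.perf_counter() - start)
    return triad_bytes(n) / best / 1e6


if __name__ == "__main__":
    print(f"Triad rate (interpreter-bound, illustrative): {measure_triad():.1f} MB/s")
```

Because Triad streams through three large arrays with almost no reuse, its throughput is dominated by memory bandwidth, which is why it is a natural probe for comparing a multi-die memory path against a monolithic one.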


Published In

RAPIDO '24: Proceedings of the 16th Workshop on Rapid Simulation and Performance Evaluation for Design
January 2024
54 pages
ISBN: 9798400717918
DOI: 10.1145/3642921
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cache coherency
  2. Chiplets
  3. High-performance Computing
  4. Multi-die chip
  5. SystemC TLM
  6. gem5

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

RAPIDO '24

Acceptance Rates

Overall Acceptance Rate 14 of 28 submissions, 50%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 246
  • Downloads (Last 12 months): 246
  • Downloads (Last 6 weeks): 41

Reflects downloads up to 01 Jan 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media