Compound Memory Models

Authors:

Soham Chakraborty,

Sukarn Agarwal,

Nicolai Oswald,

Vijay Nagarajan

Proceedings of the ACM on Programming Languages, Volume 7, Issue PLDI

Article No.: 153, Pages 1145 - 1168

https://doi.org/10.1145/3591267

Published: 06 June 2023

Abstract

Today's mobile, desktop, and server processors are heterogeneous, consisting not only of CPUs but also GPUs and other accelerators. Such heterogeneous processors are starting to expose a shared memory interface across these devices. Given that each of these devices typically supports a distinct instruction set architecture and a distinct memory consistency model, it is not clear what the memory consistency model of the heterogeneous machine should be. In this paper, we answer this question by formalizing "compound" memory models: we present a compositional operational model describing the resulting model when devices with distinct consistency models are fused together. We instantiate our model with the compound x86TSO/PTX model: a CPU enforcing x86TSO and a GPU enforcing the PTX model. A key result is that the x86TSO/PTX compound model retains compiler mappings from the language-based (scoped) C memory model. This means that threads mapped to the x86TSO device can continue to use the already proven C-to-x86TSO compiler mapping, and threads mapped to the PTX device can likewise continue to use the C-to-PTX mapping.
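To make concrete what it means for devices to have "distinct memory consistency models", the following is a minimal sketch, not the paper's compound operational model. It exhaustively explores the classic store-buffering litmus test on a toy two-thread machine, once with stores going straight to memory (sequential consistency) and once with per-thread FIFO store buffers in the style of x86-TSO. The function name `sb_outcomes`, the state encoding, and the instruction tuples are all hypothetical and chosen only for illustration.

```python
def sb_outcomes(use_store_buffers):
    """Return all reachable (r0, r1) results of the store-buffering test.

    Thread 0: x = 1; r0 = y        Thread 1: y = 1; r1 = x
    Toy model (an assumption of this sketch): with use_store_buffers=True,
    each thread has a FIFO store buffer that drains to memory at any time.
    """
    programs = (
        (("store", "x", 1), ("load", "y", "r0")),
        (("store", "y", 1), ("load", "x", "r1")),
    )
    # State: (program counters, store buffers, memory, registers), all hashable.
    start = ((0, 0), ((), ()), (("x", 0), ("y", 0)), ())
    outcomes, seen, stack = set(), set(), [start]

    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, buffers, memory, regs = state

        finished = all(pcs[t] == len(programs[t]) for t in (0, 1))
        if finished and not any(buffers):
            final = dict(regs)
            outcomes.add((final["r0"], final["r1"]))
            continue

        for t in (0, 1):
            # Transition 1: drain the oldest buffered store of thread t.
            if buffers[t]:
                loc, val = buffers[t][0]
                drained = dict(memory)
                drained[loc] = val
                new_buffers = tuple(
                    buffers[i][1:] if i == t else buffers[i] for i in (0, 1)
                )
                stack.append(
                    (pcs, new_buffers, tuple(sorted(drained.items())), regs)
                )

            # Transition 2: execute the next instruction of thread t.
            if pcs[t] < len(programs[t]):
                op, loc, arg = programs[t][pcs[t]]
                new_pcs = tuple(pcs[i] + 1 if i == t else pcs[i] for i in (0, 1))
                if op == "store" and use_store_buffers:
                    new_buffers = tuple(
                        buffers[i] + ((loc, arg),) if i == t else buffers[i]
                        for i in (0, 1)
                    )
                    stack.append((new_pcs, new_buffers, memory, regs))
                elif op == "store":
                    written = dict(memory)
                    written[loc] = arg
                    stack.append(
                        (new_pcs, buffers, tuple(sorted(written.items())), regs)
                    )
                else:
                    # Load: forward from own buffer if possible, else read memory.
                    pending = [v for (l, v) in buffers[t] if l == loc]
                    value = pending[-1] if pending else dict(memory)[loc]
                    new_regs = dict(regs)
                    new_regs[arg] = value
                    stack.append(
                        (new_pcs, buffers, memory, tuple(sorted(new_regs.items())))
                    )
    return outcomes


if __name__ == "__main__":
    print("SC :", sorted(sb_outcomes(False)))
    print("TSO:", sorted(sb_outcomes(True)))
```

In this toy setting the buffered machine admits the outcome `r0 = r1 = 0`, which the unbuffered one forbids. A compound model has to say which such outcomes are allowed when the two threads run on devices that follow different rules.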

Supplementary Material

Auxiliary Archive (pldi23main-p284-p-archive.zip)

This is the technical supplement (appendix) to our submission.

Cited By

de Vilhena P, Lahav O, Vafeiadis V, Raad A (2024). Extending the C/C++ Memory Model with Inline Assembly. Proceedings of the ACM on Programming Languages, 8:OOPSLA2 (1081-1107). Online publication date: 8-Oct-2024
https://dl.acm.org/doi/10.1145/3689749
Mavrogeorgis N, Vasiladiotis C, Mu P, Khordadi A, Franke B, Barbalace A, Rodríguez G, Sadayappan P, Sukumaran-Rajam A (2024). UNIFICO: Thread Migration in Heterogeneous-ISA CPUs without State Transformation. Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction (86-99). Online publication date: 17-Feb-2024
https://dl.acm.org/doi/10.1145/3640537.3641565

Published In

Proceedings of the ACM on Programming Languages Volume 7, Issue PLDI

June 2023

2020 pages

EISSN:2475-1421

DOI:10.1145/3554310

Editor:
Michael Hicks
Amazon, USA

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Engineering and Physical Sciences Research Council
