DOI: 10.1145/3009837.3009839
research-article

Mixed-size concurrency: ARM, POWER, C/C++11, and SC

Published: 01 January 2017

Abstract

Previous work on the semantics of relaxed shared-memory concurrency has only considered the case in which each load reads the data of exactly one store. In practice, however, multiprocessors support mixed-size accesses, and these are used by systems software and (to some degree) exposed at the C/C++ language level. A semantic foundation for software, therefore, has to address them.
We investigate the mixed-size behaviour of ARMv8 and IBM POWER architectures and implementations: by experiment, by developing semantic models, by testing the correspondence between these, and by discussion with ARM and IBM staff. This turns out to be surprisingly subtle, and on the way we have to revisit the fundamental concepts of coherence and sequential consistency, which change in this setting. In particular, we show that adding a memory barrier between each instruction does not restore sequential consistency. We go on to extend the C/C++11 model to support non-atomic mixed-size memory accesses.
This is a necessary step towards semantics for real-world shared-memory concurrent code, beyond litmus tests.
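
To make the opening point concrete, the following is a minimal sketch of a load whose bytes come from more than one store. It is not the paper's model. The `Store` class, the `reads_from` helper, the `init` label, and the addresses are all hypothetical names chosen for illustration, and the sketch only replays stores in one fixed sequential order, so it says nothing about relaxed behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Store:
    name: str    # label for the store event (hypothetical naming)
    addr: int    # address of the first byte written
    data: bytes  # bytes written, in increasing address order


def reads_from(stores, addr, size):
    """Return (value bytes, per-byte store labels) for a load of `size` bytes at `addr`.

    Stores are applied in list order, so this models a single sequential
    interleaving only (an assumption of this sketch).
    """
    last_writer = {}  # byte address -> (store label, byte value)
    for s in stores:
        for i, b in enumerate(s.data):
            last_writer[s.addr + i] = (s.name, b)
    # Bytes never written read as 0 from a notional initial store "init".
    cells = [last_writer.get(addr + i, ("init", 0)) for i in range(size)]
    value = bytes(b for _, b in cells)
    sources = [name for name, _ in cells]
    return value, sources


# Two 1-byte stores to adjacent addresses, then one 2-byte load covering both.
stores = [Store("a", 0x1000, b"\x11"), Store("b", 0x1001, b"\x22")]
value, sources = reads_from(stores, 0x1000, 2)
print(value.hex(), sources)  # 1122 ['a', 'b']
```

Here the 2-byte load reads one byte from store `a` and one from store `b`, so a reads-from relation that assigns each load a single store cannot describe it. Tracking the source per byte, as above, is one simple way to record this.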

Published In

POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
January 2017
901 pages
ISBN:9781450346603
DOI:10.1145/3009837
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ISA
  2. Relaxed Memory Models
  3. mixed-size
  4. semantics

Qualifiers

  • Research-article

Conference

POPL '17

Acceptance Rates

Overall Acceptance Rate 824 of 4,130 submissions, 20%

Cited By

  • (2025) Model Checking C/C++ with Mixed-Size Accesses. Proceedings of the ACM on Programming Languages 9(POPL):2232-2252. DOI: 10.1145/3704911. Online publication date: 9-Jan-2025
  • (2024) An Axiomatic Basis for Computer Programming on the Relaxed Arm-A Architecture: The AxSL Logic. Proceedings of the ACM on Programming Languages 8(POPL):604-637. DOI: 10.1145/3632863. Online publication date: 5-Jan-2024
  • (2024) Analyzing the memory ordering models of the Apple M1. Journal of Systems Architecture: the EUROMICRO Journal 149:C. DOI: 10.1016/j.sysarc.2024.103102. Online publication date: 1-Apr-2024
  • (2024) Efficiently Adapting Stateless Model Checking for C11/C++11 to Mixed-Size Accesses. Programming Languages and Systems, pages 346-364. DOI: 10.1007/978-981-97-8943-6_17. Online publication date: 23-Oct-2024
  • (2023) Compound Memory Models. Proceedings of the ACM on Programming Languages 7(PLDI):1145-1168. DOI: 10.1145/3591267. Online publication date: 6-Jun-2023
  • (2023) PipeSynth: Automated Synthesis of Microarchitectural Axioms for Memory Consistency. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pages 513-527. DOI: 10.1145/3582016.3582056. Online publication date: 25-Mar-2023
  • (2023) Kater: Automating Weak Memory Model Metatheory and Consistency Checking. Proceedings of the ACM on Programming Languages 7(POPL). DOI: 10.1145/3571212. Online publication date: 11-Jan-2023
  • (2023) TOSTING: Investigating Total Store Ordering on ARM. Architecture of Computing Systems, pages 139-152. DOI: 10.1007/978-3-031-42785-5_10. Online publication date: 26-Aug-2023
  • (2022) Mixed-proxy extensions for the NVIDIA PTX memory consistency model. Proceedings of the 49th Annual International Symposium on Computer Architecture, pages 1058-1070. DOI: 10.1145/3470496.3533045. Online publication date: 18-Jun-2022
  • (2022) Relaxed Memory Consistency. A Primer on Memory Consistency and Cache Coherence, pages 55-89. DOI: 10.1007/978-3-031-01764-3_5. Online publication date: 28-Mar-2022
