research-article
Open access

The scalable commutativity rule: designing scalable software for multicore processors

Published: 24 July 2017

Abstract

Developing software that scales on multicore processors is an inexact science dominated by guesswork, measurement, and expensive cycles of redesign and reimplementation. Current approaches are workload-driven and, hence, can reveal scalability bottlenecks only for known workloads and available software and hardware. This paper introduces an interface-driven approach to building scalable software. This approach is based on the scalable commutativity rule, which, informally stated, says that whenever interface operations commute, they can be implemented in a way that scales. We formalize this rule and prove it correct for any machine on which conflict-free operations scale, such as current cache-coherent multicore machines. The rule also enables a better design process for scalable software: programmers can now reason about scalability from the earliest stages of interface definition through software design, implementation, and evaluation.
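
The informal statement of the rule can be illustrated with a small sketch. This is not code from the paper: the `ShardedCounter` class, its `inc` and `read` methods, and the `run` helper are hypothetical names chosen for illustration, and Python is used only for readability. The assumption is a counter whose `inc` operation returns nothing, so any two concurrent `inc` calls commute (no caller can observe the order in which they ran). The rule then suggests an implementation exists in which those calls do not conflict; here each worker writes only its own slot.

```python
import threading


class ShardedCounter:
    """Hypothetical counter whose inc() returns nothing, so inc() calls commute."""

    def __init__(self, num_shards):
        # One slot per worker. Each worker writes only its own slot.
        self._slots = [0] * num_shards

    def inc(self, shard):
        # Updates a single slot owned by the calling worker, so two
        # workers never write the same location.
        self._slots[shard] += 1

    def read(self):
        # read() does not commute with inc(): its result depends on
        # which increments came first, so it may visit every slot.
        return sum(self._slots)


def run(num_workers=4, incs_per_worker=1000):
    """Start num_workers threads that each increment their own slot."""
    counter = ShardedCounter(num_workers)

    def worker(shard):
        for _ in range(incs_per_worker):
            counter.inc(shard)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.read()
```

By contrast, an operation that increments and returns the new value would not commute with itself, because the returned values reveal the order of the calls. Note that a Python list does not place each slot on its own cache line, so this sketch shows only the shape of a conflict-free design, not a measurement of scalability.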

Published In

Communications of the ACM  Volume 60, Issue 8
August 2017
92 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/3127343
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Sloan Research Fellowship
  • Google
  • Quanta
  • NSF
  • Microsoft Research New Faculty Fellowship
