Transactional Memory Coherence and Consistency

Authors:

Brian D. Carlstrom,

Manohar K. Prabhu,

Christos Kozyrakis,

Kunle Olukotun

ACM SIGARCH Computer Architecture News, Volume 32, Issue 2

Page 102

https://doi.org/10.1145/1028176.1006711

Published: 02 March 2004

Abstract

In this paper, we propose a new shared memory model: Transactional memory Coherence and Consistency (TCC). TCC provides a model in which atomic transactions are always the basic unit of parallel work, communication, memory coherence, and memory reference consistency. TCC greatly simplifies parallel software by eliminating the need for synchronization using conventional locks and semaphores, along with their complexities. TCC hardware must combine all writes from each transaction region in a program into a single packet and broadcast this packet to the permanent shared memory state atomically as a large block. This simplifies the coherence hardware because it reduces the need for small, low-latency messages and completely eliminates the need for conventional snoopy cache coherence protocols, as multiple speculatively written versions of a cache line may safely coexist within the system. Meanwhile, automatic, hardware-controlled rollback of speculative transactions resolves any correctness violations that may occur when several processors attempt to read and write the same data simultaneously. The cost of this simplified scheme is higher interprocessor bandwidth. To explore the costs and benefits of TCC, we study the characteristics of an optimal transaction-based memory system, and examine how different design parameters could affect the performance of real systems. Across a spectrum of applications, the TCC model itself did not limit available parallelism. Most applications are easily divided into transactions requiring only small write buffers, on the order of 4-8 KB. The broadcast requirements of TCC are high, but are well within the capabilities of CMPs and small-scale SMPs with high-speed interconnects.
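The commit-and-rollback behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the names SharedMemory, Processor, snoop, and run_demo are hypothetical, addresses are plain dictionary keys rather than cache lines, and a violated transaction is detected when it tries to commit, whereas real hardware could roll back as soon as the conflicting broadcast arrives.

```python
class SharedMemory:
    """Toy model of the permanent shared memory state (hypothetical names)."""

    def __init__(self):
        self.data = {}
        self.processors = []

    def commit(self, committer, write_buffer):
        # Apply the whole write packet in one step (atomic in this model).
        self.data.update(write_buffer)
        # Broadcast the written addresses so other processors can check them.
        for proc in self.processors:
            if proc is not committer:
                proc.snoop(write_buffer.keys())


class Processor:
    """Toy model of one processor running speculative transactions."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = memory
        self.rollbacks = 0
        memory.processors.append(self)
        self.begin()

    def begin(self):
        # Start a fresh transaction: no speculative reads or writes yet.
        self.read_set = set()
        self.write_buffer = {}
        self.violated = False

    def read(self, addr):
        # A transaction sees its own buffered writes first.
        if addr in self.write_buffer:
            return self.write_buffer[addr]
        self.read_set.add(addr)
        return self.memory.data.get(addr, 0)

    def write(self, addr, value):
        # Writes stay in the local buffer until commit.
        self.write_buffer[addr] = value

    def snoop(self, addrs):
        # Another transaction committed data this one already read.
        if self.read_set & set(addrs):
            self.violated = True

    def commit(self):
        # Simplification: the violation is acted on at commit time.
        if self.violated:
            self.rollbacks += 1
            self.begin()  # discard speculative state; caller re-executes
            return False
        self.memory.commit(self, self.write_buffer)
        self.begin()
        return True


def increment(proc, addr):
    """Body of one transaction: read-modify-write of a shared counter."""
    proc.write(addr, proc.read(addr) + 1)


def run_demo():
    mem = SharedMemory()
    p0, p1 = Processor("P0", mem), Processor("P1", mem)
    increment(p0, "counter")
    increment(p1, "counter")   # both read the old value speculatively
    first = p0.commit()        # P0 commits; its broadcast violates P1
    second = p1.commit()       # P1 is rolled back instead of committing
    increment(p1, "counter")   # P1 re-executes against the new state
    third = p1.commit()
    return mem.data["counter"], (first, second, third), p1.rollbacks
```

In this model no locks appear in the transaction body; the conflict between the two increments is resolved by discarding and re-running the later transaction, which mirrors the abstract's claim that rollback, not explicit synchronization, preserves correctness.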

Published In

ACM SIGARCH Computer Architecture News Volume 32, Issue 2

ISCA 2004

March 2004

373 pages

ISSN:0163-5964

DOI:10.1145/1028176

ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture
June 2004
373 pages
ISBN:0769521436

Copyright © 2004 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States
