Transactional Memory Coherence and Consistency

Published: 02 March 2004

Abstract

In this paper, we propose a new shared memory model: Transactional memory Coherence and Consistency (TCC). TCC provides a model in which atomic transactions are always the basic unit of parallel work, communication, memory coherence, and memory reference consistency. TCC greatly simplifies parallel software by eliminating the need for synchronization using conventional locks and semaphores, along with their complexities. TCC hardware must combine all writes from each transaction region in a program into a single packet and broadcast this packet to the permanent shared memory state atomically as a large block. This simplifies the coherence hardware because it reduces the need for small, low-latency messages and completely eliminates the need for conventional snoopy cache coherence protocols, as multiple speculatively written versions of a cache line may safely coexist within the system. Meanwhile, automatic, hardware-controlled rollback of speculative transactions resolves any correctness violations that may occur when several processors attempt to read and write the same data simultaneously. The cost of this simplified scheme is higher interprocessor bandwidth. To explore the costs and benefits of TCC, we study the characteristics of an optimal transaction-based memory system, and examine how different design parameters could affect the performance of real systems. Across a spectrum of applications, the TCC model itself did not limit available parallelism. Most applications are easily divided into transactions requiring only small write buffers, on the order of 4-8 KB. The broadcast requirements of TCC are high, but are well within the capabilities of CMPs and small-scale SMPs with high-speed interconnects.
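The commit-and-rollback behavior described in the abstract can be illustrated with a small software model. This is a minimal sketch, not the hardware design from the paper: the names `Processor`, `TCCSystem`, `load`, `store`, `commit`, and `demo` are hypothetical, read-set tracking per processor is an assumption about how violations might be detected, and commits are serialized simply by calling them one at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical node running one speculative transaction at a time."""
    name: str
    read_set: set = field(default_factory=set)        # addresses read speculatively
    write_buffer: dict = field(default_factory=dict)  # address -> buffered value
    violated: bool = False

    def reset(self):
        # Discard all speculative state (used after a commit or a rollback).
        self.read_set.clear()
        self.write_buffer.clear()
        self.violated = False


class TCCSystem:
    """Toy model: committed memory plus a set of processors (an assumption)."""

    def __init__(self, names):
        self.memory = {}
        self.procs = {n: Processor(n) for n in names}

    def load(self, proc, addr):
        # A transaction sees its own buffered writes before committed state.
        if addr in proc.write_buffer:
            return proc.write_buffer[addr]
        proc.read_set.add(addr)
        return self.memory.get(addr, 0)

    def store(self, proc, addr, value):
        # Writes stay local until commit; nothing is broadcast yet.
        proc.write_buffer[addr] = value

    def commit(self, proc):
        if proc.violated:
            proc.reset()          # rollback: the transaction must re-execute
            return False
        packet = dict(proc.write_buffer)   # all writes combined into one packet
        self.memory.update(packet)         # applied to shared state as one block
        for other in self.procs.values():
            # Any other transaction that read an address in the packet used stale data.
            if other is not proc and not other.read_set.isdisjoint(packet):
                other.violated = True
        proc.reset()
        return True


def demo():
    system = TCCSystem(["p0", "p1"])
    p0, p1 = system.procs["p0"], system.procs["p1"]
    # Both processors speculatively increment the same counter.
    for p in (p0, p1):
        system.store(p, "x", system.load(p, "x") + 1)
    first = system.commit(p0)    # commits and violates p1
    second = system.commit(p1)   # p1 rolls back instead of committing
    system.store(p1, "x", system.load(p1, "x") + 1)  # p1 re-executes
    third = system.commit(p1)
    return first, second, third, system.memory["x"]
```

In this model no locks are taken: the conflicting increment is detected when the first packet is broadcast, and the losing transaction is simply re-executed against the updated state.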

Published In

ACM SIGARCH Computer Architecture News, Volume 32, Issue 2
ISCA 2004
March 2004
373 pages
ISSN: 0163-5964
DOI: 10.1145/1028176
  • ISCA '04: Proceedings of the 31st annual international symposium on Computer architecture
    June 2004
    373 pages
    ISBN: 0769521436

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Article Metrics

  • Downloads (Last 12 months): 78
  • Downloads (Last 6 weeks): 12
Reflects downloads up to 06 Oct 2024

Cited By

  • (2024) LockillerTM: Enhancing Performance Lower Bounds in Best-Effort Hardware Transactional Memory. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS). DOI: 10.1109/IPDPS57955.2024.00081 (865-875). Online publication date: 27-May-2024
  • (2024) Exploring Multiprocessor Approaches to Time Series Analysis. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2024.104855 (104855). Online publication date: Feb-2024
  • (2022) Hardware-Supported Transactional Memory. Transactional Memory. DOI: 10.1007/978-3-031-01719-3_4 (131-205). Online publication date: 17-Oct-2022
  • (2021) Adaptive Versioning in Transactional Memory Systems. Algorithms 14:6. DOI: 10.3390/a14060171 (171). Online publication date: 31-May-2021
  • (2020) Formalizing determinacy of concurrent revisions. Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs. DOI: 10.1145/3372885.3373820 (258-269). Online publication date: 20-Jan-2020
  • (2020) Survey of machine learning application in transactional memory. 2020 28th Telecommunications Forum (TELFOR). DOI: 10.1109/TELFOR51502.2020.9306547 (1-4). Online publication date: 24-Nov-2020
  • (2019) Supporting peripherals in intermittent systems with just-in-time checkpoints. Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation. DOI: 10.1145/3314221.3314613 (1101-1116). Online publication date: 8-Jun-2019
  • (2019) Providing Integrity in Real-Time Networks-on-Chip. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27:8. DOI: 10.1109/TVLSI.2019.2906471 (1907-1920). Online publication date: 1-Aug-2019
  • (2019) Multiversioned Page Overlays: Enabling Faster Serializable Hardware Transactional Memory. 2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT). DOI: 10.1109/PACT.2019.00038 (395-408). Online publication date: Sep-2019
  • (2019) Improving hardware transactional memory parallelization of computational geometry algorithms using privatizing transactions. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2019.04.018. Online publication date: May-2019
