ACID Support for Compute eXpress Link Memory Transactions

Authors: Ellis Giles, Peter Varman

SC-W '24: Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis

Pages 982 - 995

https://doi.org/10.1109/SCW63240.2024.00138

Published: 11 February 2025

Abstract

With the recent explosive growth in worldwide data and data processing demands, the need to support a large volume of transactions on shared data is increasing in both high performance computing and datacenter processing. A recent innovation in server architectures is the use of disaggregated memory based on the Compute eXpress Link (CXL) interconnect protocol. This memory architecture is growing in popularity because it allows dynamic, demand-sensitive resizing of aggregated memory, supports heterogeneous memory types, and enables data sharing among supported processors and devices, including computational accelerators. However, while this new memory architecture alleviates many concerns in datacenter design, preserving data integrity for memory-based transactions over CXL remains challenging.

To address these challenges, we describe a novel solution for providing ACID (Atomicity, Consistency, Isolation, Durability) transactions in a CXL-based disaggregated memory architecture.

We call this solution Transactional CXL, or TCXL. TCXL requires no changes to existing processor microarchitectures. It is implemented as a software library with a back-end controller that can be embedded in a CXL controller, deployed as a standalone CXL device, or implemented on a host.

TCXL supports both durable transactions on persistent memory and volatile in-memory transactions; the memory involved can be a pooled memory expansion for a single processor or shared among multiple processors. TCXL also supports processor-based Hardware Transactional Memory (HTM) transactions, both on the processor and over CXL. We evaluate TCXL by extending a CXL simulator and executing micro-benchmarks. In addition to gaining the benefits of using CXL, we show that TCXL outperforms other approaches.

Published In

SC-W '24: Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis

November 2024

2235 pages

ISBN: 9798350355543

Sponsors

SIGHPC: ACM Special Interest Group on High Performance Computing

Publisher

IEEE Press

Qualifiers

Research-article
Research
Refereed limited

Conference

SC '24: The International Conference for High Performance Computing, Networking, Storage, and Analysis

November 17 - 22, 2024

Atlanta, GA, USA

Acceptance Rates

Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
