Project PBerry: FPGA Acceleration for Remote Memory

Authors: Aasheesh Kolli, Andreas Nowatzyk, Jayneel Gandhi, Pratap Subrahmanyam

HotOS '19: Proceedings of the Workshop on Hot Topics in Operating Systems

Pages 127 - 135

https://doi.org/10.1145/3317550.3321424

Published: 13 May 2019

Abstract

Recent research efforts propose remote memory systems that pool memory from multiple hosts. These systems rely on the virtual memory subsystem to track application memory accesses and transparently offer remote memory to applications. We outline several limitations of this approach, such as page fault overheads and dirty data amplification. Instead, we argue for a fundamentally different approach: leverage the local host's cache coherence traffic to track application memory accesses at cache line granularity. Our approach uses emerging cache-coherent FPGAs to expose cache coherence events to the operating system. This approach not only accelerates remote memory systems by reducing dirty data amplification and by eliminating page faults, but also enables other use cases, such as live virtual machine migration, unified virtual memory, security and code analysis. All of these use cases open up many promising research directions.
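The dirty data amplification mentioned above can be illustrated with a small back-of-the-envelope sketch. It is not taken from the paper: the 4 KiB page size, the 64-byte cache line size, the write trace, and the helper names (`dirty_blocks`, `writeback_bytes`, `amplification`) are all assumptions chosen for illustration. The sketch compares how many bytes would have to be sent to remote memory when dirtiness is tracked per page versus per cache line.

```python
PAGE_SIZE = 4096       # assumed 4 KiB base page (not stated in the abstract)
CACHE_LINE_SIZE = 64   # assumed 64-byte cache line (not stated in the abstract)


def dirty_blocks(writes, granule):
    """Return the set of granule-sized block indices touched by the writes.

    `writes` is an iterable of (address, length) pairs, both in bytes.
    """
    blocks = set()
    for addr, length in writes:
        first = addr // granule
        last = (addr + length - 1) // granule
        blocks.update(range(first, last + 1))
    return blocks


def writeback_bytes(writes, granule):
    """Bytes to write back if every touched block is sent in full."""
    return len(dirty_blocks(writes, granule)) * granule


def amplification(writes, granule):
    """Bytes written back divided by bytes actually modified.

    Assumes the writes do not overlap, so modified bytes is the sum of lengths.
    """
    modified = sum(length for _, length in writes)
    return writeback_bytes(writes, granule) / modified


if __name__ == "__main__":
    # Hypothetical trace: three 8-byte stores that land on three different pages.
    writes = [(0, 8), (4096, 8), (8292, 8)]
    for name, granule in (("page", PAGE_SIZE), ("cache line", CACHE_LINE_SIZE)):
        print(name, writeback_bytes(writes, granule), amplification(writes, granule))
```

For this hypothetical trace, page-granularity tracking writes back 12288 bytes for 24 modified bytes, while cache-line-granularity tracking writes back 192 bytes. Real workloads will differ; the sketch only shows why finer tracking granularity can reduce the data that must cross the network.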

Index Terms

Project PBerry: FPGA Acceleration for Remote Memory
1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Memory management
        Distributed memory

Published In

HotOS '19: Proceedings of the Workshop on Hot Topics in Operating Systems

May 2019

227 pages

ISBN:9781450367271

DOI:10.1145/3317550

Copyright © 2019 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

HotOS '19

Sponsor:

SIGOPS

HotOS '19: Workshop on Hot Topics in Operating Systems

May 13 - 15, 2019

Bertinoro, Italy
