FlashNet: Flash/Network Stack Co-Design

Published: 04 December 2018
    Abstract

    During the past decade, network and storage devices have undergone rapid performance improvements, delivering ultra-low latency and several Gbps of bandwidth. Nevertheless, current network and storage stacks fail to deliver this hardware performance to applications, often because stalled CPU performance erodes I/O efficiency. While many efforts attempt to address this issue solely in either the network or the storage stack, achieving high performance for networked-storage applications requires a holistic approach that considers both.
    In this article, we present FlashNet, a software I/O stack that unifies high-performance network properties with flash storage access and management. FlashNet builds on RDMA principles and abstractions to provide a direct, asynchronous, end-to-end data path between a client and remote flash storage. The key insight behind FlashNet is to co-design the stack’s components (an RDMA controller, a flash controller, and a file system) to enable cross-stack optimizations and maximize I/O efficiency. In micro-benchmarks, FlashNet improves 4kB network I/O operations per second (IOPS) by 38.6% to 1.22M, decreases access latency by 43.5% to 50.4μs, and prolongs the flash lifetime by 1.6-5.9× for writes. We illustrate the capabilities of FlashNet by building a key-value (KV) store and porting a distributed data store that uses RDMA to it. The use of FlashNet’s RDMA API improves the performance of the KV store by 2× and requires minimal changes for the ported data store to access remote flash devices.
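    The micro-benchmark figures above give only the FlashNet results and the relative changes, not the baseline values. As a rough sanity check, the sketch below back-computes the implied baselines and the payload rate at 1.22M IOPS. It assumes that each percentage is measured relative to the baseline and that "4kB" means 4,000 bytes; the function and variable names are illustrative and do not come from the article.

```python
# Back-of-the-envelope check of the micro-benchmark figures in the abstract.
# Assumptions (not stated in the abstract): each percentage is relative to
# the baseline stack, and "4kB" is read as 4,000 bytes (use 4096 for 4 KiB).

IO_SIZE_BYTES = 4 * 1000


def baseline_from_improvement(result: float, pct: float) -> float:
    """Baseline value when `result` is `pct` percent higher than the baseline."""
    return result / (1 + pct / 100)


def baseline_from_reduction(result: float, pct: float) -> float:
    """Baseline value when `result` is `pct` percent lower than the baseline."""
    return result / (1 - pct / 100)


# Figures quoted in the abstract.
flashnet_iops = 1.22e6
flashnet_latency_us = 50.4

# Values implied by the quoted percentages.
baseline_iops = baseline_from_improvement(flashnet_iops, 38.6)
baseline_latency_us = baseline_from_reduction(flashnet_latency_us, 43.5)

# Payload rate carried by 4kB operations at the quoted IOPS.
throughput_gbps = flashnet_iops * IO_SIZE_BYTES * 8 / 1e9

print(f"implied baseline IOPS:    {baseline_iops:,.0f}")
print(f"implied baseline latency: {baseline_latency_us:.1f} us")
print(f"payload rate at 1.22M IOPS: {throughput_gbps:.2f} Gbps")
```

    Under these assumptions the implied baselines are roughly 0.88M IOPS and 89μs, and 1.22M 4kB operations per second correspond to about 39 Gbps of payload. These are derived estimates, not numbers reported in the article.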

    Published In

    ACM Transactions on Storage, Volume 14, Issue 4
    Special Section on Systor 2017 and Regular Papers
    November 2018
    175 pages
    ISSN: 1553-3077
    EISSN: 1553-3093
    DOI: 10.1145/3297750
    Editor: Sam H. Noh
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 December 2018
    Accepted: 01 July 2018
    Received: 01 April 2018
    Published in TOS Volume 14, Issue 4

    Author Tags

    1. RDMA
    2. flash
    3. network storage
    4. operating systems
    5. performance

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Article Metrics

    • Downloads (Last 12 months): 28
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 27 Jul 2024

    Cited By

    • (2023) DPFS: DPU-Powered File System Virtualization. Proceedings of the 16th ACM International Conference on Systems and Storage, 1-7. DOI: 10.1145/3579370.3594769. Online publication date: 5-Jun-2023
    • (2023) Performance Characterization of Modern Storage Stacks: POSIX I/O, libaio, SPDK, and io_uring. Proceedings of the 3rd Workshop on Challenges and Opportunities of Efficient and Performant Storage Systems, 35-45. DOI: 10.1145/3578353.3589545. Online publication date: 8-May-2023
    • (2023) Performance Characterization of NVMe Flash Devices with Zoned Namespaces (ZNS). 2023 IEEE International Conference on Cluster Computing (CLUSTER), 118-131. DOI: 10.1109/CLUSTER52292.2023.00018. Online publication date: 31-Oct-2023
    • (2021) Toward a better understanding and evaluation of tree structures on flash SSDs. Proceedings of the VLDB Endowment 14:3, 364-377. DOI: 10.14778/3430915.3430926. Online publication date: 9-Dec-2021
    • (2021) The Demikernel Datapath OS Architecture for Microsecond-scale Datacenter Systems. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, 195-211. DOI: 10.1145/3477132.3483569. Online publication date: 26-Oct-2021
