DOI: 10.1145/2367589.2367595
research-article

GPUstore: harnessing GPU computing for storage systems in the OS kernel

Published: 04 June 2012

Abstract

Many storage systems include computationally expensive components. Examples include encryption for confidentiality, checksums for integrity, and error correcting codes for reliability. As storage systems become larger, faster, and serve more clients, the demands placed on their computational components increase and they can become performance bottlenecks. Many of these computational tasks are inherently parallel: they can be run independently for different blocks, files, or I/O requests. This makes them a good fit for GPUs, a class of processor designed specifically for high degrees of parallelism: consumer-grade GPUs have hundreds of cores and are capable of running hundreds of thousands of concurrent threads. However, because the software frameworks built for GPUs have been designed primarily for the long-running, data-intensive workloads seen in graphics or high-performance computing, they are not well-suited to the needs of storage systems.
In this paper, we present GPUstore, a framework for integrating GPU computing into storage systems. GPUstore is designed to match the programming models already used in these systems. We have prototyped GPUstore in the Linux kernel and demonstrate its use in three storage subsystems: file-level encryption, block-level encryption, and RAID 6 data recovery. Comparing our GPU-accelerated drivers with the mature CPU-based implementations in the Linux kernel, we show performance improvements of up to an order of magnitude.

Published In

SYSTOR '12: Proceedings of the 5th Annual International Systems and Storage Conference
June 2012
183 pages
ISBN:9781450314480
DOI:10.1145/2367589
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • The Technion - Israel Institute of Technology

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SYSTOR '12
Sponsor:
  • The Technion - Israel Institute of Technology

Acceptance Rates

Overall Acceptance Rate 108 of 323 submissions, 33%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 19
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 28 Feb 2025

Citations

Cited By

  • (2023) Towards a Machine Learning-Assisted Kernel with LAKE. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pp. 846-861. DOI: 10.1145/3575693.3575697. Online publication date: 27-Jan-2023
  • (2019) Speculative encryption on GPU applied to cryptographic file systems. Proceedings of the 17th USENIX Conference on File and Storage Technologies, pp. 93-105. DOI: 10.5555/3323298.3323307. Online publication date: 25-Feb-2019
  • (2019) Performance evaluation of container-based virtualization for high performance computing environments. Revista UIS Ingenierías, 18:4, pp. 31-42. DOI: 10.18273/revuin.v18n4-2019003. Online publication date: 16-Jul-2019
  • (2019) Augmenting Operating Systems with OpenCL Accelerators. ACM Transactions on Design Automation of Electronic Systems, 24:3, pp. 1-29. DOI: 10.1145/3315569. Online publication date: 28-Mar-2019
  • (2018) Cooperative GPGPU Scheduling for Consolidating Server Workloads. IEICE Transactions on Information and Systems, E101.D:12, pp. 3019-3037. DOI: 10.1587/transinf.2018EDP7027. Online publication date: 1-Dec-2018
  • (2018) Research Challenges for Network Function Virtualization - Re-Architecting Middlebox for High Performance and Efficient, Elastic and Resilient Platform to Create New Services -. IEICE Transactions on Communications, E101.B:1, pp. 96-122. DOI: 10.1587/transcom.2017EBI0001. Online publication date: 2018
  • (2018) G-CRS: GPU Accelerated Cauchy Reed-Solomon Coding. IEEE Transactions on Parallel and Distributed Systems, 29:7, pp. 1484-1498. DOI: 10.1109/TPDS.2018.2791438. Online publication date: 1-Jul-2018
  • (2018) An Application Framework for Migrating GPGPU Cloud Applications. 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pp. 62-66. DOI: 10.1109/CloudCom2018.2018.00026. Online publication date: Dec-2018
  • (2018) GPU-accelerated high-performance encoding and decoding of hierarchical RAID in virtual machines. The Journal of Supercomputing, 74:11, pp. 5865-5888. DOI: 10.1007/s11227-017-1969-y. Online publication date: 1-Nov-2018
  • (2017) Catalyst. ACM SIGPLAN Notices, 52:7, pp. 44-59. DOI: 10.1145/3140607.3050760. Online publication date: 8-Apr-2017
