DOI: 10.1145/2485732.2485740
research-article

Linux block IO: introducing multi-queue SSD access on multi-core systems

Published: 30 June 2013

Abstract

The IO performance of storage devices has accelerated from hundreds of IOPS five years ago to hundreds of thousands of IOPS today, with tens of millions of IOPS projected within five years. This sharp evolution is primarily due to the introduction of NAND-flash devices and their data-parallel design. In this work, we demonstrate that the block layer within the operating system, originally designed to handle thousands of IOPS, has become a bottleneck to overall storage system performance, especially on the high-NUMA-factor processor systems that are becoming commonplace. We describe the design of a next-generation block layer that is capable of handling tens of millions of IOPS on a multi-core system equipped with a single storage device. Our experiments show that our design scales gracefully with the number of cores, even on NUMA systems with multiple sockets.
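To give a rough sense of the scale described in the abstract, the sketch below converts an IOPS rate into the average time available per request. The three rates are assumed round numbers chosen to match the abstract's orders of magnitude, not measurements from the paper, and the function name is hypothetical.

```python
def per_request_budget_ns(iops: float) -> float:
    """Average time available per request, in nanoseconds, at a given IOPS rate."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return 1e9 / iops


# Assumed representative rates (illustrative only, not data from the paper).
RATES = {
    "hundreds of IOPS": 100,
    "hundreds of thousands of IOPS": 100_000,
    "tens of millions of IOPS": 10_000_000,
}


def budgets() -> dict:
    """Map each assumed rate label to its per-request time budget in nanoseconds."""
    return {label: per_request_budget_ns(rate) for label, rate in RATES.items()}


if __name__ == "__main__":
    for label, ns in budgets().items():
        print(f"{label}: {ns:,.0f} ns per request")
```

At the assumed rate of ten million IOPS, the whole software stack has on the order of 100 ns per request on average, which suggests why a layer designed for thousands of IOPS can become the limiting factor.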

Published In

SYSTOR '13: Proceedings of the 6th International Systems and Storage Conference
June 2013
198 pages
ISBN:9781450321167
DOI:10.1145/2485732
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. block layer
  2. latency
  3. linux
  4. non-volatile memory
  5. solid state drives
  6. throughput

Qualifiers

  • Research-article

Conference

SYSTOR '13
Sponsor:
  • INTEL
  • Riverbed
  • Technion
  • SIGOPS
  • EMC²
  • AXCIENT
  • USENIX Assoc
  • IBM
  • HP

Acceptance Rates

SYSTOR '13 Paper Acceptance Rate 20 of 49 submissions, 41%;
Overall Acceptance Rate 108 of 323 submissions, 33%
