research-article

Your read is our priority in flash storage

Published: 01 May 2022

Abstract

When replacing a dirty victim page on a page miss, conventional buffer managers first flush the dirty victim to storage and only then read the missing page. This read-after-write (RAW) protocol, unfortunately, causes the read stall problem on flash storage: because of the asymmetric I/O speed and parallelism of flash storage, clean frames are quickly consumed, so the read for the missing page often has to wait for the slow write to complete and the frame to become clean, since both operations contend for the same buffer frame. RAW thus often blocks performance-critical synchronous reads behind writes, severely degrading transaction throughput and latency. In addition, its strict I/O ordering leaves the abundant parallelism of flash storage under-utilized.
To avoid read stalls in the DBMS buffer, we propose RW (fused read and write) as a new storage interface. With RW, on a read stall the buffer manager can issue the read and write requests to the storage at once. Once the dirty page has been copied to the storage buffer, the storage can immediately serve the read. In addition, to resolve read stalls in the flash storage buffer, we propose R-Buf, in which the read buffer is separated from the write buffer so that reads can proceed without stalling. RW and R-Buf work at different layers and complement each other when used together. We prototype RW and R-Buf on a real Cosmos+ OpenSSD board. Evaluation results show that RW alone improves TPC-C throughput over RAW by 3.2x and, combined with R-Buf, by 3.9x. In addition, we demonstrate that R-Buf effectively mitigates I/O interference in multi-tenancy.
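The difference between RAW and RW on a page miss can be illustrated with a toy latency model. This is only a sketch under stated assumptions: the class `FlashTimings`, the two functions, and all latency values are hypothetical and are not taken from the paper, and the model ignores queueing, parallelism, and storage-side buffering effects such as those addressed by R-Buf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlashTimings:
    """Hypothetical per-page latencies in microseconds (illustrative only)."""
    read_us: float = 100.0      # flash page read
    program_us: float = 1000.0  # flash page program, i.e. the slow write
    copy_us: float = 10.0       # transfer of the dirty page into the storage buffer


def raw_miss_latency(t: FlashTimings, victim_dirty: bool) -> float:
    """RAW: wait for the dirty victim's write to complete, then read."""
    flush = t.program_us if victim_dirty else 0.0
    return flush + t.read_us


def rw_miss_latency(t: FlashTimings, victim_dirty: bool) -> float:
    """Fused RW: the read is served once the dirty page has been copied to
    the storage buffer; the flash program is assumed to finish in the
    background and is therefore off the critical path."""
    copy = t.copy_us if victim_dirty else 0.0
    return copy + t.read_us


if __name__ == "__main__":
    timings = FlashTimings()
    for dirty in (False, True):
        print(
            f"victim_dirty={dirty}: "
            f"RAW={raw_miss_latency(timings, dirty)} us, "
            f"RW={rw_miss_latency(timings, dirty)} us"
        )
```

In this simplified model the two protocols behave identically when the victim is clean, and they differ only in whether the slow flash program sits on the read's critical path when the victim is dirty.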


Cited By

  • (2024) Volley: Accelerating Write-Read Orders in Disaggregated Storage. Proceedings of the Nineteenth European Conference on Computer Systems, 657-673. DOI: 10.1145/3627703.3650090. Online publication date: 22-Apr-2024
  • (2023) LRU-C: Parallelizing Database I/Os for Flash SSDs. Proceedings of the VLDB Endowment 16:9, 2364-2376. DOI: 10.14778/3598581.3598605. Online publication date: 10-Jul-2023
  • (2023) NV-SQL: Boosting OLTP Performance with Non-Volatile DIMMs. Proceedings of the VLDB Endowment 16:6, 1453-1465. DOI: 10.14778/3583140.3583159. Online publication date: 20-Apr-2023

Published In

Proceedings of the VLDB Endowment  Volume 15, Issue 9
May 2022
239 pages
ISSN:2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 May 2022
Published in PVLDB Volume 15, Issue 9


