Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3673038.3673113acmotherconferencesArticle/Chapter ViewAbstractPublication PagesicppConference Proceedingsconference-collections
research-article
Open access

Revisiting Learned Index with Byte-addressable Persistent Storage

Published: 12 August 2024 Publication History

Abstract

Byte-addressable Persistent Storage (BPS), such as persistent memory and CXL-enabled SSDs, has become an extension of main memory. This opens up new possibilities for indexes that operate and persist data directly on the memory bus. Recent learned indexes exploit data distribution and have shown great potential for some workloads. Despite some work proposed for integrating learned indexes into BPS, they are mainly based on Intel’s first-generation persistent memory. The current design suffers from the following problems: 1) Excessive storage line accesses due to large node in learned indexes; 2) Inefficient concurrency control due to volatile cache; 3) Write amplification due to mismatch access granularity.
To resolve these challenges, we introduce a new learned index named PFLX, featuring three key design improvements: 1) a Storage Line-Friendly Node Layout that enhances search and insertion operations and minimizes storage line accesses through strategic use of pointers; 2) a Persistent Cache-based Lock-free Concurrency Mechanism that utilizes atomic primitives to serialize operations in leaf nodes effectively; 3) a Selective Data Flush Mechanism designed to reduce write amplification. Evaluations show that PFLX outperforms the state-of-the-art persistent indexes by 1.2 ∼ 3 ×.

References

[1]
[n. d.]. The difference between RAM speed and CAS latency.https://www.crucial.com/articles/about-memory/difference-between-speed-and-latency.
[2]
Joy Arulraj, Justin Levandoski, Umar Farooq Minhas, and Per-Ake Larson. 2018. Bztree: A High-Performance Latch-Free Range Index for Non-Volatile Memory. Proc. VLDB Endow. 11, 5 (oct 2018), 553–565. https://doi.org/10.1145/3164135.3164147
[3]
AWS. 2021. OpenStreetMap on AWS. https://registry.opendata.aws/osm.
[4]
Youmin Chen, Youyou Lu, Fan Yang, Qing Wang, Yang Wang, and Jiwu Shu. 2020. FlatStore: An Efficient Log-Structured Key-Value Storage Engine for Persistent Memory. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (Lausanne, Switzerland) (ASPLOS ’20). Association for Computing Machinery, New York, NY, USA, 1077–1091. https://doi.org/10.1145/3373376.3378515
[5]
Jialin Ding, Umar Farooq Minhas, Jia Yu, Chi Wang, Jaeyoung Do, Yinan Li, Hantian Zhang, Badrish Chandramouli, Johannes Gehrke, Donald Kossmann, David Lomet, and Tim Kraska. 2020. ALEX: An Updatable Adaptive Learned Index. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (Portland, OR, USA) (SIGMOD ’20). Association for Computing Machinery, New York, NY, USA, 969–984. https://doi.org/10.1145/3318464.3389711
[6]
Jialin Ding, Vikram Nathan, Mohammad Alizadeh, and Tim Kraska. 2020. Tsunami: A Learned Multi-Dimensional Index for Correlated Data and Skewed Workloads. Proc. VLDB Endow. 14, 2 (oct 2020), 74–86. https://doi.org/10.14778/3425879.3425880
[7]
Paolo Ferragina and Giorgio Vinciguerra. 2020. The PGM-Index: A Fully-Dynamic Compressed Learned Index with Provable Worst-Case Bounds. Proc. VLDB Endow. 13, 8 (apr 2020), 1162–1175. https://doi.org/10.14778/3389133.3389135
[8]
Chuanxiong Guo, Haitao Wu, Zhong Deng, Gaurav Soni, Jianxi Ye, Jitu Padhye, and Marina Lipshteyn. 2016. RDMA over Commodity Ethernet at Scale. In Proceedings of the 2016 ACM SIGCOMM Conference (Florianopolis, Brazil) (SIGCOMM ’16). Association for Computing Machinery, New York, NY, USA, 202–215. https://doi.org/10.1145/2934872.2934908
[9]
Yibo Huang, Yukai Huang, Ming Yan, Jiayu Hu, Cunming Liang, Yang Xu, Wenxiong Zou, Yiming Zhang, Rui Zhang, Chunpu Huang, and Jie Wu. 2022. An ultra-low latency and compatible PCIe interconnect for rack-scale communication. In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (Roma, Italy) (CoNEXT ’22). Association for Computing Machinery, New York, NY, USA, 232–244. https://doi.org/10.1145/3555050.3569128
[10]
Deukyeon Hwang, Wook-Hee Kim, Youjip Won, and Beomseok Nam. 2018. Endurable Transient Inconsistency in Byte-Addressable Persistent B+-Tree. In 16th USENIX Conference on File and Storage Technologies (FAST 18). USENIX Association, Oakland, CA, 187–200. https://www.usenix.org/conference/fast18/presentation/hwang
[11]
Intel. 2021. eADR: New Opportunities for Persistent Memory Applications. https://www.intel.com/content/www/us/en/developer/articles/technical/eadr-new-opportunities-for-persistent-memory-applications.html
[12]
Intel. 2023. Intel® 64 and ia-32 architectures optimization reference manual. https://www.intel.com/content/www/us/en/content-details/671488/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html.
[13]
Andreas Kipf, Ryan Marcus, Alexander van Renen, Mihail Stoian, Alfons Kemper, Tim Kraska, and Thomas Neumann. 2020. RadixSpline: A Single-Pass Learned Index. In Proceedings of the Third International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (Portland, Oregon) (aiDM ’20). Association for Computing Machinery, New York, NY, USA, Article 5, 5 pages. https://doi.org/10.1145/3401071.3401659
[14]
Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In Proceedings of the 2018 International Conference on Management of Data (Houston, TX, USA) (SIGMOD ’18). Association for Computing Machinery, New York, NY, USA, 489–504. https://doi.org/10.1145/3183713.3196909
[15]
Pengfei Li, Yu Hua, Jingnan Jia, and Pengfei Zuo. 2021. FINEdex: A Fine-Grained Learned Index Scheme for Scalable and Concurrent Memory Systems. Proc. VLDB Endow. 15, 2 (oct 2021), 321–334. https://doi.org/10.14778/3489496.3489512
[16]
Baotong Lu, Jialin Ding, Eric Lo, Umar Farooq Minhas, and Tianzheng Wang. 2021. APEX: A High-Performance Learned Index on Persistent Memory. Proc. VLDB Endow. 15, 3 (nov 2021), 597–610. https://doi.org/10.14778/3494124.3494141
[17]
Chao Wang, Junliang Hu, Tsun-Yu Yang, Yuhong Liang, and Ming-Chang Yang. 2023. SEPH: Scalable, Efficient, and Predictable Hashing on Persistent Memory. In 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). USENIX Association, Boston, MA, 479–495. https://www.usenix.org/conference/osdi23/presentation/wang-chao
[18]
Qing Xie, Chaoyi Pang, Xiaofang Zhou, Xiangliang Zhang, and Ke Deng. 2014. Maximum error-bounded piecewise linear representation for online stream approximation. The VLDB journal 23 (2014), 915–937.
[19]
Jian Yang, Juno Kim, Morteza Hoseinzadeh, Joseph Izraelevitz, and Steve Swanson. 2020. An Empirical Guide to the Behavior and Use of Scalable Persistent Memory. In 18th USENIX Conference on File and Storage Technologies (FAST 20). USENIX Association, Santa Clara, CA, 169–182. https://www.usenix.org/conference/fast20/presentation/yang
[20]
Shao-Peng Yang, Minjae Kim, Sanghyun Nam, Juhyung Park, Jin yong Choi, Eyee Hyun Nam, Eunji Lee, Sungjin Lee, and Bryan S. Kim. 2023. Overcoming the Memory Wall with CXL-Enabled SSDs. In 2023 USENIX Annual Technical Conference (USENIX ATC 23). USENIX Association, Boston, MA, 601–617. https://www.usenix.org/conference/atc23/presentation/yang-shao-peng
[21]
Bowen Zhang, Shengan Zheng, Zhenlin Qi, and Linpeng Huang. 2022. NBTree: A Lock-Free PM-Friendly Persistent B+-Tree for EADR-Enabled PM Systems. Proc. VLDB Endow. 15, 6 (feb 2022), 1187–1200. https://doi.org/10.14778/3514061.3514066
[22]
Rui Zhang, Yao Wu, Shangyi Sun, Lulu Chen, Yibo Huang, Ming Yan, and Jie Wu. 2023. PFtree: Optimizing Persistent Adaptive Radix Tree for PM Systems on eADR Platform. In Database Systems for Advanced Applications, Xin Wang, Maria Luisa Sapino, Wook-Shin Han, Amr El Abbadi, Gill Dobbie, Zhiyong Feng, Yingxiao Shao, and Hongzhi Yin (Eds.). Springer Nature Switzerland, Cham, 46–61.
[23]
Zhou Zhang, Zhaole Chu, Peiquan Jin, Yongping Luo, Xike Xie, Shouhong Wan, Yun Luo, Xufei Wu, Peng Zou, Chunyang Zheng, Guoan Wu, and Andy Rudoff. 2022. PLIN: A Persistent Learned Index for Non-Volatile Memory with High Performance and Instant Recovery. Proc. VLDB Endow. 16, 2 (oct 2022), 243–255. https://doi.org/10.14778/3565816.3565826
[24]
Zhou Zhang, Pei-Quan Jin, Xiao-Liang Wang, Yan-Qi Lv, Shou-Hong Wan, and Xi-Ke Xie. 2021. COLIN: a cache-conscious dynamic learned index with high read/write performance. Journal of Computer Science and Technology 36 (2021), 721–740.
[25]
Xinjing Zhou, Lidan Shou, Ke Chen, Wei Hu, and Gang Chen. 2019. DPTree: Differential Indexing for Persistent Memory. Proc. VLDB Endow. 13, 4 (dec 2019), 421–434. https://doi.org/10.14778/3372716.3372717

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICPP '24: Proceedings of the 53rd International Conference on Parallel Processing
August 2024
1279 pages
ISBN:9798400717932
DOI:10.1145/3673038
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 August 2024

Check for updates

Author Tags

  1. Byte-addressable Persistent Storage
  2. CXL
  3. Learned Index
  4. Persistent Memory

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Key Research and Development Program of China

Conference

ICPP '24

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 201
    Total Downloads
  • Downloads (Last 12 months)201
  • Downloads (Last 6 weeks)64
Reflects downloads up to 13 Jan 2025

Other Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media