DOI:10.1145/3183713.3196896
research-article

HOT: A Height Optimized Trie Index for Main-Memory Database Systems

Published: 27 May 2018

Abstract

We present the Height Optimized Trie (HOT), a fast and space-efficient in-memory index structure. The core algorithmic idea of HOT is to dynamically vary the number of bits considered at each node, which enables a consistently high fanout and thereby good cache efficiency. The layout of each node is carefully engineered for compactness and fast search using SIMD instructions. Our experimental results, which use a wide variety of workloads and data sets, show that HOT outperforms other state-of-the-art index structures for string keys both in terms of search performance and memory footprint, while being competitive for integer keys. We believe that these properties make HOT highly useful as a general-purpose index structure for main-memory databases.
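
To make the idea of "varying the number of bits considered at each node" concrete, here is a minimal sketch of a single node that discriminates on a data-dependent set of bit positions. It is an illustration under simplifying assumptions, not the paper's node layout: keys are assumed to be distinct byte strings of equal length, the names `first_diff_bit`, `extract_bits`, and `SketchNode` are hypothetical, and a plain dictionary stands in for the SIMD search mentioned in the abstract.

```python
# Illustrative sketch only: a single trie node that looks at just the bit
# positions needed to tell its keys apart. All names here are hypothetical.

def first_diff_bit(a: bytes, b: bytes) -> int:
    """Return the index of the first differing bit (0 = MSB of byte 0)."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i * 8 + (8 - (x ^ y).bit_length())
    raise ValueError("keys must be distinct and of equal length")


def extract_bits(key: bytes, positions: list[int]) -> int:
    """Concatenate the bits of `key` at `positions` into a small integer."""
    partial = 0
    for pos in positions:
        bit = (key[pos // 8] >> (7 - pos % 8)) & 1
        partial = (partial << 1) | bit
    return partial


class SketchNode:
    def __init__(self, keys: list[bytes]):
        keys = sorted(set(keys))
        self.key_len = len(keys[0])
        # For sorted keys, the first differing bits of adjacent pairs are
        # enough to distinguish every pair, so n keys need at most n - 1 bits.
        self.positions = sorted(
            {first_diff_bit(a, b) for a, b in zip(keys, keys[1:])}
        )
        # Map each partial key to the full key it stands for.
        self.entries = {extract_bits(k, self.positions): k for k in keys}

    def lookup(self, key: bytes) -> bool:
        if len(key) != self.key_len:
            return False
        candidate = self.entries.get(extract_bits(key, self.positions))
        # Bits outside `positions` were ignored, so verify the full key.
        return candidate == key


node = SketchNode([b"abcd", b"abce", b"abzz", b"qbcd"])
print(node.positions)        # bit positions this node inspects
print(node.lookup(b"abzz"))  # a stored key
print(node.lookup(b"abcf"))  # same partial key as a stored key, but absent
```

In this sketch the node inspects only three of the 32 key bits, and which bits it inspects depends on the stored keys rather than on a fixed span. The final full-key comparison is needed because two different keys can agree on every inspected position.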


Published In

SIGMOD '18: Proceedings of the 2018 International Conference on Management of Data
May 2018
1874 pages
ISBN:9781450347037
DOI:10.1145/3183713

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. height optimized trie
  2. index
  3. main memory
  4. simd

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '18

Acceptance Rates

SIGMOD '18 Paper Acceptance Rate 90 of 461 submissions, 20%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%
