research-article

LIT: Lightning-fast In-memory Temporal Indexing

Published: 26 March 2024

Abstract

We study the problem of temporal database indexing, i.e., indexing the versions of a database table in an evolving database. With today's larger and cheaper memory chips, we can afford to keep all versions of an evolving table in memory. This raises the question of how to index such a table effectively. We depart from the classic indexing approach, where both current (i.e., live) and past (i.e., dead) data versions are indexed in the same data structure, and propose LIT, a hybrid index that decouples the management of the current and past states of the indexed column. LIT includes optimized indexing modules for dead and live records, which support efficient queries and updates, and gracefully combines them. We experimentally show that LIT is orders of magnitude faster than state-of-the-art temporal indices. Furthermore, we demonstrate that LIT uses space linear in the number of indexed record versions, making it suitable for main-memory temporal data management.
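
To make the idea of decoupling live and dead versions concrete, the following is a minimal illustrative sketch, not LIT's actual design. The class name `HybridTemporalIndex`, its methods, the closed-interval semantics, and the simple dictionary and list used as the two modules are all assumptions made for illustration; the abstract does not specify which structures LIT uses internally.

```python
from dataclasses import dataclass, field


@dataclass
class HybridTemporalIndex:
    """Toy hybrid index (hypothetical): live and dead versions are kept apart."""

    # Live module: record id -> start time of the still-open version.
    live: dict[str, int] = field(default_factory=dict)
    # Dead module: closed versions as (start, end, record id) tuples.
    dead: list[tuple[int, int, str]] = field(default_factory=list)

    def insert(self, rid: str, start: int) -> None:
        """Register a new live version that starts at `start`."""
        self.live[rid] = start

    def close(self, rid: str, end: int) -> None:
        """End a live version at `end` and move it to the dead module."""
        start = self.live.pop(rid)
        self.dead.append((start, end, rid))

    def query(self, q_start: int, q_end: int) -> list[str]:
        """Return ids of versions whose lifespan overlaps [q_start, q_end].

        Intervals are treated as closed on both ends (an assumption).
        A live version has no end yet, so only its start is tested.
        """
        hits = [rid for rid, s in self.live.items() if s <= q_end]
        hits += [rid for s, e, rid in self.dead if s <= q_end and e >= q_start]
        return sorted(hits)


idx = HybridTemporalIndex()
idx.insert("a", 1)
idx.insert("b", 3)
idx.close("a", 5)   # "a" becomes a dead version with lifespan [1, 5]
idx.insert("c", 8)
print(idx.query(4, 6))
```

In this sketch both modules are scanned linearly, so it shows only the separation of concerns: updates touch the small live structure, closed versions are appended to the dead structure, and a query combines the results of the two. Each version is stored exactly once, which keeps space linear in the number of versions.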

Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 1
SIGMOD
February 2024
1874 pages
EISSN: 2836-6573
DOI: 10.1145/3654807
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 March 2024
Published in PACMMOD Volume 2, Issue 1

Author Tags

  1. indexing
  2. query processing
  3. temporal data

Qualifiers

  • Research-article
