Abstract
Existing learned indexes are difficult to adjust dynamically as data change. To address this problem, this paper presents LIDUSA, a learned index structure for dynamical uneven spatial data. To address poor KNN query performance in sparse regions, LIDUSA dynamically adjusts the data layout by merging and splitting the corresponding grid cells, and relearns the mapping function of the affected region so that data points stored in adjacent sparse grid cells are also stored in neighboring disk pages. It thus combines the advantage of tree-shaped indexes, which can be adjusted dynamically, with that of learned indexes. Extensive experiments are conducted on a real-world dataset and synthetic datasets. The results show that LIDUSA is twice as fast as other existing indexes for KNN queries, which greatly extends the applicable scope of learned indexes.
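The merge-and-split adjustment can be pictured with a minimal sketch. It is an illustration under simplifying assumptions, not the authors' implementation: cells are one-dimensional intervals along x, each resulting cell stands for one disk page, and the names `Cell`, `split_dense`, `merge_sparse`, `rebalance` and the parameter `capacity` are hypothetical. The relearning of the mapping function is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Cell:
    """Hypothetical grid cell covering the x-interval [lo, hi)."""
    lo: float
    hi: float
    points: List[Point] = field(default_factory=list)


def split_dense(cell: Cell, capacity: int) -> List[Cell]:
    """Halve a cell at its median x until each part fits in one page."""
    if len(cell.points) <= capacity:
        return [cell]
    xs = sorted(p[0] for p in cell.points)
    cut = xs[len(xs) // 2]
    left = [p for p in cell.points if p[0] < cut]
    right = [p for p in cell.points if p[0] >= cut]
    if not left:
        # Degenerate case: the median equals the smallest x, so this simple
        # cut cannot separate the points. The sketch leaves the cell as is.
        return [cell]
    return (split_dense(Cell(cell.lo, cut, left), capacity)
            + split_dense(Cell(cut, cell.hi, right), capacity))


def merge_sparse(cells: List[Cell], capacity: int) -> List[Cell]:
    """Greedily merge neighboring cells while their points fit in one page."""
    merged: List[Cell] = []
    for cell in cells:
        if merged and len(merged[-1].points) + len(cell.points) <= capacity:
            prev = merged[-1]
            merged[-1] = Cell(prev.lo, cell.hi, prev.points + cell.points)
        else:
            merged.append(cell)
    return merged


def rebalance(cells: List[Cell], capacity: int) -> List[Cell]:
    """Split overfull cells, then merge sparse neighbors (cells sorted by lo)."""
    parts = [part for c in cells for part in split_dense(c, capacity)]
    return merge_sparse(parts, capacity)
```

In this sketch, points from adjacent sparse cells end up in the same merged cell and therefore on the same page, which is the locality property the abstract describes for KNN queries in sparse regions.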
Acknowledgement
This work is partially supported by the Science and Technology Planning Project of Fujian Province under Grant No. 2020H0023.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Wang, Y., Zhu, S. (2022). LIDUSA – A Learned Index Structure for Dynamical Uneven Spatial Data. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol 13157. Springer, Cham. https://doi.org/10.1007/978-3-030-95391-1_46
DOI: https://doi.org/10.1007/978-3-030-95391-1_46
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95390-4
Online ISBN: 978-3-030-95391-1
eBook Packages: Computer Science, Computer Science (R0)