Spatial hash-joins

Published: 01 June 1996

Abstract

We examine how to apply the hash-join paradigm to spatial joins, and define a new framework for spatial hash-joins. Our spatial partition functions have two components: a set of bucket extents and an assignment function, which may map a data item into multiple buckets. Furthermore, the partition functions for the two input datasets may be different.

We have designed and tested a spatial hash-join method based on this framework. The partition function for the inner dataset is initialized by sampling the dataset, and evolves as data are inserted. The partition function for the outer dataset is immutable, but may replicate a data item from the outer dataset into multiple buckets. The method mirrors relational hash-joins in other aspects. Our method needs no pre-computed indices. It is therefore applicable to a wide range of spatial joins.

Our experiments show that our method outperforms current spatial join algorithms based on tree matching by a wide margin. Further, its performance is superior even when the tree-based methods have pre-computed indices. This makes the spatial hash-join method highly competitive both when the input datasets are dynamically generated and when the datasets have pre-computed indices.
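The framework described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the rectangle representation, the function names (`partition_inner`, `partition_outer`, `spatial_hash_join`), the evenly spaced "sample" used to seed bucket extents, and the least-enlargement assignment rule are all assumptions chosen for illustration. The sketch shows only the two ideas stated above: inner bucket extents start from a sample and grow as items are inserted, and each outer item is replicated into every bucket whose (now fixed) extent it overlaps.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles overlap (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])


def partition_inner(inner: List[Rect], num_buckets: int):
    """Seed extents from a sample, then let them evolve during insertion.

    Assumption: the sample is an evenly spaced subset, and each item goes
    to the single bucket whose extent needs the least enlargement.
    Requires num_buckets >= 1.
    """
    step = max(1, len(inner) // num_buckets)
    extents = list(inner[::step][:num_buckets])
    buckets: List[List[Rect]] = [[] for _ in extents]
    for r in inner:
        i = min(
            range(len(extents)),
            key=lambda j: area(union(extents[j], r)) - area(extents[j]),
        )
        extents[i] = union(extents[i], r)  # the extent grows to cover r
        buckets[i].append(r)
    return extents, buckets


def partition_outer(outer: List[Rect], extents: List[Rect]):
    """Immutable extents; an item is replicated into every overlapping bucket."""
    buckets: List[List[Rect]] = [[] for _ in extents]
    for s in outer:
        for i, e in enumerate(extents):
            if intersects(s, e):
                buckets[i].append(s)
    return buckets


def spatial_hash_join(inner: List[Rect], outer: List[Rect], num_buckets: int):
    """Join corresponding bucket pairs and return intersecting (inner, outer) pairs."""
    extents, inner_buckets = partition_inner(inner, num_buckets)
    outer_buckets = partition_outer(outer, extents)
    result = []
    for rs, ss in zip(inner_buckets, outer_buckets):
        result.extend((r, s) for r in rs for s in ss if intersects(r, s))
    return result


inner = [(0, 0, 1, 1), (5, 5, 6, 6), (0.5, 0.5, 2, 2)]
outer = [(0.8, 0.8, 5.5, 5.5), (10, 10, 11, 11)]
pairs = spatial_hash_join(inner, outer, num_buckets=2)
```

In this sketch every inner item lives in exactly one bucket, so no result pair is reported twice even though outer items are replicated. An outer item that overlaps no bucket extent is dropped during partitioning, since it cannot intersect any inner item.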

Published In

ACM SIGMOD Record, Volume 25, Issue 2
June 1996
557 pages
ISSN: 0163-5808
DOI: 10.1145/235968
  • SIGMOD '96: Proceedings of the 1996 ACM SIGMOD international conference on Management of data
    June 1996
    560 pages
    ISBN: 0897917944
    DOI: 10.1145/233269
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

