DOI: 10.1145/2452376.2452390
research-article

Processing multi-way spatial joins on map-reduce

Published: 18 March 2013

Abstract

In this paper we investigate the problem of processing multi-way spatial joins on a map-reduce platform. We look at two common spatial predicates: overlap and range. We address these two classes of join queries, discuss the challenges, and outline novel approaches for executing these queries on a map-reduce framework. We then discuss how to process join queries involving both overlap and range predicates. Specifically, we present the Controlled-Replicate framework, on which the approaches presented in this paper are built. The Controlled-Replicate framework is carefully engineered to minimize communication among cluster nodes. Through experimental evaluation we discuss the complexity of the problem under investigation and the details of the Controlled-Replicate framework, and demonstrate that the proposed approaches comfortably outperform naive approaches.
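
To make the two predicates concrete, the following is a minimal single-machine sketch of a brute-force three-way chain join. It is not the Controlled-Replicate framework and is not taken from the paper. It only illustrates the kind of naive baseline that a distributed approach would need to beat. The rectangle representation, the predicate definitions (closed-interval overlap, minimum Euclidean distance for range), and all names (Rect, overlaps, within_range, naive_chain_join) are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import hypot


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (assumed representation, not from the paper)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def overlaps(a: Rect, b: Rect) -> bool:
    # Two rectangles overlap when their x-extents and y-extents both intersect.
    # Touching edges count as overlap here (an assumption).
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def min_distance(a: Rect, b: Rect) -> float:
    # Gap along each axis; zero when the extents intersect on that axis.
    dx = max(b.xmin - a.xmax, a.xmin - b.xmax, 0.0)
    dy = max(b.ymin - a.ymax, a.ymin - b.ymax, 0.0)
    return hypot(dx, dy)


def within_range(a: Rect, b: Rect, d: float) -> bool:
    # Range predicate: the closest points of a and b are at most d apart.
    return min_distance(a, b) <= d


def naive_chain_join(r1, r2, r3, d):
    # Brute force over the full cross product:
    # keep (a, b, c) where a overlaps b and b is within distance d of c.
    return [(a, b, c)
            for a, b, c in product(r1, r2, r3)
            if overlaps(a, b) and within_range(b, c, d)]


if __name__ == "__main__":
    r1 = [Rect(0, 0, 2, 2)]
    r2 = [Rect(1, 1, 3, 3), Rect(10, 10, 11, 11)]
    r3 = [Rect(4, 1, 5, 2), Rect(20, 20, 21, 21)]
    for triple in naive_chain_join(r1, r2, r3, d=1.5):
        print(triple)
```

The cross product grows with the product of the three relation sizes, which is why a query like this is costly even before data has to be moved between cluster nodes.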

Published In

EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology
March 2013
793 pages
ISBN:9781450315975
DOI:10.1145/2452376

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

EDBT/ICDT '13

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%
