Multiway spatial joins

Published: 01 December 2001

Abstract

Due to the evolution of Geographical Information Systems, large collections of spatial data having various thematic contents are currently available. As a result, the interest of users is no longer limited to simple spatial selections and joins; complex query types that involve numerous spatial inputs are becoming more common. Although several algorithms have been proposed for computing the result of pairwise spatial joins, limited work exists on the processing and optimization of multiway spatial joins. In this article, we review pairwise spatial join algorithms and show how they can be combined for multiple inputs. In addition, we explore the application of synchronous traversal (ST), a methodology that processes all inputs synchronously without producing intermediate results. Then, we integrate the two approaches in an engine that includes ST and pairwise algorithms, using dynamic programming to determine the optimal execution plan. The results show that, in most cases, multiway spatial joins are best processed by combining ST with pairwise methods. Finally, we study the optimization of very large queries by employing randomized search algorithms.
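
To make the notion of a multiway spatial join concrete, the following is a minimal sketch under stated assumptions; it is not the article's ST algorithm or its pairwise algorithms. It assumes every object is reduced to an axis-aligned bounding rectangle, that the join predicate is rectangle overlap, and that the query is given as a list of edges between inputs. The names `intersects`, `pairwise_join`, and `multiway_join` are hypothetical. The backtracking routine only borrows one idea from the abstract: it extends partial tuples across all inputs and prunes early, so no intermediate join result is materialized. A real ST implementation would traverse R-trees rather than scan lists.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True when two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def pairwise_join(r: List[Rect], s: List[Rect]) -> List[Tuple[int, int]]:
    """Nested-loop pairwise join: index pairs (i, j) with r[i] overlapping s[j]."""
    return [(i, j)
            for i, a in enumerate(r)
            for j, b in enumerate(s)
            if intersects(a, b)]


def multiway_join(inputs: List[List[Rect]],
                  edges: List[Tuple[int, int]]) -> List[Tuple[int, ...]]:
    """Backtracking join over all inputs at once (illustrative only).

    `edges` lists the pairs of inputs (u, v), u != v, whose objects must
    overlap. Partial tuples are extended one input at a time and dropped
    as soon as a join condition fails.
    """
    results: List[Tuple[int, ...]] = []

    def extend(partial: List[int]) -> None:
        k = len(partial)
        if k == len(inputs):
            results.append(tuple(partial))
            return
        # Inputs already assigned that share a join condition with input k.
        linked = [u if v == k else v for u, v in edges if max(u, v) == k]
        for idx, rect in enumerate(inputs[k]):
            if all(intersects(rect, inputs[j][partial[j]]) for j in linked):
                partial.append(idx)
                extend(partial)
                partial.pop()

    extend([])
    return results


# Three toy inputs, each holding two rectangles.
A = [(0, 0, 2, 2), (5, 5, 6, 6)]
B = [(1, 1, 3, 3), (5.5, 5.5, 7, 7)]
C = [(2.5, 2.5, 4, 4), (6.5, 6.5, 8, 8)]

chain = multiway_join([A, B, C], [(0, 1), (1, 2)])           # A-B and B-C
clique = multiway_join([A, B, C], [(0, 1), (1, 2), (0, 2)])  # adds A-C
```

For the chain query, the same answer can be obtained by cascading two pairwise joins and matching them on the shared input, which is the kind of plan the abstract contrasts with processing all inputs together. The clique query adds a third condition that none of the toy tuples satisfies.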

Reviews

Raphael M. Malyankar

Mamoulis and Papadias motivate the work presented in this paper with the observation that complex queries for spatial databases may involve multiple inputs, and hence multiway joins. They adapt pairwise join algorithms for multiple inputs, discuss the role of synchronous traversal and of query plans that combine synchronous traversal with pairwise methods, and address query optimization using dynamic programming and randomized search algorithms. Evaluation is done with both synthetic data sets and actual geodata from selected datasets. The work and its exposition are detailed and solid, and the results facilitate useful advances in implementations of geodata processing. Skew in datasets is also addressed quite competently, though I would be interested in seeing more experiments with, and analysis of, the effects of skew on the performance of the algorithms.

Online Computing Reviews Service

Information

Published In

ACM Transactions on Database Systems, Volume 26, Issue 4
December 2001
135 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/503099
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Multiway joins
  2. query processing
  3. spatial joins

Qualifiers

  • Article

Cited By

  • (2024) Construct and Query A Fine-Grained Geospatial Knowledge Graph. Data Science and Engineering 9:2 (152-176). DOI: 10.1007/s41019-023-00237-4. Online publication date: 22-Jan-2024
  • (2024) Efficient processing of all neighboring object group queries with budget range constraint in road networks. Computing 106:5 (1359-1393). DOI: 10.1007/s00607-024-01260-7. Online publication date: 1-May-2024
  • (2023) ParSCL: A Parallel and Distributed Framework to Process All Nearest Neighbor Queries on a Road Network. IEEE Access 11 (94043-94056). DOI: 10.1109/ACCESS.2023.3308684. Online publication date: 2023
  • (2023) Scheduling distributed multiway spatial join queries: optimization models and algorithms. International Journal of Geographical Information Science 37:6 (1388-1419). DOI: 10.1080/13658816.2023.2170380. Online publication date: 6-Feb-2023
  • (2023) Construct Fine-Grained Geospatial Knowledge Graph. Database Systems for Advanced Applications. DASFAA 2023 International Workshops (267-282). DOI: 10.1007/978-3-031-35415-1_19. Online publication date: 17-Apr-2023
  • (2022) The Complexity of Boolean Conjunctive Queries with Intersection Joins. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (53-65). DOI: 10.1145/3517804.3524156. Online publication date: 12-Jun-2022
  • (2022) Intersection joins under updates. Journal of Computer and System Sciences 124:C (41-64). DOI: 10.1016/j.jcss.2021.09.004. Online publication date: 1-Mar-2022
  • (2022) Join optimization for inverted index technique on relational database management systems. Expert Systems with Applications 198 (116956). DOI: 10.1016/j.eswa.2022.116956. Online publication date: Jul-2022
  • (2022) Evaluating pattern matching queries for spatial databases. The VLDB Journal — The International Journal on Very Large Data Bases 28:5 (649-673). DOI: 10.1007/s00778-019-00550-3. Online publication date: 11-Mar-2022
  • (2022) Spatial Data Management. Online publication date: 2-Mar-2022
