Abstract
We study the use of two well-known space partitioning and indexing techniques, kd-trees and quad-trees, for declustering, with the aim of increasing input/output (I/O) parallelism and reducing spatial data processing time. This parallelism allows time-consuming computational geometry algorithms to be applied efficiently to the rendering and querying of big spatial data. The key challenge is balancing the spatial processing load across a large number of worker nodes, given significant performance heterogeneity among nodes and processing skew in the workload.
Additional information
ORCID: Ahmet SAYAR, http://orcid.org/0000-0002-6335-459X; Süleyman EKEN, http://orcid.org/0000-0001-9488-908X
About this article
Cite this article
Sayar, A., Eken, S. & Öztürk, O. Kd-tree and quad-tree decompositions for declustering of 2D range queries over uncertain space. Frontiers Inf Technol Electronic Eng 16, 98–108 (2015). https://doi.org/10.1631/FITEE.1400165
DOI: https://doi.org/10.1631/FITEE.1400165