research-article

Fast augmentation algorithms for network kernel density visualization

Published: 01 May 2021

Abstract

Network kernel density visualization (NKDV) has been extensively used to visualize spatial data points in various domains, including traffic accident hotspot detection, crime hotspot detection, disease outbreak detection, and business and urban planning. Because of this wide range of applications, geographical software such as ArcGIS also supports the operation. However, computing NKDV is very time-consuming. Although NKDV has been used for more than a decade in different domains, existing algorithms do not scale to million-sized datasets. To address this issue, we propose three efficient methods in this paper, namely aggregate distance augmentation (ADA), interval augmentation (IA), and hybrid augmentation (HA), which significantly reduce the time complexity of computing NKDV. In our experiments, ADA, IA, and HA achieve at least a 5x to 10x speedup over state-of-the-art solutions.
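
To make the cost concrete, the following is a minimal sketch of a naive NKDV baseline, not the ADA, IA, or HA methods proposed in the paper. It rests on several assumptions that do not come from the abstract: events and query locations are snapped to network nodes, the kernel is triangular, K(d) = max(0, 1 - d/b) for bandwidth b, and the names bounded_dijkstra and naive_nkdv are hypothetical.

```python
import heapq
from collections import Counter


def bounded_dijkstra(adj, source, bandwidth):
    """Return shortest-path distances from source to nodes within bandwidth."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj.get(u, ()):
            nd = d + w
            # Prune anything beyond the bandwidth: its kernel value is zero.
            if nd <= bandwidth and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def naive_nkdv(adj, events, queries, bandwidth):
    """Sum a triangular kernel over network distances (assumed kernel choice)."""
    counts = Counter(events)  # number of events snapped to each node
    density = {}
    for q in queries:
        dist = bounded_dijkstra(adj, q, bandwidth)
        density[q] = sum(
            counts[node] * (1.0 - d / bandwidth)
            for node, d in dist.items()
            if node in counts
        )
    return density


# Toy path network A - B - C - D with unit-length edges (illustrative data).
adj = {
    "A": [("B", 1.0)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 1.0)],
    "D": [("C", 1.0)],
}
density = naive_nkdv(adj, events=["A", "A", "C"], queries=["B", "D"], bandwidth=2.0)
print(density)  # {'B': 1.5, 'D': 0.5}
```

In this sketch every query location triggers its own shortest-path search, so the total work grows with the number of query locations times the size of the network region reached within the bandwidth. That repeated search is one way to see why a straightforward evaluation becomes expensive on large datasets.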

Published In

Proceedings of the VLDB Endowment, Volume 14, Issue 9
May 2021
249 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Cited By

• (2024) Approximate kernel density estimation under metric-based local differential privacy. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, pages 4250-4270. DOI: 10.5555/3702676.3702876. Online publication date: 15-Jul-2024.
• (2024) LION: Fast and High-Resolution Network Kernel Density Visualization. Proceedings of the VLDB Endowment, 17:6, pages 1255-1268. DOI: 10.14778/3648160.3648168. Online publication date: 3-May-2024.
• (2023) Scalable Evaluation of Local K-Function for Radius-Accurate Hotspot Detection in Spatial Networks. Proceedings of the 31st ACM International Conference on Advances in Geographic Information Systems, pages 1-12. DOI: 10.1145/3589132.3625646. Online publication date: 13-Nov-2023.
• (2023) PyNKDV: An Efficient Network Kernel Density Visualization Library for Geospatial Analytic Systems. Companion of the 2023 International Conference on Management of Data, pages 99-102. DOI: 10.1145/3555041.3589711. Online publication date: 4-Jun-2023.
• (2023) Large-scale Geospatial Analytics: Problems, Challenges, and Opportunities. Companion of the 2023 International Conference on Management of Data, pages 21-29. DOI: 10.1145/3555041.3589401. Online publication date: 4-Jun-2023.
• (2022) Fast network k-function-based spatial analysis. Proceedings of the VLDB Endowment, 15:11, pages 2853-2866. DOI: 10.14778/3551793.3551836. Online publication date: 1-Jul-2022.
• (2022) SLAM: Efficient Sweep Line Algorithms for Kernel Density Visualization. Proceedings of the 2022 International Conference on Management of Data, pages 2120-2134. DOI: 10.1145/3514221.3517823. Online publication date: 10-Jun-2022.
• (2021) SWS. Proceedings of the VLDB Endowment, 15:4, pages 814-827. DOI: 10.14778/3503585.3503591. Online publication date: 1-Dec-2021.
• (2021) SAFE. Proceedings of the VLDB Endowment, 15:3, pages 513-526. DOI: 10.14778/3494124.3494135. Online publication date: 1-Nov-2021.
