research-article
Open access

A Unified Framework for Robust and Efficient Hotspot Detection in Smart Cities

Published: 14 September 2020

Abstract

    Given N geo-located point instances (e.g., crime or disease cases) in a spatial domain, we aim to detect sub-regions (i.e., hotspots) that have a higher probability density of generating such instances than the rest of the domain. Hotspot detection is widely used in important urban applications, including public safety, public health, urban planning, and equity. The problem is challenging because its societal applications often have low tolerance for false positives and require significance testing that is computationally intensive. In related work, the spatial scan statistic introduced a likelihood ratio-based framework for hotspot evaluation and significance testing. However, it fails to consider the effect of spatial non-determinism, causing many missed detections. Our previous work introduced a non-deterministic normalization-based scan statistic to mitigate this issue. However, its robustness against false positives is not reliably controlled. To address these limitations, we propose a unified framework that improves the completeness of results without incurring more false positives. We also propose a reduction algorithm to improve computational efficiency. Experimental results confirm that the unified framework can greatly improve the recall of hotspot detection without increasing the number of false positives, and that the reduction algorithm can greatly reduce execution time.
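
To make the baseline concrete, the following is a minimal sketch of the classic likelihood ratio-based spatial scan statistic with Monte Carlo significance testing that the abstract refers to as related work. It is not the unified framework or the reduction algorithm proposed in the article. The function names, the unit-square study area, the choice of circles centered on data points, and the fact that window areas are not clipped at the domain boundary are all assumptions made for illustration.

```python
import math
import random


def log_likelihood_ratio(c, e, total):
    """Poisson log-likelihood ratio of a window with c observed and e expected points.

    Returns 0 when the window is not denser than expected (c <= e).
    """
    if e <= 0 or c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total > c:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def best_window(points, radii, side=1.0):
    """Scan circles centered on each point; return (max LLR, center, radius)."""
    n = len(points)
    best = (0.0, None, None)
    for cx, cy in points:
        for r in radii:
            # Observed count inside the circle.
            c = sum(1 for x, y in points if (x - cx) ** 2 + (y - cy) ** 2 <= r * r)
            # Expected count under a uniform null (assumption: no edge clipping).
            e = n * min(math.pi * r * r / (side * side), 1.0)
            score = log_likelihood_ratio(c, e, n)
            if score > best[0]:
                best = (score, (cx, cy), r)
    return best


def monte_carlo_p_value(points, radii, trials=99, side=1.0, seed=0):
    """Estimate the p-value of the best window by simulating the uniform null."""
    rng = random.Random(seed)
    observed = best_window(points, radii, side)[0]
    n = len(points)
    exceed = 0
    for _ in range(trials):
        sim = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
        if best_window(sim, radii, side)[0] >= observed:
            exceed += 1
    # Rank-based p-value: the observed data counts as one of the trials.
    return (exceed + 1) / (trials + 1)
```

Each Monte Carlo trial repeats the full window enumeration, which is why the abstract describes significance testing as computationally intensive: the cost of one scan is multiplied by the number of trials.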


    Cited By

    • (2022) Statistically-Robust Clustering Techniques for Mapping Spatial Hotspots: A Survey. ACM Computing Surveys 55:2 (1-38). DOI: 10.1145/3487893. Online publication date: 18-Jan-2022
    • (2021) Significant DBSCAN+: Statistically Robust Density-based Clustering. ACM Transactions on Intelligent Systems and Technology 12:5 (1-26). DOI: 10.1145/3474842. Online publication date: 24-Nov-2021


    Published In

    ACM/IMS Transactions on Data Science  Volume 1, Issue 3
    Special Issue on Urban Computing and Smart Cities
    August 2020
    217 pages
    ISSN:2691-1922
    DOI:10.1145/3424342
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 September 2020
    Online AM: 07 May 2020
    Accepted: 01 December 2019
    Revised: 01 November 2019
    Received: 01 June 2019
    Published in TDS Volume 1, Issue 3


    Author Tags

    1. Unified framework
    2. hotspot
    3. smart cities

    Qualifiers

    • Research-article
    • Research
    • Refereed
