DOI: 10.1145/3274895.3274936

Evaluating spatial-keyword queries on streaming data

Published: 06 November 2018

    Abstract

    This paper presents an extensive experimental evaluation of spatial-keyword index structures on streaming data. We extend existing snapshot spatial-keyword queries with a temporal dimension to serve streaming data applications effectively. The major index structures are then equipped with efficient query processing techniques and evaluated on the extended queries. The evaluation takes a system-building perspective, giving system builders insight into supporting scalable spatial-keyword queries on fast data streams, e.g., social media streams and news streams. In particular, we decompose existing spatial-keyword index structures into four major building blocks that are commonly supported at the system level. Ten index structures are then composed as combinations of these four building blocks. All ten indexes reside entirely in main memory and are evaluated on real datasets and query locations. Index performance is measured in terms of real-time data digestion rate, main-memory footprint, and query latency. The results show the relative performance gains of both basic and hybrid index structures and offer insights from a system point of view.
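    To make the idea of a spatial-keyword query extended with a temporal dimension concrete, the following is a minimal sketch of one possible in-memory hybrid index. The abstract does not name the four building blocks or the ten composed indexes, so everything here is an assumption chosen for illustration: the uniform grid, the per-cell inverted index, and the names `Post`, `GridKeywordIndex`, `insert`, and `query` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """A geo-tagged, timestamped text object (hypothetical record layout)."""
    post_id: int
    x: float
    y: float
    timestamp: int
    keywords: frozenset


class GridKeywordIndex:
    """Assumed hybrid index: a uniform grid whose cells each hold an inverted index."""

    def __init__(self, cell_size: float = 1.0):
        self.cell_size = cell_size
        # (cell_x, cell_y) -> keyword -> list of posts in arrival order
        self.cells = defaultdict(lambda: defaultdict(list))

    def _cell(self, x: float, y: float):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, post: Post) -> None:
        """Digest one streamed post by appending it to its cell's posting lists."""
        postings = self.cells[self._cell(post.x, post.y)]
        for keyword in post.keywords:
            postings[keyword].append(post)

    def query(self, x_min, y_min, x_max, y_max, keywords, t_start, t_end):
        """Return sorted ids of posts that lie in the box, fall in the time
        window [t_start, t_end], and contain every query keyword."""
        keywords = frozenset(keywords)
        if not keywords:
            return []
        probe = next(iter(keywords))  # scan one posting list, verify the rest
        cx0, cy0 = self._cell(x_min, y_min)
        cx1, cy1 = self._cell(x_max, y_max)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                cell = self.cells.get((cx, cy))
                if cell is None:
                    continue
                for post in cell.get(probe, ()):
                    if (x_min <= post.x <= x_max
                            and y_min <= post.y <= y_max
                            and t_start <= post.timestamp <= t_end
                            and keywords <= post.keywords):
                        hits.append(post.post_id)
        return sorted(hits)


index = GridKeywordIndex(cell_size=1.0)
index.insert(Post(1, 0.5, 0.5, 100, frozenset({"flood", "rescue"})))
index.insert(Post(2, 0.7, 0.2, 250, frozenset({"flood"})))
index.insert(Post(3, 5.0, 5.0, 120, frozenset({"flood", "rescue"})))
print(index.query(0.0, 0.0, 1.0, 1.0, {"flood"}, 0, 200))  # prints [1]
```

    In this sketch the temporal predicate is checked per candidate after the spatial and keyword filters. Where the temporal filter sits, and how each block is ordered, is exactly the kind of design choice an evaluation of digestion rate, memory footprint, and query latency would compare.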

    Published In

    SIGSPATIAL '18: Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
    November 2018
    655 pages
    ISBN: 9781450358897
    DOI: 10.1145/3274895
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. geo-textual
    2. real-time
    3. spatial
    4. temporal query processing

    Qualifiers

    • Research-article

    Conference

    SIGSPATIAL '18
    Acceptance Rates

    SIGSPATIAL '18 Paper Acceptance Rate 30 of 150 submissions, 20%;
    Overall Acceptance Rate 220 of 1,116 submissions, 20%

    Cited By

    • (2024) SkyEye: continuous processing of moving spatial-keyword queries over moving objects. GeoInformatica. DOI: 10.1007/s10707-024-00512-0. Online publication date: 20-Mar-2024
    • (2024) Keeping an eye on moving objects: processing continuous spatial-keyword range queries. GeoInformatica, 28:1 (117-143). DOI: 10.1007/s10707-023-00499-0. Online publication date: 1-Jan-2024
    • (2023) Indexing for Keyword Search with Structured Constraints. Proceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (263-275). DOI: 10.1145/3584372.3588663. Online publication date: 18-Jun-2023
    • (2023) Approximate Reverse Top-k Spatial-Keyword Queries. 2023 24th IEEE International Conference on Mobile Data Management (MDM) (96-105). DOI: 10.1109/MDM58254.2023.00026. Online publication date: Jul-2023
    • (2023) FogLBS. Pervasive and Mobile Computing, 94:C. DOI: 10.1016/j.pmcj.2023.101832. Online publication date: 18-Oct-2023
    • (2022) Advanced Conjunctive Boolean Streaming Spatial Keyword Processing. 2022 23rd IEEE International Conference on Mobile Data Management (MDM) (41-43). DOI: 10.1109/MDM55031.2022.00027. Online publication date: Jun-2022
    • (2022) Learning to Process Topic Aware Queries on Geo-Textual Streaming Data. 2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) (443-450). DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00063. Online publication date: Dec-2022
    • (2021) A parametric approximation algorithm for spatial group keyword queries. Intelligent Data Analysis, 25:2 (305-319). DOI: 10.3233/IDA-195071. Online publication date: 4-Mar-2021
    • (2021) Temporal Geo-Social Personalized Keyword Search Over Streaming Data. ACM Transactions on Spatial Algorithms and Systems, 7:4 (1-28). DOI: 10.1145/3473006. Online publication date: 16-Aug-2021
    • (2021) LATEST: Learning-Assisted Selectivity Estimation Over Spatio-Textual Streams. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (1607-1618). DOI: 10.1109/ICDE51399.2021.00142. Online publication date: Apr-2021
