Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3274895.3274949acmconferencesArticle/Chapter ViewAbstractPublication PagesgisConference Proceedingsconference-collections
research-article

Scalable hybrid similarity join over geolocated time series

Published: 06 November 2018 Publication History
  • Get Citation Alerts
  • Abstract

    A geolocated time series is a sequence of values associated with a geolocation, such as measurements provided by a sensor installed at a certain location. In this paper, we address the problem of hybrid similarity joins over such geolocated time series. This operation returns all pairs of geolocated time series that exhibit similar behavior in the time series domain while also being closely located in space. First, we propose algorithms for performing such join operations using different types of indices, including spatial-only, time series-only, and hybrid indices. Such centralized indexing schemes can cope well with moderate data volumes but they face scalability issues when the dataset size increases significantly. To overcome this problem, we present a MapReduce-based processing scheme with space-driven partitioning. Our parallel and distributed algorithm leverages our hybrid index for geolocated time series to efficiently execute similarity joins locally within each partition and minimize the amount of data that needs to be shuffled between processing nodes. An extensive experimental evaluation confirms that our approach can efficiently compute all matching pairs even for datasets containing millions of geolocated time series.

    References

    [1]
    N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An efficient and robust access method for points and rectangles. In SIGMOD, pages 322--331, 1990.
    [2]
    P. Bouros, S. Ge, and N. Mamoulis. Spatio-textual similarity joins. PVLDB, 6(1):1--12, 2012.
    [3]
    T. Brinkhoff, H. Kriegel, and B. Seeger. Efficient processing of spatial joins using R-trees. In SIGMOD, pages 237--246, 1993.
    [4]
    A. Camerra, T. Palpanas, J. Shieh, and E. J. Keogh. iSAX 2.0: Indexing and mining one billion time series. In ICDM, pages 58--67, 2010.
    [5]
    A. Camerra, J. Shieh, T. Palpanas, T. Rakthanmanon, and E. J. Keogh. Beyond one billion time series: indexing and mining very large time series collections with i SAX2+. Knowl. Inf. Syst., 39(1):123--151, 2014.
    [6]
    K. Chan and A. W. Fu. Efficient time series matching by wavelets. In ICDE, pages 126--133, 1999.
    [7]
    G. Chatzigeorgakidis, D. Skoutas, K. Patroumpas, S. Athanasiou, and S. Skiadopoulos. Indexing geolocated time series data. In SIGSPATIAL, pages 19:1--19:10, 2017.
    [8]
    H. Ding, G. Trajcevski, P. Scheuermann, X. Wang, and E. J. Keogh. Querying and mining of time series data: experimental comparison of representations and distance measures. PVLDB, 1(2):1542--1552, 2008.
    [9]
    D. Q. Goldin and P. C. Kanellakis. On similarity queries for time-series data: Constraint specification and implementation. In CP, pages 137--153, 1995.
    [10]
    A. Guttman. R-trees: A dynamic index structure for spatial searching. In SIGMOD, pages 47--57, 1984.
    [11]
    E. J. Keogh, K. Chakrabarti, M. J. Pazzani, and S. Mehrotra. Dimensionality reduction for fast similarity search in large time series databases. Knowl. Inf. Syst., 3(3):263--286, 2001.
    [12]
    J. Lin, E. J. Keogh, L. Wei, and S. Lonardi. Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Discov., 15(2):107--144, 2007.
    [13]
    T. Palpanas. Big sequence management: A glimpse of the past, the present, and the future. In SOFSEM, pages 63--80, 2016.
    [14]
    D. Papadias, N. Mamoulis, and Y. Theodoridis. Processing and optimization of multiway spatial joins using r-trees. In SIGMOD, pages 44--55, 1999.
    [15]
    I. Popivanov and R. J. Miller. Similarity search over time-series data using wavelets. In ICDE, pages 212--221, 2002.
    [16]
    S. Qi, P. Bouros, and N. Mamoulis. Efficient top-k spatial distance joins. In SSTD, pages 1--18, 2013.
    [17]
    N. Roussopoulos, S. Kelley, and F. Vincent. Nearest neighbor queries. In SIGMOD, pages 71--79, 1995.
    [18]
    J. Shieh and E. J. Keogh. iSAX: indexing and mining terabyte sized time series. In SIGKDD, pages 623--631, 2008.
    [19]
    B. Yi and C. Faloutsos. Fast time sequence indexing for arbitrary Lp norms. In VLDB, pages 385--394, 2000.
    [20]
    Y. Zhang, Y. Ma, and X. Meng. Efficient spatio-textual similarity join using mapreduce. In WI, pages 52--59, 2014.
    [21]
    K. Zoumpatianos, S. Idreos, and T. Palpanas. Indexing for interactive exploration of big data series. In SIGMOD, pages 1555--1566, 2014.

    Cited By

    View all
    • (2023)Similarity join over multiple time series under Dynamic Time WarpingCTU Journal of Innovation and Sustainable Development10.22144/ctujoisd.2023.02915:ISDS(1-11)Online publication date: 16-Oct-2023
    • (2022)Real-Time Visual Analytics for Air QualityIntegrating Artificial Intelligence and Visualization for Visual Knowledge Discovery10.1007/978-3-030-93119-3_19(485-515)Online publication date: 5-Jun-2022
    • (2020)Projection Based Large Scale High-Dimensional Data Similarity Join Using MapReduce FrameworkIEEE Access10.1109/ACCESS.2020.30070288(121665-121677)Online publication date: 2020
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGSPATIAL '18: Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
    November 2018
    655 pages
    ISBN:9781450358897
    DOI:10.1145/3274895
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 November 2018

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. big data
    2. similarity joins
    3. spatial indices
    4. time series

    Qualifiers

    • Research-article

    Funding Sources

    • General Secretariat for Research and Technology (GSRT)
    • Hellenic Foundation for Research and Innovation (HFRI)
    • EU H2020 project SLIPO (731581)
    • NSRF 2014-2020 project HELIX (5002781)

    Conference

    SIGSPATIAL '18
    Sponsor:

    Acceptance Rates

    SIGSPATIAL '18 Paper Acceptance Rate 30 of 150 submissions, 20%;
    Overall Acceptance Rate 220 of 1,116 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)4
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 26 Jul 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2023)Similarity join over multiple time series under Dynamic Time WarpingCTU Journal of Innovation and Sustainable Development10.22144/ctujoisd.2023.02915:ISDS(1-11)Online publication date: 16-Oct-2023
    • (2022)Real-Time Visual Analytics for Air QualityIntegrating Artificial Intelligence and Visualization for Visual Knowledge Discovery10.1007/978-3-030-93119-3_19(485-515)Online publication date: 5-Jun-2022
    • (2020)Projection Based Large Scale High-Dimensional Data Similarity Join Using MapReduce FrameworkIEEE Access10.1109/ACCESS.2020.30070288(121665-121677)Online publication date: 2020
    • (2019)Local Similarity Search on Geolocated Time Series Using Hybrid IndexingProceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems10.1145/3347146.3359349(179-188)Online publication date: 5-Nov-2019

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media