DOI: 10.1145/2064969.2064971
research-article

Similarity matching for uncertain time series: analytical and experimental comparison

Published: 01 November 2011

Abstract

In recent years, the availability of continuous sensor measurements has increased considerably across a wide range of application domains, such as Location-Based Services (LBS), medical monitoring systems, manufacturing plants and engineering facilities (where measurements help ensure efficiency, product quality, and safety), hydrologic and geologic observing systems, pollution management, and others.
Due to the inherent imprecision of sensor observations, many investigations have recently turned to querying, mining, and storing uncertain data. Uncertainty can also arise from data aggregation, privacy-preserving transforms, and error-prone mining algorithms.
In this study, we survey the techniques that have been proposed specifically for modeling and processing uncertain time series, an important model for temporal data. We provide an analytical evaluation of the alternatives proposed in the literature, highlighting the advantages and disadvantages of each approach. We additionally conduct an extensive experimental evaluation with 17 real datasets and discuss some surprising results. Based on our evaluations, we also provide guidelines useful for practitioners in the field.



    Published In

    QUeST '11: Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Querying and Mining Uncertain Spatio-Temporal Data
    November 2011
    42 pages
    ISBN:9781450310376
    DOI:10.1145/2064969


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. distance measure
    2. similarity
    3. time series
    4. uncertain data

    Qualifiers

    • Research-article

    Conference

    GIS '11

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 7
    • Downloads (Last 6 weeks): 0
    Reflects downloads up to 03 Sep 2024


    Citations

    Cited By

    • (2020) Exploring of the summer monsoon rainfall around the Himalayas in time domain through maximization of Shannon entropy. Theoretical and Applied Climatology. DOI: 10.1007/s00704-020-03186-4. Online publication date: 6-Apr-2020
    • (2020) Linking IT Product Records. Machine Learning and Knowledge Discovery in Databases, pages 101-111. DOI: 10.1007/978-3-030-43887-6_9. Online publication date: 28-Mar-2020
    • (2017) Data Series Similarity Using Correlation-Aware Measures. Proceedings of the 29th International Conference on Scientific and Statistical Database Management, pages 1-12. DOI: 10.1145/3085504.3085515. Online publication date: 27-Jun-2017
    • (2016) Approaches of Handling Uncertain Time Series Data towards Prediction. International Journal of Future Computer and Communication, 5:6, pages 233-236. DOI: 10.18178/ijfcc.2016.5.6.477. Online publication date: 2016
    • (2013) Uncertain Time Series in Weather Prediction. Procedia Technology, 11, pages 557-564. DOI: 10.1016/j.protcy.2013.12.228. Online publication date: 2013
    • (2012) A probabilistic approach to correlation queries in uncertain time series data. Proceedings of the 21st ACM international conference on Information and knowledge management, pages 2229-2233. DOI: 10.1145/2396761.2398607. Online publication date: 29-Oct-2012
    • (2012) Scalable similarity matching in streaming time series. Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II, pages 218-230. DOI: 10.1007/978-3-642-30220-6_19. Online publication date: 29-May-2012
