DOI: 10.1145/1458082.1458187
research-article

Fast correlation analysis on time series datasets

Published: 26 October 2008

Abstract

There has been increasing interest in efficient techniques for fast correlation analysis of time series data in different application domains. We present three algorithms for (1) bivariate correlation queries, (2) multivariate correlation queries, and (3) correlation queries based on a new correlation measure we introduce using dynamic time warping. To support these algorithms, we use a variant of the Compact Multi-Resolution Index (CMRI). In addition to the conventional nearest neighbor and range queries supported by CMRI, the proposed algorithms compute all answers to user-defined, ad hoc, and parametric correlation queries. The results of our experiments indicate a speed-up of two orders of magnitude over the brute force algorithm, and an order of magnitude improvement on average, while offering more functionality than existing techniques such as StatStream and the Spatial Cone Tree.
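
For orientation, the brute force baseline mentioned in the abstract can be sketched roughly as follows. This is not the authors' code and not the CMRI-based method. It is a minimal illustration that assumes a bivariate correlation query asks for every pair of equal-length series whose absolute Pearson correlation meets a user-supplied threshold. The function names `pearson` and `brute_force_correlated_pairs` and the query form itself are assumptions made for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def brute_force_correlated_pairs(series, threshold):
    """Compare every pair of series (assumed query form: |r| >= threshold).

    Returns a list of (i, j, r) tuples with i < j. The cost grows
    quadratically with the number of series, which is the kind of
    exhaustive scan an index is meant to avoid.
    """
    result = []
    for i, j in combinations(range(len(series)), 2):
        r = pearson(series[i], series[j])
        if abs(r) >= threshold:
            result.append((i, j, r))
    return result


if __name__ == "__main__":
    data = [
        [1, 2, 3, 4],
        [2, 4, 6, 8],
        [4, 3, 2, 1],
        [1, 3, 2, 4],
    ]
    for i, j, r in brute_force_correlated_pairs(data, 0.9):
        print(i, j, round(r, 3))
```

Every pair is examined exactly once, so the work grows with the square of the number of series times their length. An index-based approach aims to prune most of these comparisons.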


Cited By

  • (2017) Correlation analysis techniques for uncertain time series. Knowledge and Information Systems 50:1 (79-116). DOI: 10.1007/s10115-016-0939-7. Online publication date: 1-Jan-2017
  • (2013) Financial Time Series Processing: A Roadmap of Online and Offline Methods. Business Intelligence and Performance Management (145-162). DOI: 10.1007/978-1-4471-4866-1_10. Online publication date: 2013
  • (2010) Shape-based indexing scheme for camera view invariant 3-D object retrieval. Multimedia Tools and Applications 47:1 (7-29). DOI: 10.1007/s11042-009-0404-7. Online publication date: 1-Mar-2010

    Published In

    CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management
    October 2008
    1562 pages
    ISBN:9781595939913
    DOI:10.1145/1458082

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. correlation analysis
    2. dynamically time warped correlation
    3. multi-resolution index
    4. time series

    Qualifiers

    • Research-article

    Conference

    CIKM08
    CIKM08: Conference on Information and Knowledge Management
    October 26 - 30, 2008
    Napa Valley, California, USA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
