A Subsequence Matching Algorithm that Supports Normalization Transform in Time-Series Databases

Published: 01 July 2004

Abstract

This paper proposes a subsequence matching algorithm that supports the normalization transform in time-series databases. The normalization transform makes it possible to find sequences with similar fluctuation patterns even when they are not close to each other before the transform is applied. Existing subsequence matching algorithms cannot simply be applied to support the normalization transform, since they hold no information for normalizing subsequences of arbitrary lengths. The existing whole matching algorithm that supports the normalization transform can be applied to subsequence matching, but it requires an index for every possible length of the query sequence, which causes serious overhead in both storage space and update time. The proposed algorithm builds indexes for only a small number of different query sequence lengths and, for each subsequence matching query, selects the most appropriate index among them. This approach is called index interpolation. It is formally proved that the proposed algorithm causes no false dismissal. Search performance can be traded off against storage space by adjusting the number of indexes: using more indexes yields better search performance. For performance evaluation, a series of experiments is conducted using indexes for only five different lengths out of the query sequence lengths 256 to 512. The results show that the proposed algorithm outperforms the sequential scan by up to 2.4 times on average when the selectivity of the query is 10⁻² and by up to 14.6 times when it is 10⁻⁵. Since the proposed algorithm performs better with smaller selectivities, it is suitable for practical situations, where queries with smaller selectivities are much more frequent.
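
To make the problem setting concrete, the following is a minimal sketch of the sequential-scan baseline that the abstract compares against. It is not the proposed index interpolation algorithm. It assumes that the normalization transform is the usual z-score (subtract the mean, divide by the standard deviation) and that similarity is Euclidean distance within a tolerance; the abstract states neither. The names `normalize`, `scan_normalized_matches`, and `epsilon` are illustrative only.

```python
import math


def normalize(seq):
    """Assumed normalization transform: subtract the mean, divide by the std. deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        # A constant sequence has no fluctuation; mapping it to zeros is a choice made here.
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scan_normalized_matches(data, query, epsilon):
    """Sequential scan: offsets whose normalized window is within epsilon of the normalized query."""
    m = len(query)
    normalized_query = normalize(query)
    hits = []
    for i in range(len(data) - m + 1):
        # Each window is normalized on its own, because its mean and deviation differ.
        if euclidean(normalize(data[i:i + m]), normalized_query) <= epsilon:
            hits.append(i)
    return hits


data = [10, 12, 14, 12, 10, 50, 60, 70, 60, 50, 3, 3, 3]
query = [1, 2, 3, 2, 1]
print(scan_normalized_matches(data, query, 1e-9))  # [0, 5]
```

The windows at offsets 0 and 5 are far from the query in raw values but have the same fluctuation pattern, so they match after normalization. The scan normalizes every window of the query's length, which is the cost an index is meant to avoid.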

Published In

Data Mining and Knowledge Discovery, Volume 9, Issue 1
July 2004
110 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. index interpolation
  2. normalization transform
  3. subsequence matching
  4. time-series databases
