
Autowarp: learning a warping distance from unlabeled time series using sequence autoencoders

Published: 03 December 2018

Abstract

Measuring similarities between unlabeled time series trajectories is an important problem in domains as diverse as medicine, astronomy, finance, and computer vision. It is often unclear which metric is appropriate because of the complex nature of noise in the trajectories (e.g., different sampling rates or outliers). Domain experts typically hand-craft or manually select a specific metric, such as dynamic time warping (DTW), to apply to their data. In this paper, we propose Autowarp, an end-to-end algorithm that optimizes and learns a good metric given unlabeled trajectories. We define a flexible and differentiable family of warping metrics, which encompasses common metrics such as DTW, Euclidean, and edit distance. Autowarp then leverages the representation power of sequence autoencoders to optimize for a member of this warping distance family. The output is a metric which is easy to interpret and can be robustly learned from relatively few trajectories. In systematic experiments across different domains, we show that Autowarp often outperforms hand-crafted trajectory similarity metrics.
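For background on the warping family the abstract describes, the following is a minimal sketch of classic dynamic time warping, one of the hand-crafted metrics (alongside Euclidean and edit distance) that the paper's family generalizes. This is the standard textbook recursion over 1-D sequences, not the learned Autowarp metric itself; the function name and absolute-difference local cost are illustrative choices.

```python
import math

def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    cost[i][j] holds the DTW distance between the prefixes a[:i] and b[:j];
    each step pays the local mismatch |a[i-1] - b[j-1]| plus the cheapest
    of the three admissible alignment moves.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]
```

Unlike a pointwise Euclidean distance, the warping allows sequences of different lengths or sampling rates to align: for example, `dtw([0.0, 0.0, 1.0], [0.0, 1.0])` is 0.0 because the repeated leading value can be absorbed by the alignment.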


Published In

NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems
December 2018, 11021 pages

Publisher

Curran Associates Inc.

Red Hook, NY, United States


