Abstract
Early classification on multivariate time series has recently emerged as a novel and important topic in data mining, with wide applications such as the early detection of diseases in healthcare. Most existing studies on this topic focus only on univariate time series, while the few recent works that explore multivariate time series consider only numerical attributes and are not applicable to multivariate time series containing both numerical and categorical attributes. In this paper, we present a novel methodology named REACT (Reliable EArly ClassificaTion), which is the first work to address the problem of constructing an effective classifier on multivariate time series with numerical and categorical attributes in a serial manner, so as to guarantee stability of accuracy compared to classifiers that use full-length time series. Furthermore, we employ GPU parallel computing to develop an extended mechanism for building the early classifier efficiently. Experimental results on real datasets show that REACT significantly outperforms the state-of-the-art method in terms of accuracy and earliness, and that the GPU implementation improves efficiency by several orders of magnitude.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lin, YF., Chen, HH., Tseng, V.S., Pei, J. (2015). Reliable Early Classification on Multivariate Time Series with Numerical and Categorical Attributes. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science(), vol 9077. Springer, Cham. https://doi.org/10.1007/978-3-319-18038-0_16
DOI: https://doi.org/10.1007/978-3-319-18038-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18037-3
Online ISBN: 978-3-319-18038-0
eBook Packages: Computer Science (R0)