Abstract
Functional data analysis has recently been applied successfully to high-dimensional data classification. In this paper, we present a classification framework that combines functional data with classwise Principal Component Analysis (PCA). The method is designed for high-dimensional time series data, which typically suffer from the small sample size problem. It extracts a piecewise linear functional feature space and is particularly suitable for hard classification problems. The framework converts time series data into functional data, uses classwise functional PCA for feature extraction, and then classifies with a Bayesian linear classifier. We demonstrate its efficacy on both synthetic data sets and real time series data from diverse fields, including neuroscience, food science, medical sciences and chemometrics.
Availability of data and materials
Available on request.
Code availability
Notes
A set of functions \(A=\{f_1,f_2,\ldots , f_n\}\) is said to be linearly independent (LI) if the equation \(c_1f_1+c_2f_2+\ldots +c_nf_n=\textbf{0}\), where the \(c_i\) are scalars and \(\textbf{0}\) denotes the zero function, has only the trivial solution \(c_1=c_2=\ldots =c_n=0\) (Kreyszig 1991). A set that is not LI is called linearly dependent (LD). Hence a linearly dependent set of functions contains at least one function \(f_j\) that can be written as a linear combination of the other elements of the set. The Gram–Schmidt orthonormalization process is usually applied to an LI set. If we apply it to an LD set, then some \(g_j\), for \(j\in \{1, \ldots ,n\}\), becomes the zero function, which cannot be normalized, and hence we cannot obtain an orthonormal set.
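The behaviour described in this note can be checked numerically. The following is a minimal sketch and not the authors' implementation: it assumes functions sampled on a uniform grid over [0, 1], an \(L^2\) inner product approximated by the trapezoidal rule, and a hypothetical tolerance `tol` for deciding when \(g_j\) is numerically the zero function. The helper names `inner` and `gram_schmidt` are illustrative only.

```python
import numpy as np

def inner(f, g, t):
    """Approximate the L2 inner product of two sampled functions (trapezoidal rule)."""
    h = f * g
    return float(np.sum((h[1:] + h[:-1]) * np.diff(t)) / 2.0)

def gram_schmidt(funcs, t, tol=1e-10):
    """Orthonormalize sampled functions.

    Returns the orthonormal functions and the indices j whose
    orthogonalized function g_j collapsed to (numerically) zero.
    """
    basis, dropped = [], []
    for j, f in enumerate(funcs):
        g = np.asarray(f, dtype=float).copy()
        for e in basis:
            g -= inner(g, e, t) * e      # remove the component along e
        norm = np.sqrt(inner(g, g, t))
        if norm < tol:
            dropped.append(j)            # g_j is the zero function: cannot normalize
        else:
            basis.append(g / norm)
    return basis, dropped

t = np.linspace(0.0, 1.0, 2001)
f1 = np.ones_like(t)
f2 = t.copy()
f3 = 2.0 * f1 - 3.0 * f2                 # linear combination of f1 and f2, so the set is LD

basis, dropped = gram_schmidt([f1, f2, f3], t)
print(len(basis), dropped)               # 2 orthonormal functions; index 2 (f3) collapses
```

In this sketch the dependent function is skipped rather than normalized, so the output is an orthonormal set spanning the same space as the LI subset \(\{f_1, f_2\}\). The threshold `tol` is an assumption and would need tuning for noisy or coarsely sampled curves.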
References
Acal C, Aguilera AM (2022) Basis expansion approaches for functional analysis of variance with repeated measures. Adv Data Anal Classif:1–31
Agrawal R, Faloutsos C, Swami A (1993) Efficient similarity search in sequence databases. In: International conference on foundations of data organization and algorithms. Springer, pp 69–84
Aguilera AM, Escabias M (2000) Principal component logistic regression. In: COMPSTAT. Springer, pp 175–180
Alpaydin E (2021) Machine learning. MIT Press, Cambridge
Bagnall A, Lines J, Hills J, Bostrom A (2015) Time-series classification with cote: the collective of transformation-based ensembles. IEEE Trans Knowl Data Eng 27(9):2522–2535
Bagnall A, Lines J, Bostrom A, Large J, Keogh E (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31(3):606–660
Bagnall A, Flynn M, Large J, Lines J, Middlehurst M (2020) On the usage and performance of the hierarchical vote collective of transformation-based ensembles version 1.0 (hive-cote v1.0). In: International workshop on advanced analytics and learning on temporal data. Springer, pp 3–18
Bagnall A, Lines J, Vickers W, Keogh E (2018) The uea & ucr time series classification repository. http://www.timeseriesclassification.com
Belhumeur PN, Hespanha JP, Kriegman DJ (1996) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. In: European conference on computer vision. Springer, pp 43–58
Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
Björck Å (1967) Solving linear least squares problems by gram-schmidt orthogonalization. BIT Numer Math 7(1):1–21
Bostrom A, Bagnall A (2017) Binary shapelet transform for multiclass time series classification, pp 24–46
Bottou L, Curtis FE, Nocedal J (2018) Optimization methods for large-scale machine learning. Siam Review 60(2):223–311
Carmen Aguilera-Morillo M, Aguilera AM (2020) Multi-class classification of biomechanical data: a functional lda approach based on multi-class penalized functional pls. Stat Model 20(6):592–616
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining, pp 785–794
Chiou J-M, Chen Y-T, Yang Y-F (2014) Multivariate functional principal component analysis: a normalization approach. Statistica Sinica:1571–1596
Das K, Nenadic Z (2009) An efficient discriminant-based solution for small sample size problem. Pattern Recogn 42(5):857–866
Dau HA, Bagnall A, Kamgar K, Yeh C-CM, Zhu Y, Gharghabi S, Ratanamahatana CA, Keogh E (2019) The ucr time series archive. IEEE/CAA J Automatica Sinica 6(6):1293–1305
Dempster A, Petitjean F, Webb GI (2020) Rocket: exceptionally fast and accurate time series classification using random convolutional kernels. Data Min Knowl Discov 34(5):1454–1495
Dempster A, Schmidt DF, Webb GI (2021) Minirocket: a very fast (almost) deterministic transform for time series classification. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pp 248–257
Escabias M, Aguilera AM, Valderrama MJ (2004) Principal component estimation of functional logistic regression: discussion of two different approaches. J Nonparametric Stat 16(3–4):365–384
Fawaz HI, Forestier G, Weber J, Idoumghar L, Muller P-A (2019) Deep learning for time series classification: a review. Data Min Knowl Discov 33(4):917–963
Fawaz HI, Lucas B, Forestier G, Pelletier C, Schmidt DF, Weber J, Webb GI, Idoumghar L, Muller P-A, Petitjean F (2020) Inceptiontime: finding alexnet for time series classification. Data Min Knowl Discov 34(6):1936–1962
Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice. Springer, New York
Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175
Garcia S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9(12)
Ge H (1998) Iterative gram-schmidt orthonormalization for efficient parameter estimation. In: Proceedings of the 1998 IEEE international conference on acoustics, speech and signal processing, ICASSP’98 (Cat. No. 98CH36181). IEEE, vol 4, pp 2477–2480
Gertheiss J, Maity A, Staicu A-M (2013) Variable selection in generalized functional linear models. Stat 2(1):86–101
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT press, Cambridge
Górecki T, Krzyśko M (2012) A kernel version of functional principal component analysis. Stat Trans New Ser 13(3):559–668
Hadjipantelis PZ, Müller H-G (2018) Functional data analysis for big data: a case study on california temperature trends. In: Handbook of big data analytics, pp 457–483
Hall P, Müller H-G, Wang J-L (2006) Properties of principal component methods for functional and longitudinal data analysis. Ann Stat:1493–1517
Hastie T, Buja A, Tibshirani R (1995) Penalized discriminant analysis. Ann Stat:73–102
Horváth L, Kokoszka P (2012) Inference for functional data with applications, vol 200. Springer, New York
Hsing T, Eubank R (2015) Theoretical foundations of functional data analysis, with an introduction to linear operators. Wiley, Chichester
Huang Y-W, Yu PS (1999) Adaptive query processing for time-series data. In: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 282–286
Huang X, Caron M, Hindson D (2001) A recursive gram-schmidt orthonormalization procedure and its application to communications. In: 2001 IEEE third workshop on signal processing advances in wireless communications (SPAWC’01). Workshop Proceedings (Cat. No. 01EX471). IEEE, pp 340–343
Izenman AJ (2008) Modern multivariate statistical techniques. Springer, New York
James GM, Hastie TJ (2001) Functional linear discriminant analysis for irregularly sampled curves. J R Stat Soc Ser B (Stat Methodol) 63(3):533–550
Joy AA, HasanMd AM, Sayeed A (2020) An improved class-wise principal component analysis based feature extraction framework for hyperspectral image classification. In: Proceedings of the international conference on computing advancements, pp 1–6
Kadri H, Preux P, Duflos E, Canu S (2011) Multiple functional regression with both discrete and continuous covariates. Recent Adv Funct Data Anal Relat Top. Springer, pp 189–195
Das K, Rizzuto DS, Nenadic Z (2009) Mental state estimation for brain-computer interfaces. IEEE Trans Biomed Eng 56(8):2114–2122
Korenberg M, Billings SA, Liu YP, McIlroy PJ (1988) Orthogonal parameter estimation algorithm for non-linear stochastic systems. Int J Control 48(1):193–210
Kreyszig E (1991) Introductory functional analysis with applications, vol 17. Wiley, New York
Kvam PH, Vidakovic B (2007) Nonparametric statistics with applications to science and engineering. Wiley, New Jersey
Lines J, Bagnall A (2015) Time series classification with ensembles of elastic distance measures. Data Min Knowl Discov 29(3):565–592
Liu Z-Y, Chiu K-C, Lei X (2003) Improved system for object detection and star/galaxy classification via local subspace analysis. Neural Netw 16(3–4):437–451
Liu R, Wang H, Wang S (2018) Functional variable selection via gram-schmidt orthogonalization for multiple functional linear regression. J Stat Comput Simul 88(18):3664–3680
Lucas B, Shifaz A, Pelletier C, O’Neill L, Zaidi N, Goethals B, Petitjean F, Webb GI (2019) Proximity forest: an effective and scalable distance-based classifier for time series. Data Min Knowl Discov 33(3):607–635
McCullagh P, Nelder JA (1989) Binary data. In: McCullagh P, Nelder JA (eds) Generalized linear models. Springer, New York, pp 98–148
Middlehurst M, Large J, Flynn M, Lines J, Bostrom A, Bagnall A (2021) Hive-cote 2.0: a new meta ensemble for time series classification. Mach Learn 110(11):3211–3243
Middlehurst M, Large J, Bagnall A (2020a) The canonical interval forest (cif) classifier for time series classification. In: 2020 IEEE international conference on big data (big data). IEEE, pp 188–195
Middlehurst M, Large J, Cawley G, Bagnall A (2020b) The temporal dictionary ensemble (tde) classifier for time series classification. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, pp 660–676
Middlehurst M, Vickers W, Bagnall A (2019) Scalable dictionary classifiers for time series classification. In: International conference on intelligent data engineering and automated learning. Springer, pp 11–19
Min W, Ke L, He X (2004) Locality pursuit embedding. Pattern Recogn 37(4):781–788
Pascual-Marqui RD et al (2002) Standardized low-resolution brain electromagnetic tomography (sloreta): technical details. Methods Find Exp Clin Pharmacol 24(Suppl D):5–12
Pfisterer F, Beggel L, Sun X, Scheipl F, Bischl B (2019) Benchmarking time series classification–functional data vs machine learning approaches. Preprint arXiv:1911.07511
Preda C, Saporta G, Lévéder C (2007) Pls classification of functional data. Comput Stat 22(2):223–235
Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov from Data (TKDD) 7(3):1–31
Ramsay JO (2004) Functional data analysis. Encyclopedia Stat Sci 4
Ramsay JO, Hooker G, Graves S (2009) Introduction to functional data analysis. Springer, New York
Ramsay JO, Silverman BW (2006) Functional data analysis, 2nd edn. Springer, New York
Ramsay JO, Silverman BW (2007) Applied functional data analysis: methods and case studies. Springer, New York
Ramsay J, Silverman BW (2013) Functional data analysis, springer series in statistics. Springer, New York
Rossion B, Joyce CA, Cottrell GW, Tarr MJ (2003) Early lateralization and orientation tuning for face, word, and object processing in the visual cortex. Neuroimage 20(3):1609–1624
Roy TS, Giri B, Chowdhury AS, Mazumder S, Das K (2020) How our perception and confidence are altered using decision cues. Front Neurosci 13:1371
Ruiz AP, Flynn M, Large J, Middlehurst M, Bagnall A (2021) The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 35(2):401–449
Saito N, Coifman RR (1995) Local discriminant bases and their applications. J Math Imaging Vis 5(4):337–358
Schäfer P (2015) The boss is concerned with time series classification in the presence of noise. Data Min Knowl Discov 29(6):1505–1530
Schäfer P (2016) Scalable time series classification. Data Min Knowl Discov 30(5):1273–1298
Schäfer P, Leser U (2017) Fast and accurate time series classification with weasel. In: Proceedings of the 2017 ACM on conference on information and knowledge management, pp 637–646
Shifaz A, Pelletier C, Petitjean F, Webb GI (2020) Ts-chief: a scalable and accurate forest algorithm for time series classification. Data Min Knowl Discov 34(3):742–775
Shin H, Hsing T (2012) Linear prediction in functional data analysis. Stoch Process Appl 122(11):3680–3700
Stein ML (1999) Interpolation of spatial data: some theory for kriging. Springer, Chicago
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
Ullah S, Finch CF (2010) Functional data modelling approach for analysing and predicting trends in incidence rates-an application to falls injury. Osteoporosis Int 21(12):2125–2134
Ullah S, Finch CF (2013) Applications of functional data analysis: a systematic review. BMC Med Res Methodol 13(1):1–12
Wang J-L, Chiou J-M, Müller H-G (2016) Functional data analysis. Ann Rev Stat Appl 3:257–295
Wright MN, Ziegler A (2015) Ranger: a fast implementation of random forests for high dimensional data in c++ and r. Preprint arXiv:1508.04409
Acknowledgements
A. Chatterjee is supported by an INSPIRE fellowship from the Department of Science and Technology (DST), Government of India. We sincerely thank the anonymous reviewers for their insightful comments which have significantly improved the manuscript.
Funding
This work was supported by an INSPIRE fellowship from the Department of Science and Technology (DST), Government of India (INSPIRE Code: IF170367).
Author information
Authors and Affiliations
Contributions
AC and KD designed the research. AC and SM performed the research. AC wrote the manuscript. KD edited the manuscript and supervised the entire work.
Corresponding author
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Responsible editor: Michelangelo Ceci.
About this article
Cite this article
Chatterjee, A., Mazumder, S. & Das, K. Functional classwise principal component analysis: a classification framework for functional data analysis. Data Min Knowl Disc 37, 552–594 (2023). https://doi.org/10.1007/s10618-022-00898-1
DOI: https://doi.org/10.1007/s10618-022-00898-1