research-article

Robust and sparse canonical correlation analysis for fault detection and diagnosis using training data with outliers

Published: 01 February 2024
    Abstract

    A well-known shortcoming of traditional canonical correlation analysis (CCA) is its lack of robustness against outliers, which hinders its application when the training data contain outliers. To overcome this shortcoming, this paper proposes robust CCA (RCCA) methods for the analysis of multivariate data with outliers. Robustness is achieved by using weighted covariance matrices in which the detrimental effect of outliers is reduced by assigning them small weight coefficients. The RCCA is then extended to robust sparse CCA (RSCCA) by imposing ℓ1-norm constraints on the canonical projection vectors to obtain sparsity. Based on the RCCA and RSCCA, a robust data-driven fault detection and diagnosis (FDD) method is proposed for industrial processes. A residual generation model is built using the projection vectors of the RCCA or RSCCA, and the robust squared Mahalanobis distance of the residual is used for fault detection. A contribution-based fault diagnosis method is developed to identify the faulty variables that may cause the fault. The performance and advantages of the proposed methods are illustrated with two case studies. The results show that the RCCA and RSCCA methods are highly robust against outliers and that the robust FDD method yields reliable results even when trained on low-quality data containing outliers.
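
The abstract does not spell out how the weights are chosen, so the following is only a minimal sketch of the general idea: samples far from the bulk of the data get small weights in the covariance estimate, and the squared Mahalanobis distance is then computed from that weighted estimate. The Huber-type weight rule, the cutoff of 5.99 (roughly the 0.95 chi-square quantile for two variables), the toy data, and every function name are assumptions made for illustration, not the scheme used in the paper. The sketch also works on raw two-variable samples rather than on CCA residuals.

```python
# Illustrative sketch only: the Huber-type weight rule and all names below
# are assumptions, not the weighting scheme described in the paper.

def weighted_mean_cov(rows, weights):
    """Weighted mean vector and covariance matrix (divisor: sum of weights)."""
    total = sum(weights)
    dim = len(rows[0])
    mean = [sum(w * r[j] for r, w in zip(rows, weights)) / total
            for j in range(dim)]
    cov = [[sum(w * (r[i] - mean[i]) * (r[j] - mean[j])
                for r, w in zip(rows, weights)) / total
            for j in range(dim)]
           for i in range(dim)]
    return mean, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sq_mahalanobis(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    return sum(di * si for di, si in zip(d, solve(cov, d)))


def robust_mean_cov(rows, cutoff=5.99, iters=10):
    """Iteratively reweighted estimate: weight 1 inside the cutoff,
    cutoff / d2 outside it (a hypothetical Huber-type rule)."""
    weights = [1.0] * len(rows)
    for _ in range(iters):
        mean, cov = weighted_mean_cov(rows, weights)
        d2 = [sq_mahalanobis(r, mean, cov) for r in rows]
        weights = [1.0 if d <= cutoff else cutoff / d for d in d2]
    mean, cov = weighted_mean_cov(rows, weights)
    return mean, cov, weights


# Toy data: eight points close to the line y = x, plus one gross outlier (last).
samples = [[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9],
           [5.0, 5.1], [-1.0, -0.9], [-2.0, -2.1], [8.0, -8.0]]

classical_mean, classical_cov = weighted_mean_cov(samples, [1.0] * len(samples))
robust_mean, robust_cov, weights = robust_mean_cov(samples)

outlier = samples[-1]
d2_classical = sq_mahalanobis(outlier, classical_mean, classical_cov)
d2_robust = sq_mahalanobis(outlier, robust_mean, robust_cov)
print("outlier weight:", weights[-1])
print("squared distance, classical vs. weighted:", d2_classical, d2_robust)
```

With unweighted estimates the outlier inflates the covariance and so masks itself. Once it is down-weighted, its distance from the bulk of the data grows, which is the behavior a detection threshold relies on. In an FDD setting like the one the abstract describes, the same distance would presumably be evaluated on the residual vector rather than on the raw samples.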

            Published In

            Expert Systems with Applications: An International Journal, Volume 236, Issue C
            Feb 2024
            1583 pages

            Publisher

            Pergamon Press, Inc.

            United States

            Author Tags

            1. Canonical correlation analysis
            2. Outlier
            3. Robustness
            4. Sparsity
            5. Fault detection and diagnosis
