Abstract
Prognostics and health management (PHM) is a crucial enabler for reducing maintenance costs and enhancing the availability and reliability of manufacturing systems. In the context of Industry 4.0, these systems are becoming more complex and can be monitored by different types of sensors. The quality and completeness of data are crucial to the success of any PHM task in this paradigm. Here, we investigate the possibility of exploiting additional data sources in manufacturing besides monitoring sensors, e.g., production line cameras or maintenance reports. We first present the terminology of multimodal learning and the potential it holds for industrial PHM. We then explore the development of this field and notable works applied to other domains, review the relevant works in PHM, and finally present a case study to demonstrate how multimodal learning can be performed to improve PHM processes.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Jose, S., Nguyen, K.T.P., Medjaher, K. (2023). Multimodal Machine Learning in Prognostics and Health Management of Manufacturing Systems. In: Tran, K.P. (eds) Artificial Intelligence for Smart Manufacturing. Springer Series in Reliability Engineering. Springer, Cham. https://doi.org/10.1007/978-3-031-30510-8_9
DOI: https://doi.org/10.1007/978-3-031-30510-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30509-2
Online ISBN: 978-3-031-30510-8
eBook Packages: Engineering, Engineering (R0)