A Cross-Working Condition-Bearing Diagnosis Method Based on Image Fusion and a Residual Network Incorporating the Kolmogorov–Arnold Representation Theorem

Tang, Ziyi; Hou, Xinhao; Wang, Xin; Zou, Jifeng

doi:10.3390/app14167254



¹ School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China

² Engineering Training Center, Tianjin University of Technology, Tianjin 300384, China

³ Institute of Intelligent Control and Fault Diagnosis, Tianjin University of Technology, Tianjin 300384, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(16), 7254; https://doi.org/10.3390/app14167254

Submission received: 16 June 2024 / Revised: 16 August 2024 / Accepted: 16 August 2024 / Published: 17 August 2024

(This article belongs to the Special Issue Industrial AI: Applications in Fault Detection, Diagnosis, and Prognosis—2nd Edition)


Abstract:

With the optimization and advancement of industrial production and manufacturing, the application scenarios of bearings have become increasingly diverse and highly coupled. This complexity poses significant challenges for the extraction of bearing fault features, consequently affecting the accuracy of cross-condition fault diagnosis methods. To improve the extraction and recognition of fault features and enhance the diagnostic accuracy of models across different conditions, this paper proposes a cross-condition bearing diagnosis method. This method, named MCR-KAResNet-TLDAF, is based on image fusion and a residual network that incorporates the Kolmogorov–Arnold representation theorem. Firstly, the one-dimensional vibration signals of the bearing are processed using Markov transition field (MTF), continuous wavelet transform (CWT), and recurrence plot (RP) methods, converting the resulting images to grayscale. These grayscale images are then multiplied by corresponding coefficients and fed into the R, G, and B channels for image fusion. Subsequently, fault features are extracted using a residual network enhanced by the Kolmogorov–Arnold representation theorem. Additionally, a domain adaptation algorithm combining multiple kernel maximum mean discrepancy (MK-MMD) and conditional domain adversarial network with entropy conditioning (CDAN+E) is employed to align the source and target domains, thereby enhancing the model’s cross-condition diagnostic accuracy. The proposed method was experimentally validated on the Case Western Reserve University (CWRU) dataset and the Jiangnan University (JUN) dataset, which include the 6205-2RS JEM SKF, N205, and NU205 bearing models. The method achieved accuracy rates of 99.36% and 99.889% on the two datasets, respectively. Comparative experiments from various perspectives further confirm the superiority and effectiveness of the proposed model.

Keywords:

bearings; fault diagnosis; domain adaptation; image fusion; Kolmogorov–Arnold representation theorem; residual network

1. Introduction

With the continued advancement of industrialization, rotating equipment has become an indispensable component in production and manufacturing processes. Bearings are extensively utilized in a variety of applications, including high-speed trains, heavy trucks, generators, wind turbines, and conveyor belts. As critical components of rotating machinery, bearings serve essential functions in reducing friction, supporting loads, ensuring precision, minimizing wear, and lowering energy consumption. However, due to their operation in high-speed rotating equipment, bearing failures can not only exacerbate equipment damage but also lead to safety incidents, resulting in production stoppages and substantial economic losses. Consequently, bearing fault diagnosis and condition monitoring are of paramount importance. Accurate and rapid bearing health monitoring is essential for ensuring the efficient and safe operation of equipment while mitigating adverse economic impacts.

Collecting operational data regarding various health conditions using sensors, processing this data, and establishing data-driven intelligent bearing fault diagnosis models is the current mainstream approach. During the operation of rotating machinery, bearings generate multiple physical parameters, including vibration signals, sound signals, and temperature signals. Among these, vibration signals are the most widely studied [1]. Converting vibration signals into images can reveal bearing fault characteristics from multiple perspectives. This transformation not only conveys additional fault features but also facilitates more efficient feature extraction by the models. Yan et al. [2] proposed a fault diagnosis method using a combination of the Markov transition field (MTF) and deep residual network (ResNet). Tang et al. [3] proposed a fault diagnosis method utilizing Gramian angular summation fields (GASF). The study converts time-series signals from multiple sensors into two-dimensional GASF feature maps, preserving the absolute temporal relationships within the time series. Xie et al. [4] proposed a method based on complementary ensemble empirical mode decomposition. This method identifies the intrinsic mode function with the highest correlation and converts one-dimensional signals into two-dimensional color images using recurrence plots (RP) as inputs for the multiscale perceptron. Zhang et al. [5] proposed a method based on short-time Fourier transform (STFT) and convolutional neural networks (CNN). This method examines five typical window functions, along with their respective widths and overlap widths, to identify the optimal function. It employs stacked dual-layer convolutions to enhance the model’s nonlinear representation capabilities. Kumar et al. [6] proposed applying continuous wavelet transform (CWT) to the collected signals, extracting time-domain statistical features from the CWT coefficients, and differentiating them using the K-nearest neighbor classifier. The aforementioned studies employ various image processing techniques to capture underlying relationships and patterns within the data for fault diagnosis. However, current research in image processing faces the following limitations: (1) A single image processing method cannot comprehensively capture all important features within a signal. Furthermore, the performance of a single feature processing method may not be stable when dealing with complex working conditions. (2) The operating environment of bearings is complex and variable. Bearing vibration signals may be coupled with other vibration signals, or the fault signals may be relatively weak. Therefore, extracting fault features from mixed signals requires further attention.

In recent years, deep learning, a rapidly advancing branch of machine learning, has found extensive applications in the field of bearing fault diagnosis. Deep learning leverages multi-layer neural networks for complex feature extraction and pattern recognition, offering robust learning and generalization capabilities. Xu et al. [7] proposed a hybrid deep learning method that uses CNNs to extract fault features from time–frequency images, which are then input into a gcForest classifier. Li et al. [8] employed a dual-stage attention recurrent neural network to enhance minority fault features in an imbalanced dataset and used a convolutional neural network embedded with a convolutional block attention module (CBAM) for fault classification. Ma et al. [9] proposed a multi-objective optimization-based ensemble deep learning method for rotor-bearing diagnosis, integrating convolutional residual networks, deep belief networks, and deep autoencoders. Shen et al. [10] introduced a physics-based deep learning approach that first assesses bearing health levels using a threshold model, followed by a CNN that automatically extracts high-level features from the inputs for bearing fault detection. However, in reality, obtaining a large amount of bearing fault data is often challenging. Additionally, the operating conditions of motors are complex, and vibration signals cannot consistently maintain linearity. These factors contribute to the poor performance of deep learning models in addressing these challenges.

Transfer learning is an effective method that leverages existing knowledge to solve problems in different but related domains. It relaxes two fundamental assumptions of traditional deep learning: the requirement that the training and testing samples must be independently and identically distributed, and the necessity for a large number of samples to develop an accurate classification model. Consequently, transfer learning is used to tackle the issue of low fault diagnosis accuracy under varying operating conditions for bearings. In transfer learning, domain adaptation algorithms are commonly used to align the source and target domains. Xiao et al. [11] proposed a cross-domain fault diagnosis framework based on transferable features and manifold embedding discriminant distribution adaptation. This method designs a transferability evaluation method using an adjusted Rand index and maximum mean discrepancy (MMD) to quantify the fault discriminability and domain invariance of features. Additionally, a new manifold-embedding discriminant joint distribution adaptation method is proposed to address the class imbalance problem between the target and source domains. P. Chen et al. [12] proposed a model based on the sliced Wasserstein distance for bearing fault diagnosis under different loads and speeds. This model achieves comprehensive domain adaptation by using adversarial training to learn a domain-invariant space. Li et al. [13] introduced a multi-scale extension method based on residual neural networks, combined with the multiple kernel maximum mean discrepancy (MK-MMD), to address issues affecting rotating components such as bearings and gears in noisy and complex environments. Xu et al. [14] introduced a method based on a convolutional kernel dropping mechanism, skip connections, and joint maximum mean discrepancy (JMMD). This approach aims to improve diagnostic accuracy in unsupervised domain discrepancy scenarios by enhancing feature transfer and domain alignment.
Wu et al. [15] proposed a model that integrates domain-adversarial neural networks with an attention mechanism. This model incorporates an attention mechanism into the feature extractor to retain fault-related features. Additionally, it replaces the fully connected (FC) layers in the classifier and discriminator with global average pooling layers, thus reducing the number of parameters and enhancing efficiency. Wu et al. [16] introduced the gradient conditional domain adversarial network method, which uses conditional domain adversarial networks (CDAN) as the main component and integrates data filtering and intermediate domain selection. The above studies focus on the application of domain-adaptive transfer learning in bearing fault diagnosis and propose various domain-adaptive methods to address the differences in data distribution under different working conditions and complex environments. However, there are several shortcomings in the application of transfer learning methods for fault diagnosis: (1) Cooperation between the feature extraction model and the domain adaptation algorithm: Under complex working conditions, the feature extraction model may fail to extract effective features from the source domain, leaving the domain adaptation algorithm unable to bridge the gap between the source and target domains. (2) Limitations of common domain adaptation algorithms: Algorithms such as MMD and MK-MMD primarily focus on reducing marginal distribution differences between the source and target domains but neglect the joint distribution of labels and features. (3) Challenges with CDAN and DANN-based algorithms: These algorithms achieve domain adaptation through adversarial training, but the classifier’s decision boundary can be easily disrupted in the target domain, potentially leading to a decrease in the classifier’s generalization performance on the target domain.

Based on the aforementioned research status and identified shortcomings, this paper proposes a novel method for bearing fault diagnosis that remains effective under high noise levels and complex working conditions.

(1) This paper innovatively integrates three types of images: MTF, CWT, and RP. The fused images encompass various features, including state transition characteristics, periodicity, autocorrelation, and time–frequency characteristics. This approach overcomes the limitations of existing single-feature processing methods, which struggle to comprehensively capture features and maintain stability under complex working conditions.

(2) This paper proposes a residual network based on the Kolmogorov–Arnold representation theorem. By introducing KABlock layers into the traditional residual network and combining fixed basis functions with spline functions, the model’s adaptability and robustness under complex working conditions are enhanced. This optimization enables the model to better capture the complex dynamic characteristics and nonlinear features of bearing vibration signals.

(3) This paper introduces an innovative domain adaptation method that integrates the MK-MMD and CDAN+E algorithms. MK-MMD effectively aligns the distributions of the source and target domains through a multi-kernel approach, providing a solid foundation for the adversarial learning of CDAN+E, which helps improve the generalization of the CDAN+E classifier. CDAN+E introduces adversarial learning and entropy conditioning, leveraging the joint distribution of labels and features to overcome the limitations of MK-MMD, which only considers marginal distributions. The integrated domain adaptation algorithm more accurately aligns the source and target domains, harnessing their complementary advantages and demonstrating higher adaptability and robustness in complex cross-domain tasks.

2. Theoretical Foundation

2.1. Transfer Learning

Unsupervised domain adaptation is a type of transfer learning method used to address the issue of differing data distributions between the source domain $D_{s}$ and the target domain $D_{t}$ [17]. In this study, the problem is set within the context of a labeled source domain $D_{s} = \{(x_{i}, y_{i})\}_{i = 1}^{n_{s}}$ and an unlabeled target domain $D_{t} = \{x_{j}\}_{j = 1}^{n_{t}}$. The feature spaces and label spaces of both domains are the same, i.e., $\mathcal{X}_{s} = \mathcal{X}_{t}$ and $\mathcal{Y}_{s} = \mathcal{Y}_{t}$, but the marginal distributions of the two domains differ, i.e., $P_{s}(x_{s}) \neq P_{t}(x_{t})$. The method proposed in this paper aims to align the distributions of the source and target domains through a domain adaptation algorithm. The principle is illustrated in Figure 1. In transfer learning-based cross-domain fault diagnosis, the common categories include cross-working condition, cross-sensor, and cross-device fault diagnosis [18,19]. This paper primarily focuses on cross-working condition fault diagnosis.

2.2. Continuous Wavelet Transform

CWT provides basis functions with specific time and frequency resolutions to analyze time-varying fault signals. CWT features multi-resolution analysis, where the low-frequency components of the vibration signal have a higher frequency resolution, and the high-frequency components have a higher time resolution. This effectively captures the local time–frequency characteristics of the vibration signal. The equation for CWT is as follows [20]:

$$W_{x}(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\left(\frac{t - b}{a}\right) dt \tag{1}$$

where $a$ denotes the scale factor, $b$ represents the translation factor, and $\psi^{*}$ is the complex conjugate of the mother wavelet. In this paper, the Morlet wavelet is chosen as the mother wavelet for the continuous transform.
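As a rough illustration of Equation (1), the sketch below discretizes the integral directly with a complex Morlet wavelet. The center frequency `omega0 = 6`, the function names, and the test signal are assumptions made for the example; the paper only states that a Morlet mother wavelet is used.

```python
import numpy as np

def morlet(t, omega0=6.0):
    """Complex Morlet mother wavelet (omega0 is an assumed center frequency)."""
    return np.pi ** -0.25 * np.exp(1j * omega0 * t) * np.exp(-0.5 * t ** 2)

def cwt_morlet(x, scales, fs, omega0=6.0):
    """Direct discretization of Eq. (1): one row of coefficients per scale a."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size) / fs
    dt = 1.0 / fs
    diff = t[:, None] - t[None, :]          # diff[k, b] = t_k - t_b
    coeffs = np.empty((len(scales), x.size), dtype=complex)
    for row, a in enumerate(scales):
        psi_conj = np.conj(morlet(diff / a, omega0))
        # Riemann sum over t_k for every translation b, scaled by 1/sqrt(a)
        coeffs[row] = (x @ psi_conj) * dt / np.sqrt(a)
    return coeffs

# Usage: a 50 Hz tone should respond most strongly at the matching scale.
fs = 1000.0
t = np.arange(400) / fs
x = np.sin(2 * np.pi * 50.0 * t)
freqs = np.array([20.0, 50.0, 120.0])
scales = 6.0 / (2 * np.pi * freqs)          # a = omega0 / (2*pi*f) for the Morlet wavelet
scalogram = np.abs(cwt_morlet(x, scales, fs))
```

The magnitude `scalogram` is what would typically be rendered as the time-frequency image; a production pipeline would more likely rely on an FFT-based routine than on this O(N²) sum.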

2.3. Markov Transition Field

MTF is an image processing method based on the Markov transition probability matrix [21]. For a given time series $X = \{x_{1}, x_{2}, \dots, x_{i}, \dots, x_{N}\}$, the first step is to divide the time series $X$ into $Q$ discrete bins. Each time series value $x_{i}$ is then quantized and mapped to its corresponding bin $q_{i}$. The transition probabilities between these bins are calculated using a Markov chain along the time axis, resulting in a Markov state transition matrix $W$ of size $Q \times Q$. The equation for this is as follows:

$$W = \begin{bmatrix} w_{11} & w_{12} & \dots & w_{1Q} \\ w_{21} & w_{22} & \dots & w_{2Q} \\ \vdots & \vdots & \ddots & \vdots \\ w_{Q1} & w_{Q2} & \dots & w_{QQ} \end{bmatrix} \tag{2}$$

$$w_{ij} = p\{x_{t + 1} \in q_{j} \mid x_{t} \in q_{i}\} \tag{3}$$

In the matrix, $w_{ij}$ represents the probability of transitioning from quantile $q_{i}$ to quantile $q_{j}$. Since the Markov transition matrix only reflects the overall state transition probabilities and ignores the dependency between time steps and the distribution of the time series $X$, some temporal information is lost. To address this, the MTF is constructed by extending matrix $W$ along the time sequence, resulting in matrix $M$, which considers the relationship between position and temporal information. The matrix $M$ is given by the following:

$$M = \begin{bmatrix} m_{11} & m_{12} & \dots & m_{1Q} \\ m_{21} & m_{22} & \dots & m_{2Q} \\ \vdots & \vdots & \ddots & \vdots \\ m_{Q1} & m_{Q2} & \dots & m_{QQ} \end{bmatrix} \tag{4}$$

$$m_{ij} = p\{w_{ij} \mid x_{i} \in q_{i}, x_{j} \in q_{j}\} \tag{5}$$

In the matrix, $m_{ij}$ represents the probability of transitioning from quantile $q_{i}$ to quantile $q_{j}$ considering specific time steps.
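The construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes quantile-based bins, a hypothetical default of 8 bins, and the common convention in which the field holds one entry per pair of time steps, so that entry (i, j) is the transition probability between the bins of $x_i$ and $x_j$.

```python
import numpy as np

def markov_transition_field(x, n_bins=8):
    """Return the Q x Q transition matrix W and the time-indexed field M."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges split the values into n_bins bins (assumed binning rule).
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)                    # bin index of every sample, 0..n_bins-1
    W = np.zeros((n_bins, n_bins))
    np.add.at(W, (q[:-1], q[1:]), 1.0)           # count transitions q_t -> q_{t+1}
    row_sums = W.sum(axis=1, keepdims=True)
    W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    M = W[q[:, None], q[None, :]]                # M[i, j] = W[q_i, q_j]
    return W, M

# Usage on a synthetic signal
signal = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
W, M = markov_transition_field(signal)
```

`M` can then be resized and rendered as the MTF image; its values lie in [0, 1] because they are probabilities.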

2.4. Recurrence Plot

RP is a method that is commonly used for analyzing nonlinear signals. It is particularly suited for detecting and analyzing the non-stationarity, chaos, and periodic structures in a time series [22]. The RP approach reveals the dynamic characteristics and hidden structures within the time series. For a given time series $X = \{x_{1}, x_{2}, \dots, x_{i}, \dots, x_{N}\}$, the steps to generate an RP are as follows:

Step 1: By selecting an appropriate embedding dimension $m$ and time delay $\tau$, a phase space vector is constructed as follows:

$$X_{i} = \{x_{i}, x_{i + \tau}, \dots, x_{i + (m - 1) \tau}\} \tag{6}$$

where $i = 1, 2, \dots, N - (m - 1) \tau$, and $N$ is the length of the time series.

Step 2: Calculate the Euclidean distance between each pair of trajectories after phase space reconstruction:

$$d_{ij} = \|X_{i} - X_{j}\| \tag{7}$$

where $X_{i}$ and $X_{j}$ are the reconstructed points.

Step 3: Construct the recurrence matrix. The specific equation is as follows:

$$R_{ij} = \mathrm{Heaviside}(\varepsilon - d_{ij}) \tag{8}$$

$$\mathrm{Heaviside}(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases} \tag{9}$$

where $\varepsilon$ is a threshold, set to 0.4 in this study.

Step 4: Generate the visual RP. By applying a color map, the distance values in the distance matrix are transformed into different colors, resulting in a color image that encapsulates the complete characteristics of the time series. Each color represents a different measure of similarity between two embedded vectors, forming a color texture map that intuitively displays the dynamic features of the time series.
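Steps 1 to 3 can be sketched as follows. The threshold of 0.4 comes from the text; the embedding dimension `m=3` and delay `tau=1` are placeholder values chosen for the example, since the paper does not state them here.

```python
import numpy as np

def recurrence_plot(x, m=3, tau=1, eps=0.4):
    """Return the distance matrix d (Eq. 7) and the recurrence matrix R (Eq. 8)."""
    x = np.asarray(x, dtype=float)
    n_vectors = x.size - (m - 1) * tau
    # Step 1: row i holds the delay vector (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}).
    idx = np.arange(n_vectors)[:, None] + tau * np.arange(m)[None, :]
    X = x[idx]
    # Step 2: pairwise Euclidean distances between delay vectors.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Step 3: Heaviside(eps - d) with the convention Heaviside(0) = 1.
    R = (eps - d >= 0).astype(np.uint8)
    return d, R

# Usage on a synthetic signal
signal = np.sin(np.linspace(0.0, 6.0 * np.pi, 120))
d, R = recurrence_plot(signal, m=3, tau=2)
```

For Step 4, the unthresholded matrix `d` is the natural input to a color map, while `R` gives the classic binary recurrence plot.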

2.5. Kolmogorov–Arnold Representation Theorem

The Kolmogorov–Arnold representation theorem was originally proposed by Andrey Kolmogorov in 1957 and further refined by Vladimir Arnold. It states that any continuous multivariate function on a bounded domain can be expressed as a finite composition of continuous univariate functions and addition [23]. For a multivariate function $f$, it can be represented by the following equation:

$$f(x) = f(x_{1}, \dots, x_{n}) = \sum_{i = 1}^{2n + 1} \Phi_{i}\left(\sum_{j = 1}^{n} \phi_{i, j}(x_{j})\right) \tag{10}$$

where $x_{j}$ are the input variables, $f : [0, 1]^{n} \to \mathbb{R}$, $\phi_{i, j}$ are the inner functions, and $\Phi_{i}$ are the outer functions. This theorem is of significant importance for simplifying the handling and understanding of multivariate functions. In neural network research, the Kolmogorov–Arnold representation theorem suggests that by designing appropriate network structures, complex multivariate continuous functions can be approximated.

3. Proposed Method

This paper introduces the MCR-KAResNet-TLDAF method. The framework of the MCR-KAResNet-TLDAF method is illustrated in Figure 2. The method comprises three main components: image fusion, backbone network optimization, and domain adaptation algorithm integration. The image processing module employs image fusion to present fault features from multiple perspectives. The backbone network module enhances and optimizes the residual network using the Kolmogorov–Arnold representation theorem. The integrated domain adaptation module combines the MK-MMD and CDAN+E domain adaptation algorithms to better align the source and target domains.

3.1. Image Fusion

Images can intuitively reflect relevant characteristics of signals, such as time–frequency features, periodic features, autocorrelation features, and state transition features. In complex mechanical structures, the operating state of bearings is coupled with the states of other components, and the vibration signals are significantly affected by noise and random shocks. In such cases, using a single image processing method to characterize the signals from one perspective may fail to capture the complete information of the bearing vibration signals. This can hinder the model’s subsequent fault diagnosis and identification of the bearings.

This study processes signals using an image fusion method. Specifically, it fuses the MTF, CWT, and RP images through the R, G, and B channels, respectively. This approach enables the two-dimensional images to represent the original signal from multiple perspectives, offering a more comprehensive and enriched depiction of the signal’s characteristics. Figure 3 illustrates the complete image construction process.

Step 1: Use a sliding window to segment the raw vibration signal. The sampling length of the image is determined by the frequency at which the vibration sensor collects the signal and the rotational speed of the motor. The number of sampling points should be kept consistent across the working conditions of the same dataset. To ensure that a single image fully describes the fault, the sampling length should cover at least one complete rotation of the bearing. When the rotational speeds differ across working conditions in the same dataset, the sampling length should be no smaller than the largest per-condition value of $N$, which occurs at the lowest speed. Detailed information can be found in the data description section of the two datasets selected in this paper. The specific equation for calculating the sampling length is as follows:

$$N = \frac{f_{z}}{n / 60} \tag{11}$$

where $f_{z}$ is the sampling frequency (in Hz) and $n$ is the rotational speed (in revolutions per minute).

Step 2: To ensure a smooth transition at the window boundaries and to retain the complete detailed characteristics of the signal, this study uses a 50% overlap in the data when sliding the window.

Step 3: Each segmented data sample is processed using MTF, CWT, and RP image processing methods. The resulting color images are then converted to grayscale images to serve as the three inputs for the image fusion method.

Step 4: The generated grayscale images from MTF, CWT, and RP are used as inputs for the R, G, and B channels, respectively, for image fusion. To preserve the important features of the images, each channel is multiplied by a coefficient. In this study, the coefficients for the R, G, and B channels are set to [0.2, 1, 0.2]. The final result is the processed fused image.

The fused images obtained using the above processing steps simultaneously display the time–frequency characteristics, periodic features, autocorrelation properties, and state transition characteristics of the original signal. This provides a rich data foundation for subsequent models to extract fault features.
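Steps 1, 2, and 4 above can be sketched as follows. This is an illustrative outline under stated assumptions: the three inputs are same-sized 8-bit grayscale arrays produced elsewhere, the function names are hypothetical, and the 1800 rpm speed in the usage lines is a made-up value chosen so the arithmetic is easy to follow.

```python
import numpy as np

def samples_per_revolution(fs_hz, speed_rpm):
    """Eq. (11): number of samples covering one shaft revolution."""
    return fs_hz / (speed_rpm / 60.0)

def sliding_windows(signal, window, overlap=0.5):
    """Segment a 1-D signal into windows with the given fractional overlap."""
    signal = np.asarray(signal)
    if signal.size < window:
        raise ValueError("signal is shorter than one window")
    step = max(1, int(round(window * (1.0 - overlap))))
    starts = range(0, signal.size - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

def fuse_rgb(mtf_gray, cwt_gray, rp_gray, coeffs=(0.2, 1.0, 0.2)):
    """Weight the MTF, CWT and RP grayscale images and stack them as R, G, B."""
    channels = [c * np.asarray(g, dtype=float)
                for c, g in zip(coeffs, (mtf_gray, cwt_gray, rp_gray))]
    fused = np.stack(channels, axis=-1)
    return np.clip(np.rint(fused), 0, 255).astype(np.uint8)

# Usage: 12 kHz sampling at a hypothetical 1800 rpm gives 400 samples per revolution.
n_rev = samples_per_revolution(12000, 1800)
windows = sliding_windows(np.arange(1000), window=400, overlap=0.5)
gray = np.full((64, 64), 100.0)          # stand-in for real MTF/CWT/RP images
fused = fuse_rgb(gray, gray, gray)
```

With the coefficients [0.2, 1, 0.2], the CWT channel dominates the fused image while the MTF and RP channels contribute at reduced intensity.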

3.1.1. KAResNet

To enhance the accuracy of feature extraction under complex working conditions, this paper proposes the KAResNet model as the backbone network. The architecture of KAResNet is illustrated in Figure 4, with specific hyperparameters provided in Table 1. The KAResNet model comprises a Conv1 layer; H1, H2, and H3 layers; a global average pooling layer; and two KABlock layers. Each of the H1, H2, and H3 layers is composed of multiple stacked convolutions. To address the issues of gradient vanishing and exploding during training, residual connections are utilized to form residual blocks.

Unlike the conventional ResNet structure, this paper introduces the Kolmogorov–Arnold network architecture [24], replacing the original fully connected layer and its fixed activation function with the KABlock. This modification enhances the model’s efficiency in approximating complex functions. Notably, the KABlock represents an optimized and improved implementation of the Kolmogorov–Arnold representation theorem. The KABlock consists of a combination of basis and spline functions. The basis function employs a fixed SiLU activation function, while the spline functions consist of B-Spline functions. According to the Kolmogorov–Arnold representation theorem, it is necessary to find suitable univariate functions to approximate complex functions. The KABlock selects the B-Spline as the fundamental univariate function. The KABlock maintains flexibility and adjustability while providing high smoothness and numerical stability, enabling more efficient approximations of complex functions. This design enhances the overall performance of the model by approximating more complex functions using simpler ones.
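To make the "SiLU basis plus B-spline" structure concrete, here is a forward-pass-only NumPy sketch of a KAN-style layer. It is not the paper's KABlock: the class name, layer sizes, grid range, spline degree, and random initialization are all assumptions, and training code is omitted.

```python
import numpy as np

def silu(x):
    return x / (1.0 + np.exp(-x))

def bspline_basis(x, knots, degree):
    """Cox-de Boor recursion; x has shape (batch, in_dim), result (batch, in_dim, n_basis)."""
    x = x[..., None]
    B = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)   # degree-0 indicators
    for d in range(1, degree + 1):
        left = (x - knots[:-(d + 1)]) / (knots[d:-1] - knots[:-(d + 1)])
        right = (knots[d + 1:] - x) / (knots[d + 1:] - knots[1:-d])
        B = left * B[..., :-1] + right * B[..., 1:]
    return B

class KABlockSketch:
    """Each input-output edge applies w_base * SiLU(x) + sum_k c_k * B_k(x)."""

    def __init__(self, in_dim, out_dim, grid_size=5, degree=3,
                 x_range=(-1.0, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        h = (x_range[1] - x_range[0]) / grid_size
        # Uniform knots, extended by `degree` intervals on both sides.
        self.knots = x_range[0] + h * np.arange(-degree, grid_size + degree + 1)
        self.degree = degree
        n_basis = grid_size + degree
        self.w_base = rng.normal(scale=0.1, size=(out_dim, in_dim))
        self.w_spline = rng.normal(scale=0.1, size=(out_dim, in_dim, n_basis))

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        base = silu(x) @ self.w_base.T                      # fixed-activation branch
        B = bspline_basis(x, self.knots, self.degree)       # learnable spline branch
        spline = np.einsum("bik,oik->bo", B, self.w_spline)
        return base + spline

# Usage with hypothetical sizes: 4 pooled features mapped to 3 outputs.
layer = KABlockSketch(in_dim=4, out_dim=3)
features = np.random.default_rng(1).uniform(-0.99, 0.99, size=(5, 4))
outputs = layer(features)
```

Inputs are assumed to fall inside `x_range`; outside the grid the spline branch contributes nothing and only the SiLU branch remains active.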

3.1.2. MK-MMD

The most crucial aspect of transfer learning domain adaptation is reducing the distribution discrepancy between the source domain and the target domain. MMD is a commonly used metric for measuring the distance between two probability distributions in a reproducing kernel Hilbert space (RKHS) [25]. The equation for MMD is defined as follows:

$$L_{MMD}(P_{s}, P_{t}) = \left\| E_{P_{s}}[\phi(x^{s})] - E_{P_{t}}[\phi(x^{t})] \right\|_{H}^{2} \tag{12}$$

where $\phi(\cdot)$ represents the mapping function.

However, the MMD method is limited by the use of a single kernel function, which often performs poorly when the distribution differences between the two domains are complex and diverse. MK-MMD improves upon this by using multiple kernel functions to compute a combined distribution discrepancy, taking into account the various relationships between features. The family of kernels defined by $m$ kernel functions can be expressed as follows:

$$\mathcal{K} \triangleq \left\{ k = \sum_{u = 1}^{m} \beta_{u} k_{u} : \sum_{u = 1}^{m} \beta_{u} = 1,\ \beta_{u} \geq 0,\ \forall u \right\} \tag{13}$$

where $\beta_{u}$ are the weight parameters for the different kernels. The MK-MMD loss can be expressed as follows:

$$L_{MK-MMD} = \sum_{u = 1}^{m} \beta_{u} L_{u} \tag{14}$$

where $L_{u}$ denotes the MMD loss, computed using the kernel function $k_{u}$. Thus, the total MK-MMD loss function can be expressed as follows:

$$L = L_{cel} + \lambda_{MK-MMD} L_{MK-MMD}(D_{s}, D_{t}) \tag{15}$$

where $L_{cel}$ represents the classification loss, and $\lambda_{MK-MMD}$ is a weight parameter.
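Equations (12) to (14) can be illustrated with a small NumPy sketch. It assumes Gaussian kernels with hand-picked bandwidths, equal weights $\beta_u$ by default, and the simple biased estimator of the squared MMD; none of these choices is specified in the text above.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def mmd2(xs, xt, sigma):
    """Biased estimate of the squared MMD (Eq. 12) for one kernel."""
    return (gaussian_kernel(xs, xs, sigma).mean()
            + gaussian_kernel(xt, xt, sigma).mean()
            - 2.0 * gaussian_kernel(xs, xt, sigma).mean())

def mk_mmd(xs, xt, sigmas=(0.5, 1.0, 2.0, 4.0), betas=None):
    """Eq. (14): convex combination of per-kernel MMD values."""
    sigmas = np.asarray(sigmas, dtype=float)
    if betas is None:
        betas = np.full(sigmas.size, 1.0 / sigmas.size)   # assumed equal weights
    betas = np.asarray(betas, dtype=float)
    if np.any(betas < 0) or not np.isclose(betas.sum(), 1.0):
        raise ValueError("betas must be non-negative and sum to 1 (Eq. 13)")
    return float(sum(b * mmd2(xs, xt, s) for b, s in zip(betas, sigmas)))

# Usage: features from a shifted distribution give a larger discrepancy.
rng = np.random.default_rng(0)
source = rng.normal(size=(64, 8))
target_same = rng.normal(size=(64, 8))
target_shifted = rng.normal(loc=2.0, size=(64, 8))
gap_same = mk_mmd(source, target_same)
gap_shifted = mk_mmd(source, target_shifted)
```

In training, the same quantity would be computed on mini-batches of source and target features and added to the classification loss as in Equation (15).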

3.1.3. CDAN+E

CDAN+E is a network model analogous to generative adversarial networks (GANs), integrating adversarial learning and domain adaptation into a two-player game. Adversarial domain adaptation models typically consist of a feature extractor, a category classifier, and a domain discriminator [26]. This adversarial process introduces multilinear conditioning to simultaneously consider the joint distribution of features and labels. To avoid negative transfer and the non-convergence of adversarial training, CDAN+E uses entropy to define the uncertainty of predictions. The entropy is defined as follows:

$$H(p) = -\sum_{c = 1}^{C} p_{c} \log p_{c} \tag{16}$$

where $C$ is the number of training classes, and $p_{c}$ is the probability that a sample is predicted to belong to class $c$. The smaller $H(p)$ is, the more certain the prediction, indicating that such samples should contribute more to domain distribution matching. Based on the above entropy, each sample can be assigned an entropy-based weight, ensuring that the domain discriminator prioritizes samples with higher prediction certainty. Thus, the entropy-based certainty measure is as follows:

$$w(H(p)) = 1 + e^{-H(p)} \tag{17}$$

Thus, the loss function for CDAN+E can be defined as follows:

$$\begin{aligned} L_{CDAN+E}(\theta_{f}, \theta_{d}) = &- E_{x_{i}^{s} \in D_{s}}\, w(H(p_{i}^{s})) \times \log\left[G_{d}\left(G_{f}(x_{i}^{s}) \otimes G_{c}(G_{f}(x_{i}^{s}))\right)\right] \\ &- E_{x_{i}^{t} \in D_{t}}\, w(H(p_{i}^{t})) \times \log\left[1 - G_{d}\left(G_{f}(x_{i}^{t}) \otimes G_{c}(G_{f}(x_{i}^{t}))\right)\right] \end{aligned} \tag{18}$$

where $\theta_{f}$ represents the parameters of the feature extractor $G_{f}$, $\theta_{d}$ represents the parameters of the domain classifier $G_{d}$, and $\otimes$ denotes the tensor product. $G_{f}(x_{i}^{s}) \otimes G_{c}(G_{f}(x_{i}^{s}))$ represents the multilinear conditioning of the CDAN+E domain adaptation algorithm.

The total loss function for CDAN+E is given by the following:

$$L(\theta_{f}, \theta_{c}, \theta_{d}) = L_{c}(\theta_{f}, \theta_{c}) - \lambda_{CDAN+E} L_{CDAN+E}(\theta_{f}, \theta_{d}) \tag{19}$$

where $\theta_{c}$ represents the parameters of the category classifier $G_{c}$, $L_{c}(\theta_{f}, \theta_{c})$ is the classification loss, and $\lambda_{CDAN+E}$ is a weight parameter.
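The entropy weighting and multilinear conditioning in Equations (16) to (18) can be sketched numerically as follows. The function names are hypothetical, the discriminator outputs are passed in as plain arrays rather than produced by a network, and the small `eps` inside the logarithms is an added numerical safeguard.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Eq. (16): prediction entropy per sample; p has shape (batch, classes)."""
    return -np.sum(p * np.log(p + eps), axis=1)

def entropy_weight(p):
    """Eq. (17): w = 1 + exp(-H(p)), larger for more certain predictions."""
    return 1.0 + np.exp(-entropy(p))

def multilinear_map(features, probs):
    """Flattened outer product G_f(x) (x) G_c(G_f(x)) for every sample."""
    batch = features.shape[0]
    return np.einsum("bi,bj->bij", features, probs).reshape(batch, -1)

def cdan_e_loss(d_src, d_tgt, p_src, p_tgt, eps=1e-12):
    """Eq. (18), given discriminator outputs in (0, 1) for source and target samples."""
    w_s = entropy_weight(p_src)
    w_t = entropy_weight(p_tgt)
    return float(-np.mean(w_s * np.log(d_src + eps))
                 - np.mean(w_t * np.log(1.0 - d_tgt + eps)))

# Usage: a one-hot prediction gets weight 2, a uniform 4-class prediction gets 1.25.
certain = np.array([[1.0, 0.0, 0.0, 0.0]])
uniform = np.full((1, 4), 0.25)
w_certain = entropy_weight(certain)
w_uniform = entropy_weight(uniform)
joint = multilinear_map(np.ones((2, 16)), np.full((2, 4), 0.25))   # shape (2, 64)
```

The discriminator would take `joint` as its input, so its size grows with the product of the feature dimension and the number of classes.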

3.1.4. A Domain Adaptive Algorithm Combining MK-MMD and CDAN+E

Based on the backbone model KAResNet, this paper proposes a domain adaptation algorithm that combines MK-MMD and CDAN+E. The integrated model architecture is shown in Figure 5. By merging kernel-based and adversarial-based domain adaptation methods, this approach addresses the limitations of using a single domain adaptation method. MK-MMD minimizes the distribution discrepancy between the source and target domains by measuring the difference between the mean embeddings of their feature distributions in the RKHS. It improves test power by combining different kernel functions to find the optimal kernel and minimize distribution differences. CDAN+E leverages multilinear conditioning and adversarial learning to account for the joint distribution of features and labels, which MK-MMD alone cannot capture. Meanwhile, MK-MMD aligns the distributions of the source and target domains using the multi-kernel method, providing a solid foundation for the adversarial learning of CDAN+E and enhancing the generalization capability of the CDAN+E classifier. Through this fusion, the combined algorithm enables KAResNet to better extract domain-invariant features and to achieve better domain adaptation and fault classification.

The specific process of domain-adaptive fusion is as follows: First, features are extracted from the input images using KAResNet. Simultaneously, MK-MMD is utilized to reduce the distribution differences between the source and target domains through kernel mapping. Through entropy conditioning, CDAN+E weights each sample according to the entropy of the classifier’s prediction, allowing the domain discriminator to prioritize the samples with higher weights. CDAN+E simultaneously considers the joint distributions of features and labels using multilinear conditioning. Additionally, the distributions of the source and target domains are further aligned through adversarial learning between the feature extractor and the domain classifier. After completing feature extraction and adversarial learning, faults are identified by the category classifier.

The MK-MMD loss, CDAN+E loss, and classification loss are combined to form the total loss function of the MCR-KAResNet-TLDAF method. The total loss function is given as follows:

L = L_{C} + λ_{MK-MMD} L_{MK-MMD}(D_{s}, D_{t}) - λ_{CDAN+E} L_{CDAN+E}(θ_{f}, θ_{d})    (20)

λ_{MK-MMD} = λ_{CDAN+E} = \frac{2}{1 + e^{-10 \times (E_{current} / E_{max})}} - 1    (21)

where L_{C} is the classification loss, D_{s} and D_{t} denote the source and target domains, E_{current} represents the current number of iterations, and E_{max} represents the maximum number of iterations. Both weights grow from 0 towards 1 as training proceeds. By adjusting λ_{MK-MMD} and λ_{CDAN+E} in this way, the model achieves a better balance between classification and domain adaptation.
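As a minimal sketch of Equations (20) and (21), the weight schedule and the combined loss can be written as below. The loss values passed in are placeholders, and the function names are illustrative. The sign of the CDAN+E term follows Equation (20); in practice such a term is often realized with a gradient reversal layer, which is an assumption here and not stated in the text.

```python
import math

def adaptation_weight(current, maximum):
    """Equation (21): rises smoothly from 0 towards 1 during training."""
    return 2.0 / (1.0 + math.exp(-10.0 * current / maximum)) - 1.0

def total_loss(l_c, l_mkmmd, l_cdan, current, maximum):
    """Equation (20) with one shared weight for both adaptation terms."""
    lam = adaptation_weight(current, maximum)
    return l_c + lam * l_mkmmd - lam * l_cdan

# Early in training the classification loss dominates; later the
# adaptation terms are weighted almost fully.
for step in (0, 20, 40, 80):
    print(step, round(adaptation_weight(step, 80), 4))
```

Starting with a weight of zero lets the classifier settle on the labeled source data before the alignment terms begin to influence the features.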

4. Experimental Validation

All experiments use the Adam optimizer. The initial learning rate is set to 0.001; it is reduced to 0.0001 after 40 epochs and to 0.00001 after 70 epochs. The experimental platform includes a 13th Gen Intel(R) Core(TM) i9-13900HX 2.20 GHz processor and an NVIDIA GeForce RTX 4060 Laptop GPU.
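The step schedule described above can be sketched as a plain function. Counting epochs from 0 and switching exactly at epochs 40 and 70 are assumptions, since the text does not specify the indexing.

```python
def learning_rate(epoch):
    """Step schedule from the text; epochs are counted from 0 (assumption)."""
    if epoch < 40:
        return 1e-3
    if epoch < 70:
        return 1e-4
    return 1e-5

# Learning rate at a few representative epochs of an 80-epoch run.
for epoch in (0, 39, 40, 69, 70, 79):
    print(epoch, learning_rate(epoch))
```

The text does not name the deep learning framework. In PyTorch, for example, an equivalent schedule could be obtained with `torch.optim.lr_scheduler.MultiStepLR` using `milestones=[40, 70]` and `gamma=0.1`.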

4.1. Case 1

4.1.1. Case Western Reserve University Dataset Description

The Case Western Reserve University (CWRU) dataset is chosen as Case 1. The test rig is shown in Figure 6. Vibration acceleration data were collected by accelerometers positioned at the 12 o’clock position on both the drive end and the fan end of the motor housing, and the signals were captured with a 16-channel DAT recorder. The selected data include the normal baseline data and the 12k drive end bearing fault data [27]. The drive end bearing is a 6205-2RS JEM SKF deep groove ball bearing. The specific information about the bearing is shown in Table 2.

The drive end bearing has four health conditions: healthy, roller fault, inner race fault, and outer race fault. Each fault type includes three fault sizes, created using electric discharge machining: 0.007, 0.014, and 0.021 inches, i.e., 7, 14, and 21 mils (1 mil = 0.001 inches). Therefore, the bearings have 10 different health states (one healthy state plus three fault types at three sizes each), as categorized in Table 3. The time series waveforms of the bearing under different health conditions are shown in Figure 7.

Based on different loads and speeds, the dataset is divided into four operating conditions, with specific information detailed in Table 4. In the load information, 0 HP denotes a motor load of 0 hp. The number of sampling points per sample for the CWRU dataset is calculated using Equation (11) and is finally set to 800. The training and test sets are divided in a ratio of 75:25. The number of epochs for the CWRU dataset is set to 80.
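The four operating conditions in Table 4 give rise to the twelve source-target transfer tasks used in the result tables. The sketch below enumerates them and applies the 75:25 split. The sample count of 400 is a hypothetical value chosen only to illustrate the split, and the function names are illustrative.

```python
from itertools import permutations

# Operating conditions from Table 4: id -> (load in hp, speed in rpm).
CONDITIONS = {0: (0, 1797), 1: (1, 1772), 2: (2, 1750), 3: (3, 1730)}

def transfer_tasks(condition_ids):
    """All ordered source-target pairs, written as 'source-target'."""
    return [f"{s}-{t}" for s, t in permutations(condition_ids, 2)]

def split_counts(n_samples, train_fraction=0.75):
    """Number of training and test samples for a given split ratio."""
    n_train = int(n_samples * train_fraction)
    return n_train, n_samples - n_train

tasks = transfer_tasks(sorted(CONDITIONS))
print(tasks)
print(split_counts(400))  # 400 is a hypothetical sample count
```

Task "3-0", for instance, trains on labeled data from Condition 3 (3 hp, 1730 rpm) and is evaluated on Condition 0 (0 hp, 1797 rpm).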

4.1.2. Image Fusion Experiment

This paper proposes multiplying the MTF, CWT, and RP images by specific coefficients and using them as the R, G, and B channels for image fusion. To determine which image should feed each channel and which coefficients preserve the most bearing fault features, we conducted three sets of experiments, each building on the result of the previous one. The configurations are named as follows: when MTF is used as the R channel input, CWT as the G channel input, and RP as the B channel input, with the coefficients [0.2, 1, 0.2], the configuration is named 0.2MTF-CWT-0.2RP. The experimental results are presented in Figure 8 and Table 5.

The experiments are divided into three groups:

First group: Each of the three images was used in turn as the G channel input with its coefficient set to 1 for emphasis, to compare which image processing method had the most significant impact on the results. The experimental results indicated that emphasizing the CWT image with a coefficient of 1 yielded the highest fault diagnosis accuracy.

Second group: This experiment explored which channel input for CWT achieved the highest accuracy. The results showed that when CWT was used as the G channel input, the fault diagnosis accuracy was the highest.

Third group: With CWT set as the G channel input, this experiment investigated how changes in the R and B channel images or their coefficients affected fault diagnosis performance. The results demonstrated that the combination of 0.2MTF-CWT-0.2RP yielded the best fault diagnosis performance and contained the most comprehensive image information.

Therefore, this paper proposes using 0.2MTF-CWT-0.2RP as the input for the MCR-KAResNet-TLDAF method.
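The naming convention and the channel fusion can be sketched as follows. The sketch assumes that each grayscale image is a 2-D array of equal size and that a channel without a prefix has a coefficient of 1; the helper names and the tiny 2 x 2 images are illustrative and not from the paper.

```python
import re

def parse_config(name):
    """Split a name such as '0.2MTF-CWT-0.2RP' into (coefficient, image) pairs."""
    channels = []
    for part in name.split("-"):
        match = re.fullmatch(r"(\d*\.?\d+)?([A-Z]+)", part)
        coeff = float(match.group(1)) if match.group(1) else 1.0
        channels.append((coeff, match.group(2)))
    return channels  # order: R, G, B

def fuse(images, name):
    """Scale each grayscale image and stack the three as (R, G, B) per pixel."""
    (cr, r), (cg, g), (cb, b) = parse_config(name)
    height, width = len(images[r]), len(images[r][0])
    return [[(cr * images[r][i][j], cg * images[g][i][j], cb * images[b][i][j])
             for j in range(width)] for i in range(height)]

# Toy 2 x 2 grayscale images standing in for the MTF, CWT, and RP outputs.
images = {
    "MTF": [[100, 200], [50, 0]],
    "CWT": [[10, 20], [30, 40]],
    "RP": [[255, 0], [0, 255]],
}
fused = fuse(images, "0.2MTF-CWT-0.2RP")
print(parse_config("0.2MTF-CWT-0.2RP"))
print(fused[0][0])
```

With coefficients no greater than 1, the fused pixel values stay within the range of the input images, so no clipping step is needed in this sketch.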

4.1.3. Image Comparison Experiments

To verify the feasibility of the MCR image processing method, this paper conducts comparative experiments using MTF, CWT, and RP images. To ensure the rigor of the experiments, KAResNet is selected as the backbone model, and the algorithm integrating MK-MMD and CDAN+E is employed as the domain adaptation method across all comparisons.

Table 6 shows that the MCR-KAResNet-TLDAF method achieved the highest average accuracy. Its average accuracy is 21.54 percentage points higher than that of the RP-KAResNet-TLDAF method, 43.64 points higher than that of the MTF-KAResNet-TLDAF method, and 1.916 points higher than that of the CWT-KAResNet-TLDAF method. The MCR-KAResNet-TLDAF method surpassed the RP-KAResNet-TLDAF and MTF-KAResNet-TLDAF methods across all transfer tasks. Although its accuracy was lower than that of the CWT-KAResNet-TLDAF method in the 0-2 and 2-0 transfer tasks, it achieved improvements of 14.8 and 10.67 percentage points in the 3-0 and 3-1 transfer tasks, respectively. In these two tasks, it broke through the bottleneck where the accuracy of the other three methods remained below 90%.
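The comparison with the CWT-only input can be reproduced from the per-task values in Table 6. The short sketch below recomputes the averages and the per-task differences. Because the averages in Table 6 are rounded, a difference recomputed from the per-task values can deviate slightly in the last digits from one computed from the rounded averages.

```python
TASKS = ["0-1", "0-2", "0-3", "1-0", "1-2", "1-3",
         "2-0", "2-1", "2-3", "3-0", "3-1", "3-2"]

# Per-task accuracies (%) copied from Table 6.
MCR = [100, 99.87, 100, 100, 100, 100, 97.6, 100, 100, 94.8, 100, 100]
CWT = [100, 100, 100, 100, 100, 100, 100, 100, 100, 80, 89.333, 100]

def mean(values):
    """Arithmetic mean over the twelve transfer tasks."""
    return sum(values) / len(values)

def gains(ours, baseline):
    """Per-task difference in percentage points (ours minus baseline)."""
    return {t: round(a - b, 3) for t, a, b in zip(TASKS, ours, baseline)}

task_gains = gains(MCR, CWT)
print(round(mean(MCR), 2), round(mean(CWT), 3))
print({t: g for t, g in task_gains.items() if g != 0})
```

The non-zero entries show the pattern described in the text: small losses in tasks 0-2 and 2-0 and large gains in tasks 3-0 and 3-1.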

Compared to the other three image processing methods, MCR provides a richer depiction of signal features, offering a more comprehensive feature base for the subsequent model to recognize and diagnose faults. This comprehensive approach ensures that the model can leverage a broader spectrum of diagnostic information, leading to improved fault identification performance.

4.1.4. Comparison of Domain Adaptation Algorithms

The choice of domain adaptation method significantly impacts the model’s fault diagnosis results. This paper validates the reliability of the proposed integrated domain adaptation method by comparing different domain adaptation strategies. The method using only the MK-MMD domain adaptation algorithm is referred to as Method A, and the method using only the CDAN+E algorithm is referred to as Method B. These two methods are compared with the proposed method, which integrates both the MK-MMD and CDAN+E domain adaptation algorithms.

To ensure the rigor of the experiments, all three methods use images processed using the proposed MCR method as input. The feature extraction model for all methods is the proposed KAResNet model. The results of the domain adaptation algorithm experiments are shown in Table 7 and Figure 9.

The experimental results show that the MCR-KAResNet-TLDAF method achieved the highest average accuracy among the three methods. MCR-KAResNet-TLDAF attained the highest accuracy in most transfer tasks, particularly in transfer task 3-0, where Method A and Method B struggled to achieve a high accuracy. In the 3-0 transfer task, the MCR-KAResNet-TLDAF method improved the accuracy by 8.13 and 14.8 percentage points compared to Method A and Method B, respectively. This indicates that the integrated MK-MMD and CDAN+E domain adaptation method performs better in complex transfer tasks.

4.1.5. Comparative Experiments with Different Models

In this paper, several widely used bearing fault diagnosis models were selected for comparison with the proposed model, including ResNet18 [28], Vgg16 [29], and a standard 2D-CNN. The 2D-CNN is a standard convolutional neural network architecture consisting of two convolutional modules. Each module consists of a convolutional layer, a batch normalization layer, a ReLU activation function, and a max-pooling layer.
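To make the structure of the 2D-CNN baseline concrete, the sketch below traces how the feature-map size changes through two convolutional modules. The text gives only the layer types; the kernel sizes, padding, pooling size, channel counts, and the 64 x 64 input used here are hypothetical values chosen for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_cnn(size, modules):
    """Feature-map shape (channels, height, width) after each module."""
    shapes = []
    for m in modules:
        # Convolution; batch normalization and ReLU do not change the shape.
        size = conv_out(size, m["kernel"], 1, m["padding"])
        # Max pooling with stride equal to the pooling window.
        size = conv_out(size, m["pool"], m["pool"], 0)
        shapes.append((m["out_channels"], size, size))
    return shapes

# Hypothetical hyperparameters: not taken from the paper.
MODULES = [
    {"out_channels": 16, "kernel": 3, "padding": 1, "pool": 2},
    {"out_channels": 32, "kernel": 3, "padding": 1, "pool": 2},
]
print(trace_cnn(64, MODULES))
```

Under these assumed settings each module halves the spatial resolution, which shows how shallow this baseline is compared with the residual stages listed in Table 1.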

Based on the basic architectures of these models, we designed the following experiments to evaluate their performance in bearing fault diagnosis tasks. In this experiment, MCR images were used as the input for all models. The domain adaptation algorithm used was the integrated MK-MMD and CDAN+E method.

The accuracy of the models is shown in Table 8 and Figure 10. The experimental results show that the KAResNet model outperforms the three common fault diagnosis models. The average accuracy of MCR-KAResNet-TLDAF is 4.26 percentage points higher than that of the MCR-ResNet18-TLDAF method, 4.55 points higher than that of the MCR-2D-CNN-TLDAF method, and 4.7375 points higher than that of the MCR-Vgg16-TLDAF method. In more complex transfer tasks, such as 0-2, 0-3, 3-0, and 3-1, the KAResNet model surpasses the accuracy limitations of conventional models, achieving higher accuracy rates. Notably, in transfer tasks 0-3 and 3-1, the KAResNet model achieved a 100% accuracy, demonstrating its superior performance in these scenarios. This indicates that the KAResNet model excels at extracting fault features from bearings and can more effectively integrate with domain adaptation algorithms to achieve better alignment between the source and target domains.

4.1.6. Ablation Study

To gain a deeper understanding of the impact of each module on the overall performance of the proposed model, ablation experiments were conducted on the CWRU dataset. In these experiments, Method 8 represents the complete proposed method, while Method 1 serves as the baseline method.

In Table 9, the “Image Processing” module uses ✓ to indicate the selection of 0.2MTF-CWT-0.2RP as the image input and × to indicate the selection of CWT as the image input. In the “Model Selection” module, ✓ indicates the use of the proposed KAResNet, while × indicates that the KABlock is removed and replaced with two regular fully connected layers in a standard residual network. In the “Domain Adaptation” module, ✓ indicates the use of the TLDAF module, while × indicates the absence of any domain adaptation algorithm.

The experimental results (Table 9) show that Method 2, Method 3, and Method 4, which each add one module to the baseline method (Method 1), all improve the average accuracy. Notably, adding the domain adaptation algorithm module (Method 4) results in a significant improvement in the accuracy. Methods 5, 6, and 7 add two modules compared to Method 1, and each shows an improvement in the accuracy. Method 8, which is the complete proposed method, includes all three modules. The results show that the combination of all three modules achieves the highest accuracy in bearing fault diagnosis, validating the feasibility and superiority of the proposed MCR-KAResNet-TLDAF model.
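The eight ablation configurations correspond to all on/off combinations of the three modules. A minimal sketch of this grid is given below. The ordering by number of enabled modules mirrors the grouping used in the text (baseline, one module, two modules, all three), but it is not guaranteed to match the method numbering within each group in Table 9.

```python
from itertools import product

MODULE_NAMES = ("MCR", "KAResNet", "TLDAF")

def ablation_grid():
    """All on/off combinations of the three modules, fewest modules first."""
    grid = []
    for flags in product((False, True), repeat=len(MODULE_NAMES)):
        enabled = tuple(m for m, on in zip(MODULE_NAMES, flags) if on)
        grid.append(enabled)
    return sorted(grid, key=len)

grid = ablation_grid()
for config in grid:
    print(config if config else "baseline (no proposed module)")
```

The first entry corresponds to the baseline (Method 1) and the last entry to the complete method (Method 8).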

4.2. Case 2

4.2.1. Jiangnan University Dataset Description

In this paper, the Jiangnan University (JNU) dataset is selected as Case 2 for evaluating and validating the MCR-KAResNet-TLDAF method. The test rig setup is shown in Figure 11, and the bearings used in the experiments are the N205 and NU205 models. Acceleration sensors were used to collect vibration signals from the bearings at different positions on the motor. The vibration signal of the bearings was selected for this study. The details of the bearings are shown in Table 10.

There are four health states for the bearings: healthy, outer ring failure, roller failure, and inner ring failure. The first three health states are from the N205 bearing, while the fourth is from the NU205 bearing [30]. The four health states in the JNU dataset are detailed in Table 11. In addition, the time series waveforms of the bearings under different health states are shown in Figure 12.

The vibration signals in the JNU dataset are collected using an accelerometer (PCB MA352A60) at a sampling frequency of 50 kHz. The accelerometer has a bandwidth of 5 Hz to 60 kHz and an output of 10 mV/g. The dataset is divided into three operating conditions based on three different speeds, as shown in Table 12. The number of sampling points per sample for the JNU dataset is calculated using Equation (11) and is finally set to 5000. The training and test sets are divided in the same way as for the CWRU dataset. The number of iterations for the JNU dataset is set to 150. The remaining experimental details are consistent with those of the CWRU dataset.
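To relate the sample length to the shaft rotation, the sketch below computes the number of sampling points per revolution at 50 kHz for the speeds of 600, 800, and 1000 rpm quoted for Conditions 0 to 2 later in this section, and cuts a signal into samples of 5000 points. The non-overlapping segmentation and the function names are assumptions; this is not a reproduction of Equation (11).

```python
def points_per_revolution(sampling_hz, speed_rpm):
    """Number of sampling points recorded during one shaft revolution."""
    return sampling_hz * 60.0 / speed_rpm

def segment(signal, length):
    """Cut a 1-D signal into non-overlapping windows, dropping the remainder."""
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, length)]

SAMPLING_HZ = 50_000
for rpm in (600, 800, 1000):
    print(rpm, points_per_revolution(SAMPLING_HZ, rpm))

# A placeholder signal of 12,000 points yields two samples of 5000 points.
windows = segment(list(range(12_000)), 5000)
print(len(windows), len(windows[0]))
```

Under these values, a sample of 5000 points spans at least one full revolution at every speed, so each sample can contain at least one complete fault period of the shaft rotation.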

4.2.2. Image Fusion Experiment

On the JNU dataset, we conducted three sets of experiments to compare the impact of different model inputs on fault diagnosis performance. The experimental results are shown in Figure 13 and Table 13.

First group: When using MTF and RP as the highlighted modules, the fault diagnosis performance was significantly lower than with CWT images. This may be because CWT better captures the time–frequency characteristics of the signal, providing a greater advantage in handling bearing signals.

Second group: Placing CWT in the green channel resulted in the highest accuracy. This could be because the green channel makes the fault features more prominent.

Third group: Using 0.1MTF-CWT-0.1RP as the input led to a decrease in the average fault diagnosis accuracy. This might be because the coefficients for the MTF and RP channels were set too low, which weakened the image features excessively. When using MTF-CWT-RP as the input, the average fault diagnosis accuracy decreased by 2.499%. This could be because the lack of a primary focus among the three channels led to important features being overshadowed. When the order of RP and MTF was swapped to use 0.2RP-CWT-0.2MTF as the model input, the average accuracy also decreased compared to the configuration proposed in this paper.

These three sets of experiments gradually verified the feasibility of setting the parameters to 0.2MTF-CWT-0.2RP.

4.2.3. Image Comparison Experiments

The comparative image experiments are shown in Table 14. The results indicate that the proposed MCR-KAResNet-TLDAF method achieved a higher average accuracy than the other three individual image processing methods.

Although the MCR-KAResNet-TLDAF method did not achieve the highest accuracy in the 0-1 transfer task, it outperformed the other methods in the remaining five transfer tasks. The MCR-KAResNet-TLDAF method achieved an accuracy of over 99.5% across all the transfer tasks, a benchmark that the other three methods could not reach. It improved by 16.449% compared to the RP-KAResNet-TLDAF method, by 44.109% compared to the MTF-KAResNet-TLDAF method, and by 3.111% compared to the CWT-KAResNet-TLDAF method. Notably, in the 2-0 transfer task, the MCR-KAResNet-TLDAF method achieved an accuracy of 99.667%, breaking the bottleneck where the other three methods had accuracies below 87%. These results demonstrate the significant advantages of the MCR image processing method on the JNU dataset.

4.2.4. Comparison of Domain Adaptation Algorithms

The proposed method, which integrates two domain adaptation algorithms, achieved the highest average accuracy and attained the highest accuracy across all the transfer tasks, as shown in Table 15 and Figure 14. Additionally, it was observed that all three domain adaptation methods achieved accuracies exceeding 99% on the JNU dataset. This may be related to the strong feature extraction ability of the backbone network KAResNet. On the relatively simple JNU dataset, a high accuracy can be achieved using various domain adaptation methods.

In the 2-0 transfer task, we observed a significant decline in the accuracy of all the methods. This was primarily due to the considerable speed difference from 1000 rpm to 600 rpm. Specifically, in Condition 2 (1000 rpm), the bearing speed was higher, which likely introduced more high-frequency noise and complex features in the signal. Conversely, in Condition 0 (600 rpm), these features may have been less prominent or masked. Therefore, in the 2-0 transfer task, the features learned by the model in the source domain (1000 rpm) did not generalize effectively to the target domain (600 rpm), leading to a drop in the accuracy.

In contrast, the accuracy in the 0-2 transfer task was relatively higher. This might be because at 600 rpm, the signal noise was lower, making the features more distinct and clear, which allowed the model to learn these features more effectively. Additionally, the features between the source domain (600 rpm) and the target domain (1000 rpm) were easier to align, resulting in better performance in the 0-2 transfer task.

In the 1-0 transfer task, we found that all the methods achieved a 100% accuracy. This is mainly due to the high similarity in distribution between the source domain (800 rpm) and the target domain (600 rpm). The data features and noise distribution levels were similar, and the fault features learned by the model at 800 rpm could adapt well to the 600 rpm condition, enabling the model to generalize effectively in the target domain.

4.2.5. Comparative Experiments with Different Models

The results of the model comparison experiments on the JNU dataset are presented in Table 16 and Figure 15. KAResNet achieved a fault diagnosis accuracy of 100% across multiple operating conditions. There was a slight decrease in accuracy in transfer tasks 0-1 and 2-0. Specifically, in transfer task 0-1, the speed changed from 600 rpm in condition 0 to 800 rpm in condition 1, which is a small range of speed variation. Due to the high complexity of the MCR-KAResNet-TLDAF model, its performance on the test data may be inferior to the relatively simpler MCR-2D-CNN-TLDAF in this small-range speed variation task. In transfer task 2-0, the accuracy of MCR-KAResNet-TLDAF declines. One possible reason is that although KABlock provides a high smoothness, handling large speed variations may affect its adaptability to data of different speeds, leading to inferior performance in the 2-0 task compared to MCR-2D-CNN-TLDAF. Although its accuracy declined in the 0-1 and 2-0 transfer tasks, where it was surpassed by the other models, KAResNet still maintained the highest overall fault diagnosis accuracy. This fully demonstrates the powerful capability of KAResNet in handling complex transfer tasks.

The Vgg16 model’s performance on the JNU dataset is noticeably inferior to the other models, especially in the 2-0 and 2-1 transfer tasks, with accuracies of 70.333% and 71%, respectively. This could be due to its fixed convolutional and fully connected layer structure, which lacks the flexibility needed for complex bearing fault signals and struggles with effective domain alignment and feature adaptation, despite incorporating domain adaptation methods. Additionally, the high-speed Condition 2 (1000 rpm) introduces more high-frequency noise and complex features, interfering with Vgg16’s ability to extract accurate fault information, thereby impacting its performance in the target domain. These insights highlight the necessity of using more adaptable and robust models, like KAResNet, for effectively managing the intricate dynamics present in bearing fault diagnosis tasks under varying operational conditions.

4.2.6. Ablation Study

Ablation experiments were conducted on the JNU dataset, and the results are shown in Table 17. It can be observed that by adding a single module, both Method 2 and Method 4 significantly improved the accuracy, and Method 3 also showed some improvement. Methods 6 and 7 indicate that the overall performance is further enhanced when different modules are used in combination. Method 5 shows a noticeable improvement over Method 1, but a decline compared to Method 4, possibly due to increased model complexity and interactions between modules.

Although Methods 6 and 8 achieved a similar accuracy, Method 8 exhibited a more stable performance across various testing conditions, without significant degradation due to changes in specific conditions. This demonstrates that on the JNU dataset, the synergistic effects of multiple modules in Method 8 not only improved model performance but also enhanced stability and generalization ability. These results validate the effectiveness and superiority of the proposed method in bearing fault diagnosis.

5. Conclusions

This paper proposes a cross-condition bearing diagnosis method based on image fusion and a residual network incorporating the Kolmogorov–Arnold representation theorem. Specifically, MTF, CWT, and RP images are fused to extract bearing fault information from different perspectives. The residual network, optimized using the Kolmogorov–Arnold representation theorem, is employed to more efficiently extract bearing fault features. By integrating MK-MMD and CDAN+E domain adaptation algorithms, the method better aligns the source and target domains to achieve more accurate cross-condition fault diagnosis.

This method was validated on both the CWRU and JNU datasets. Comparisons with ResNet18, Vgg16, and a 2D-CNN model demonstrated the reliability of the MCR-KAResNet-TLDAF method. The MCR-KAResNet-TLDAF method achieved an average accuracy of 99.36% on the CWRU dataset and 99.889% on the JNU dataset. The experimental results confirm the reliability of the model, indicating that the proposed method more effectively accomplishes cross-condition bearing fault diagnosis [31].

Author Contributions

Conceptualization, Z.T., X.H. and X.W.; methodology, Z.T. and X.H.; software, Z.T., X.H. and J.Z.; validation, Z.T. and X.H.; formal analysis, Z.T.; investigation, Z.T., X.H. and X.W.; resources, Z.T. and X.W.; data curation, Z.T. and J.Z.; writing—original draft, Z.T.; funding acquisition, Z.T. and X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the key project of the National University Innovation and Entrepreneurship Training Programs Foundation under grant number 202210060002, and the Tianjin University Innovation and Entrepreneurship Training Foundation under project number 202410060027.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are included in this article. For detailed information, please contact the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhao, W.; Lv, Y.; Liu, J.; Lee, C.K.; Tu, L. Early fault diagnosis based on reinforcement learning optimized-svm model with vibration-monitored signals. Qual. Eng. 2023, 35, 696–711.
2. Yan, J.; Kan, J.; Luo, H. Rolling bearing fault diagnosis based on markov transition field and residual network. Sensors 2022, 22, 3936.
3. Tang, H.; Tang, Y.; Su, Y.; Feng, W.; Wang, B.; Chen, P.; Zuo, D. Feature extraction of multi-sensors for early bearing fault diagnosis using deep learning based on minimum unscented kalman filter. Eng. Appl. Artif. Intell. 2024, 127, 107138.
4. Xie, S.; Li, Y.; Tan, H.; Liu, R.; Zhang, F. Multi-scale and multi-layer perceptron hybrid method for bearings fault diagnosis. Int. J. Mech. Sci. 2022, 235, 107708.
5. Zhang, Q.; Deng, L. An intelligent fault diagnosis method of rolling bearings based on short-time fourier transform and convolutional neural network. J. Fail. Anal. Prev. 2023, 23, 795–811.
6. Kumar, H.; Upadhyaya, G. Fault diagnosis of rolling element bearing using continuous wavelet transform and k-nearest neighbour. Mater. Today Proc. 2023, 92, 56–60.
7. Xu, Y.; Li, Z.; Wang, S.; Li, W.; Sarkodie-Gyan, T.; Feng, S. A hybrid deep-learning model for fault diagnosis of rolling bearings. Measurement 2021, 169, 108502.
8. Li, J.; Liu, Y.; Li, Q. Intelligent fault diagnosis of rolling bearings under imbalanced data conditions using attention-based deep learning method. Measurement 2022, 189, 110500.
9. Ma, S.; Chu, F. Ensemble deep learning-based fault diagnosis of rotor bearing systems. Comput. Ind. 2019, 105, 143–152.
10. Shen, S.; Lu, H.; Sadoughi, M.; Hu, C.; Nemani, V.; Thelen, A.; Webster, K.; Darr, M.; Sidon, J.; Kenny, S. A physics-informed deep learning approach for bearing fault detection. Eng. Appl. Artif. Intell. 2021, 103, 104295.
11. Yu, X.; Yin, H.; Sun, L.; Dong, F.; Yu, K.; Feng, K.; Zhang, Y.; Yu, W. A new cross-domain bearing fault diagnosis framework based on transferable features and manifold embedded discriminative distribution adaption under class imbalance. IEEE Sens. J. 2023, 23, 7525–7545.
12. Chen, P.; Zhao, R.; He, T.; Wei, K.; Yang, Q. Unsupervised domain adaptation of bearing fault diagnosis based on join sliced wasserstein distance. ISA Trans. 2022, 129, 504–519.
13. Li, X.; Yuan, P.; Su, K.; Li, D.; Xie, Z.; Kong, X. Innovative integration of multi-scale residual networks and mk-mmd for enhanced feature representation in fault diagnosis. Meas. Sci. Technol. 2024, 35, 086108.
14. Xu, F.; Hong, D.; Tian, Y.; Wei, N.; Wu, J. Unsupervised deep transfer learning method for rolling bearing fault diagnosis based on improved convolutional neural network. J. Phys. Conf. Ser. 2024, 2694, 012050.
15. Wu, H.; Li, J.; Zhang, Q.; Tao, J.; Meng, Z. Intelligent fault diagnosis of rolling bearings under varying operating conditions based on domain-adversarial neural network and attention mechanism. ISA Trans. 2022, 130, 477–489.
16. Wu, C.-g.; Zhao, D.; Han, T.; Xia, Y. Bearing fault diagnosis using gradual conditional domain adversarial network. Appl. Soft Comput. 2024, 158, 111580.
17. Zhao, Z.; Zhang, Q.; Yu, X.; Sun, C.; Wang, S.; Yan, R.; Chen, X. Applications of unsupervised deep transfer learning to intelligent fault diagnosis: A survey and comparative study. IEEE Trans. Instrum. Meas. 2021, 70, 1–28.
18. Chen, X.; Yang, R.; Xue, Y.; Huang, M.; Ferrero, R.; Wang, Z. Deep transfer learning for bearing fault diagnosis: A systematic review since 2016. IEEE Trans. Instrum. Meas. 2023, 72, 1–21.
19. Kuang, J.; Xu, G.; Tao, T.; Wu, Q.; Han, C.; Wei, F. Domain conditioned joint adaptation network for intelligent bearing fault diagnosis across different positions and machines. IEEE Sens. J. 2023, 23, 4000–4010.
20. Gu, J.; Peng, Y.; Lu, H.; Chang, X.; Chen, G. A novel fault diagnosis method of rotating machinery via vmd, cwt and improved cnn. Measurement 2022, 200, 111635.
21. Wang, Z.; Oates, T. Imaging time-series to improve classification and imputation. arXiv 2015, arXiv:1506.00327.
22. Wang, D.-F.; Guo, Y.; Wu, X.; Na, J.; Litak, G. Planetary-gearbox fault classification by convolutional neural network and recurrence plot. Appl. Sci. 2020, 10, 932.
23. Schmidt-Hieber, J. The kolmogorov–arnold representation theorem revisited. Neural Netw. 2021, 137, 119–126.
24. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. Kan: Kolmogorov-arnold networks. arXiv 2024, arXiv:2404.19756.
25. Li, J.; Ye, Z.; Gao, J.; Meng, Z.; Tong, K.; Yu, S. Fault transfer diagnosis of rolling bearings across different devices via multi-domain information fusion and multi-kernel maximum mean discrepancy. Appl. Soft Comput. 2024, 159, 111620.
26. Long, M.; Cao, Z.; Wang, J.; Jordan, M.I. Conditional adversarial domain adaptation. In Proceedings of the Advances in Neural Information Processing Systems 31, Montréal, QC, Canada, 3–8 December 2018.
27. Bearings Data Center, Seeded Fault Test Data, Case Western Reserve University. Available online: https://engineering.case.edu/bearingdatacenter/download-data-file (accessed on 19 February 2024).
28. Ullah, A.; Elahi, H.; Sun, Z.; Khatoon, A.; Ahmad, I. Comparative analysis of alexnet, resnet18 and squeezenet with diverse modification and arduous implementation. Arab. J. Sci. Eng. 2022, 47, 2397–2417.
29. Kaya, Y.; Kuncan, F.; Ertunç, H.M. A new automatic bearing fault size diagnosis using time-frequency images of cwt and deep transfer learning methods. Turk. J. Electr. Eng. Comput. Sci. 2022, 30, 1851–1867.
30. Li, K.; Ping, X.; Wang, H.; Chen, P.; Cao, Y. Sequential fuzzy diagnosis method for motor roller bearing in variable operating conditions based on vibration analysis. Sensors 2013, 13, 8013–8041.
31. Xu, F.; Ding, N.; Li, N.; Liu, L.; Hou, N.; Xu, N.; Guo, W.; Tian, L.; Xu, H.; Wu, C.-M.L.; et al. A review of bearing failure modes, mechanisms and causes. Eng. Fail. Anal. 2023, 152, 107518.

Figure 1. Domain adaptation.

Figure 2. The proposed MCR-KAResNet-TLDAF framework.

Figure 3. MCR image fusion process.

Figure 4. KAResNet framework.

Figure 5. Principles of domain adaptation algorithms incorporating MK-MMD and CDAN+E.

Figure 6. CWRU dataset experimental platform.

Figure 7. Time series waveforms of CWRU dataset bearings under different health conditions.

Figure 8. Results of the image fusion experiments on the CWRU dataset: (a) first group; (b) second group; (c) third group.

Figure 9. The average accuracy of different domain adaptation algorithms on the CWRU dataset.

Figure 10. Average accuracy of fault diagnosis for different models on the CWRU dataset.

Figure 11. JNU dataset experimental platform.

Figure 12. Time series waveforms of JNU dataset bearings under different health conditions.

Figure 13. Results of the image fusion experiments on the JNU dataset: (a) first group; (b) second group; (c) third group.

Figure 14. Average accuracy of different domain adaptation algorithms on the JNU dataset.

Figure 15. Average accuracy of fault diagnosis for different models on the JNU dataset.

Table 1. KAResNet parameters.

Layers	Type	Out_Channels	Kernel_Size	Padding
Conv1	Conv2d	64	7	3
Maxpool	MaxPool2d	64	3	1
H1	Conv2d	64	3	1
	Conv2d	64	3	1
	Conv2d	64	3	1
	Conv2d	64	3	1
H2	Conv2d	128	3	1
	Conv2d	128	3	1
	Conv2d	128	1	1
	Conv2d	128	3	1
	Conv2d	128	3	1
H3	Conv2d	256	3	1
	Conv2d	256	3	1
	Conv2d	256	1	1
	Conv2d	256	3	1
	Conv2d	256	3	1
Avgpool	AdaptiveAvgPool2d	1	/	/
Fc	KABlock	/	/	/
Output	KABlock	/	/	/

Table 2. Detailed description of bearing models in the CWRU dataset.

Item	Description
Bearing model	SKF 6205-2RS JEM
Outer diameter (mm)	52
Inner diameter (mm)	25
Width (mm)	15
Number of rolling elements	8
Diameter of rolling elements (mm)	7.94
Cage material	Synthetic resin cage
Sealing type	Double seal (2RS, indicating both sides of the bearing have rubber seals)
Operating conditions	Performance of the bearing under various fault simulations and operating conditions on a test rig

Table 3. CWRU dataset fault condition information.

Class Label	0	1	2	3	4	5	6	7	8	9
Fault size (mils)	0	7	7	7	14	14	14	21	21	21
Fault location	Normal	Inner race	Roller	Outer race	Inner race	Roller	Outer race	Inner race	Roller	Outer race
Abbreviation	Normal	IR07	R07	OR07	IR14	R14	OR14	IR21	R21	OR21

Table 4. CWRU dataset operating conditions.

Task	0	1	2	3
Load (HP)	0 HP	1 HP	2 HP	3 HP
Speed (rpm)	1797	1772	1750	1730

Table 5. Experimental results of the image fusion method on the CWRU dataset.

Group	Methods	0-1	0-2	0-3	1-0	1-2	1-3	2-0	2-1	2-3	3-0	3-1	3-2	Average
First group	0.2RP-MTF-0.2CWT	99.33	99.6	95.73	99.6	100	99.87	99.47	99.6	99.87	96	97.07	100	98.84
	0.2CWT-RP-0.2MTF	84.53	87.33	81.73	98.4	98.93	99.6	99.07	98.93	99.73	94.53	95.73	99.73	94.86
	0.2MTF-CWT-0.2RP	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36
Second group	0.2MTF-0.2RP-CWT	90	76.53	70	95.73	100	100	90.93	94.27	100	76.27	100	100	91.14
	CWT-0.2MTF-0.2RP	100	100	99.73	100	100	100	95.87	96.4	100	79.6	100	100	97.63
	0.2MTF-CWT-0.2RP	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36
Third group	0.1MTF-CWT-0.1RP	100	99.87	99.87	100	100	100	97.33	99.73	100	95.07	93.33	100	98.77
	MTF-CWT-RP	100	99.73	100	100	100	100	93.87	94	100	93.87	100	100	98.46
	0.2RP-CWT-0.2MTF	100	100	100	100	100	100	93.6	100	100	89.2	100	100	98.57
	0.2MTF-CWT-0.2RP	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36
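
The method names in Table 5 encode how the three grayscale images are weighted and assigned to the colour channels. As a minimal sketch, the selected scheme 0.2MTF-CWT-0.2RP could be read as R = 0.2 × MTF, G = 1.0 × CWT, B = 0.2 × RP. That channel order, the [0, 1] intensity range, and the function name `fuse_channels` are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np


def fuse_channels(mtf, cwt, rp, weights=(0.2, 1.0, 0.2)):
    """Stack three weighted grayscale images into one H x W x 3 array.

    Assumed convention: the images go to the R, G and B channels in the
    order (MTF, CWT, RP), each scaled by its coefficient.
    """
    channels = []
    for image, weight in zip((mtf, cwt, rp), weights):
        image = np.asarray(image, dtype=np.float64)
        if image.ndim != 2:
            raise ValueError("each input must be a 2-D grayscale image")
        channels.append(weight * image)
    if len({channel.shape for channel in channels}) != 1:
        raise ValueError("all inputs must have the same height and width")
    return np.stack(channels, axis=-1)


# Toy inputs standing in for real MTF, CWT and RP images scaled to [0, 1].
rng = np.random.default_rng(0)
mtf_img, cwt_img, rp_img = (rng.random((64, 64)) for _ in range(3))
fused = fuse_channels(mtf_img, cwt_img, rp_img)
print(fused.shape)
```

Other rows of the table correspond to different `weights` or a different channel order, for example `weights=(1.0, 1.0, 1.0)` for the unweighted MTF-CWT-RP variant.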

Table 6. Experimental results of the image comparison experiment on the CWRU dataset.

Methods	0-1	0-2	0-3	1-0	1-2	1-3	2-0	2-1	2-3	3-0	3-1	3-2	Average
MCR-KAResNet-TLDAF	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36
RP-KAResNet-TLDAF	66.8	66.67	67.33	81.81	84.27	81.73	81.14	79.47	85.33	78.48	77.87	82.93	77.82
MTF-KAResNet-TLDAF	52.13	58.8	51.47	61.33	60.27	53.47	55.33	61.2	55.73	48.67	52.53	57.73	55.72
CWT-KAResNet-TLDAF	100	100	100	100	100	100	100	100	100	80	89.333	100	97.444

Table 7. Experimental results of different domain adaptation algorithms on the CWRU dataset.

Methods	0-1	0-2	0-3	1-0	1-2	1-3	2-0	2-1	2-3	3-0	3-1	3-2	Average
Method A	100	99.87	99.73	100	100	100	98	100	100	86.67	100	100	98.69
Method B	100	93.07	80.53	100	99.87	94.4	93.07	90.67	100	80	86.13	99.6	93.11
MCR-KAResNet-TLDAF	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36

Table 8. Comparative experimental results of different models on the CWRU dataset.

Methods	0-1	0-2	0-3	1-0	1-2	1-3	2-0	2-1	2-3	3-0	3-1	3-2	Average
MCR-Vgg16-TLDAF	99.73	94.27	82.53	99.47	97.87	96.67	89.33	99.87	100	91.33	87.6	96.8	94.6225
MCR-ResNet18-TLDAF	100	88	89.6	100	100	100	98.13	97.6	99.87	78.4	89.6	100	95.1
MCR-2D-CNN-TLDAF	100	89.87	75.87	100	100	100	90.4	100	100	89.6	92	100	94.81
MCR-KAResNet-TLDAF	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36
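
The Average column appears to be the arithmetic mean of the 12 per-task accuracies; this is inferred from the numbers rather than stated in the tables. The short check below reproduces two of the reported averages from Table 8. The helper name `row_average` is illustrative.

```python
def row_average(accuracies, ndigits=2):
    """Arithmetic mean of per-task accuracies (%), rounded for display."""
    return round(sum(accuracies) / len(accuracies), ndigits)


# Per-task accuracies copied from Table 8 (task order 0-1, 0-2, ..., 3-2).
proposed = [100, 99.87, 100, 100, 100, 100, 97.6, 100, 100, 94.8, 100, 100]
vgg16 = [99.73, 94.27, 82.53, 99.47, 97.87, 96.67,
         89.33, 99.87, 100, 91.33, 87.6, 96.8]

print(row_average(proposed))  # 99.36, as reported for MCR-KAResNet-TLDAF
print(row_average(vgg16))     # 94.62, reported unrounded as 94.6225
```

The same check shows that the lowest per-task accuracy of the proposed method is 94.8% on task 3-0, the hardest transfer in most rows of the CWRU tables.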

Table 9. CWRU dataset ablation experiment results.

Method	Image processing (MCR)	Model selection (KAResNet)	Domain adaptation (TLDAF)	0-1	0-2	0-3	1-0	1-2	1-3	2-0	2-1	2-3	3-0	3-1	3-2	Average accuracy
Method 1	×	×	×	90.8	90.8	68.8	84.8	98	92.8	84.8	89.6	94.8	79.6	78.4	90.4	86.97
Method 2	✓	×	×	96.8	90.4	78.8	96.4	96	94	88.4	89.6	92.4	79.6	79.6	84.8	88.9
Method 3	×	✓	×	94.8	82	77.6	82.4	93.6	86.8	90	98	98.4	80	79.2	90	87.73
Method 4	×	×	✓	100	100	100	100	100	100	88.8	100	100	80	100	100	97.4
Method 5	×	✓	✓	100	100	100	100	100	100	100	100	100	80	89.333	100	97.44
Method 6	✓	×	✓	100	100	100	99.867	100	100	96.533	100	100	86.267	93.333	100	97.92
Method 7	✓	✓	×	98.4	87.2	80.4	95.6	98	92.8	89.2	90	99.2	74.4	70.4	93.6	89.1
Method 8	✓	✓	✓	100	99.87	100	100	100	100	97.6	100	100	94.8	100	100	99.36

Table 10. Detailed description of bearing models in the JNU dataset.

Item	N205	NU205
Outer diameter (mm)	52	52
Inner diameter (mm)	25	25
Width (mm)	15	15
Roller diameter (mm)	7	7
Number of rolling elements	10	11
Outer-race defect (width × depth)	0.3 × 0.25 mm early stage	/
Rolling element defect (width × depth)	0.5 × 0.15 mm early stage	/
Inner-race defect (width × depth)	/	0.3 × 0.25 mm early stage

Table 11. JNU dataset fault condition information.

Class Label	0	1	2	3
Fault location	Normal	Inner ring	Outer ring	Roller
Bearing type	N205	NU205	N205	N205

Table 12. JNU dataset operating conditions.

Task	0	1	2
Speed (rpm)	600	800	1000

Table 13. Experimental results of image fusion method on the JNU dataset.

Group	Methods	0-1	0-2	1-0	1-2	2-0	2-1	Average
First group	0.2RP-MTF-0.2CWT	72	88	71.33	83.33	86	83.67	80.72
	0.2CWT-RP-0.2MTF	80.33	87.67	82.67	92.67	82.67	96	87
	0.2MTF-CWT-0.2RP	99.667	100	100	100	99.667	100	99.889
Second group	0.2MTF-0.2RP-CWT	100	100	100	100	83.67	100	97.28
	CWT-0.2MTF-0.2RP	100	98.33	100	100	98.33	100	99.44
	0.2MTF-CWT-0.2RP	99.667	100	100	100	99.667	100	99.889
Third group	0.1MTF-CWT-0.1RP	100	98.67	100	100	97.67	100	99.39
	MTF-CWT-RP	99.33	100	100	100	85.33	99.67	97.39
	0.2RP-CWT-0.2MTF	100	95.67	100	100	99	99.67	99.06
	0.2MTF-CWT-0.2RP	99.667	100	100	100	99.667	100	99.889

Table 14. Experimental results of the image comparison experiment on the JNU dataset.

Methods	0-1	0-2	1-0	1-2	2-0	2-1	Average
MCR-KAResNet-TLDAF	99.667	100	100	100	99.667	100	99.889
RP-KAResNet-TLDAF	88.67	74	92.33	91.67	63	91	83.44
MTF-KAResNet-TLDAF	34	60	49.67	60	73	58	55.78
CWT-KAResNet-TLDAF	100	99.667	98.667	99.667	86.667	96	96.778

Table 15. Experimental results of different domain adaptation algorithms on the JNU dataset.

Methods	0-1	0-2	1-0	1-2	2-0	2-1	Average
Method A	99.667	100	100	100	99	100	99.778
Method B	99.33	98.67	100	99.67	97	99.67	99.06
MCR-KAResNet-TLDAF	99.667	100	100	100	99.667	100	99.889

Table 16. Comparative experimental results of different models on the JNU dataset.

Methods	0-1	0-2	1-0	1-2	2-0	2-1	Average
MCR-Vgg16-TLDAF	94.333	92	97.333	97.667	70.333	71	87.111
MCR-ResNet18-TLDAF	100	99.333	100	100	99.333	94.333	98.833
MCR-2D-CNN-TLDAF	99	98.333	100	100	100	99	99.389
MCR-KAResNet-TLDAF	99.667	100	100	100	99.667	100	99.889

Table 17. JNU dataset ablation experiment results.

Method	Image processing (MCR)	Model selection (KAResNet)	Domain adaptation (TLDAF)	0-1	0-2	1-0	1-2	2-0	2-1	Average accuracy
Method 1	×	×	×	95.667	78	89.333	77.667	73	77.333	81.833
Method 2	✓	×	×	83	86	95.667	99.667	97.667	99.667	93.611
Method 3	×	✓	×	82.333	81	91	81	84.667	97.667	86.278
Method 4	×	×	✓	99.667	100	99	100	99.667	96	98.889
Method 5	×	✓	✓	100	99.667	98.667	99.667	86.667	96	96.778
Method 6	✓	×	✓	99.667	100	100	100	99	100	99.778
Method 7	✓	✓	×	99.778	99.889	99.222	99.889	95.111	97.333	98.481
Method 8	✓	✓	✓	99.667	100	100	100	99.667	100	99.889

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
