1. Introduction
With the rapid development of industrial intelligence, intelligent fault diagnosis technology is playing an increasingly important role in maintaining the health of mechanical equipment and ensuring its safe and stable operation [1]. Rolling bearings, as critical components of rotating machinery, often operate in harsh and variable environments, such as at high speed and under heavy loads, making them prone to wear and failure and leading to severe mechanical accidents [2]. To avoid significant economic losses and casualties, research on intelligent fault diagnosis methods for rolling bearings is particularly important [3]. Fault diagnosis is fundamentally a pattern recognition problem, and analyzing and processing vibration signals during the operation of rolling bearings is an effective approach for diagnosing faults in rotating machinery. One of the most crucial aims is to break the “curse of dimensionality” and extract low-dimensional features with high sensitivity [4,5,6].
To enhance the quality of feature extraction and address the severe “curse of dimensionality” at the current stage, manifold learning algorithms have emerged. Classical dimensionality reduction algorithms mainly include Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), and Multidimensional Scaling (MDS); however, these algorithms are primarily linear and may not be suitable for the high-dimensional nonlinear vibration data of rolling bearings [7,8].
In 2000, Joshua B. Tenenbaum and Sam T. Roweis [9,10] proposed two classic nonlinear manifold learning dimensionality reduction algorithms, Isometric Feature Mapping (Isomap) and Locally Linear Embedding (LLE), in Science. Since then, manifold learning algorithms have been studied extensively and have gradually become a research hotspot in the fields of dimensionality reduction and pattern recognition [9,10,11]. Based on their different mathematical assumptions, manifold learning algorithms are divided into two major categories: locally preserving embedding methods and globally preserving embedding methods. Laplacian Eigenmaps (LE) [12], LLE [10], Hessian-based Locally Linear Embedding (HLLE) [13], and Local Tangent Space Alignment (LTSA) [14] are considered locally preserving embedding methods, while Isomap [8], Diffusion Maps (DM) [2], and t-Distributed Stochastic Neighbor Embedding (t-SNE) [15] are regarded as globally preserving embedding methods. However, regardless of a manifold learning algorithm’s mathematical assumptions, the selection of neighboring points remains a bottleneck [6,11].
To address the sensitivity of manifold learning algorithms to neighbor point selection, Zhenyue Zhang et al. conducted research from two perspectives: adaptive neighbor selection and the interaction between manifold curvature and sampling density [16]. They explored methods for constructing nonlinear low-dimensional manifolds from high-dimensional samples, providing directions for subsequent researchers. Chuang Sun et al. approached the problem from the perspective of adaptive neighbors, using kernel sparse representation to select sample neighbors and reconstruct the weights of the neighbor graph for the LLE algorithm [17]. Building on the work in reference [17], Yan Zhang et al. integrated nonnegative matrix factorization with sparsity constraints into the LLE algorithm to jointly minimize the neighborhood reconstruction error on the weight matrix [18]. All of these methods use sparsity constraints to select neighbor points, but they perform only moderately well when the data contain noise points and outliers.
To address this issue, Yunlong Gao et al. proposed a discriminant analysis based on the reliability of local neighborhoods, enhancing the contribution of effective samples in the low-dimensional space and filtering out the interference of outliers, thereby improving dimensionality reduction performance [19]. Jing An et al. introduced an adaptive neighborhood-preserving discriminant projection model [20]; by updating sparse reconstruction coefficients, the adverse effects of noise and outliers on dimensionality reduction were mitigated, enhancing sample clustering. Jiaqi Xue et al. proposed a locally linear embedding method with an adaptive neighbor strategy, preserving more of the original information when embedding high-dimensional data manifolds into low-dimensional space and achieving better clustering results [11]. Various discrimination methods have thus been widely applied in manifold learning models. However, most of these adaptive neighbor strategies and discrimination methods target the locally preserving embedding methods of manifold learning, while their application to globally preserving embedding methods remains limited, especially in unsupervised learning models.
To address the aforementioned issue, incorporating label information into the algorithm through semi-supervised learning can further enhance its clustering capability. Ratthachat Chatpatanasiri et al. proposed a general framework for semi-supervised manifold learning dimensionality reduction, providing research directions for subsequent work [21]. Jing Wang et al. proposed a semi-supervised manifold alignment algorithm that utilizes sample points and their correspondences to construct connections between different manifolds [22]. Zohre Karimi et al. introduced a novel hierarchical spatial semi-supervised metric learning approach, integrating local constraints and information-theoretic nonlocal constraints so that the metric matrix better represents the smoothness assumption across multiple manifolds [23]. Mingxia Chen et al. proposed a robust semi-supervised manifold learning framework, applied to locally preserving embedding algorithms, to eliminate the adverse effects caused by noise points [24]. Ding Li et al. derived an extension of a semi-supervised manifold regularization algorithm for classification tasks, optimizing performance on multi-class problems using weighting strategies [25]. Jun Ma et al. proposed a safe semi-supervised learning framework that uses both manifold and discriminant regularization to mitigate the influence of unlabeled points and boundary points during pattern recognition [26]. However, the impact of unlabeled points and boundary points on the clustering and classification capabilities of semi-supervised models remains unresolved.
Therefore, researchers have applied supervised learning modes to manifold learning models, which demonstrate stronger robustness in handling classification problems than manifold learning models under the semi-supervised learning mode [27,28,29]. However, current methods are limited to dimensionality reduction and fault diagnosis on a single feature space within the manifold learning model, so the extracted feature information is singular and incomplete.
In summary, manifold learning methods have been widely applied in the fields of dimensionality reduction and fault diagnosis, but they still have limitations: the selection of neighbor points in globally preserving embedding methods, the influence of data outliers on clustering effectiveness, and the singular and incomplete feature information extracted from the data have not been fully addressed. To address these problems and build upon existing research, this paper proposes a supervised manifold learning approach for rolling bearing fault diagnosis based on the discriminative fusion of multi-space feature information with an adaptive nearest neighbor strategy.
The main contributions of this paper are summarized as follows:
Propose an adaptive neighbor selection strategy that combines Euclidean distance and cosine similarity measures. The strategy computes both the distance and angular information among neighboring points and uses the metric mean as the discriminant criterion. Taking the preset neighboring points as the objects of this criterion, it dynamically adjusts the proximity graph to refine the local structure of the manifold, thereby improving the precision of the manifold-space depiction and local feature representation and reducing the adverse effect of data outliers on clustering performance.
Propose three methods for transforming feature spaces and extracting spatial feature information. Notably, this paper introduces a new kernel function, the exponential linear kernel, which projects data into a novel reproducing kernel Hilbert space. This function is also employed as the nonlinear discriminant mapping in the Supervised Isometric Feature Mapping (S-Isomap) algorithm, providing a distinct representation of the data in the manifold space. Because the extracted feature information originates from diverse kernel Hilbert spaces and manifold spaces, the resulting features are both rich and sensitive.
Propose a fault diagnosis model for rolling bearings based on a supervised learning paradigm. Under the adaptive neighbor selection strategy, sensitive and information-rich features from different spaces are merged to form a multi-space metric matrix. This matrix encapsulates substantial multi-space feature information and is combined with machine learning classifiers to perform fault diagnosis.
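The adaptive neighbor selection strategy in the first contribution can be illustrated with a short sketch. This is not the paper's exact rule (the precise criterion is given in Section 3): it assumes that, for each point, a preset number of Euclidean-nearest candidates are screened by comparing their distance and cosine similarity to the respective metric means, and that at least the nearest candidate is always retained.

```python
import numpy as np

def adaptive_neighbors(X, k_preset=10):
    """Illustrative adaptive neighbor selection combining Euclidean
    distance and cosine similarity, with the metric mean as the
    discriminant criterion (a sketch, not the paper's exact rule)."""
    n = X.shape[0]
    # pairwise Euclidean distances, shape (n, n)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # pairwise cosine similarities, shape (n, n)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    c = (X @ X.T) / (norms @ norms.T)
    neighbors = []
    for i in range(n):
        # k_preset nearest candidates by Euclidean distance (index 0 is i itself)
        cand = np.argsort(d[i])[1:k_preset + 1]
        # metric means over the preset candidates act as the criterion
        d_mean, c_mean = d[i, cand].mean(), c[i, cand].mean()
        # keep candidates that are both closer than average and
        # angularly more similar than average
        keep = cand[(d[i, cand] <= d_mean) & (c[i, cand] >= c_mean)]
        if keep.size == 0:
            keep = cand[:1]  # always retain at least the nearest point
        neighbors.append(keep)
    return neighbors
```

The adjusted neighbor lists would then replace the fixed k-nearest-neighbor graph when building the manifold's geodesic distance estimates.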
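The exponential linear kernel named in the second contribution is defined in Section 3, not in this excerpt; one plausible form, shown purely as a hypothetical illustration, is the exponential of a scaled linear (inner product) kernel, which is positive definite because the exponential power series has nonnegative coefficients.

```python
import numpy as np

def exp_linear_kernel(X, Y=None, sigma=1.0):
    """Hypothetical 'exponential linear' kernel: the exponential of a
    scaled linear kernel, k(x, y) = exp(x.T @ y / sigma). This form is
    an illustrative guess; the paper's exact definition may differ."""
    Y = X if Y is None else Y
    return np.exp((X @ Y.T) / sigma)
```

A kernel of this kind yields a symmetric positive semidefinite Gram matrix, which is what allows the data to be treated as implicitly mapped into a reproducing kernel Hilbert space before the manifold embedding step.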
The structure of this paper is organized as follows:
Section 2 introduces the foundational manifold learning algorithms Isomap and S-Isomap along with their relevant theories.
Section 3 presents the proposed supervised manifold learning method involving an adaptive neighbor strategy, the extraction of multi-space feature information, and the discriminative fusion of multiple pieces of feature information.
Section 4 evaluates the model’s clustering and classification capabilities, analyzing and comparing the proposed approach with traditional manifold learning methods from both qualitative and quantitative perspectives.
Finally, Section 5 provides a comprehensive summary of the entire paper.
5. Conclusions
The “curse of dimensionality” remains a challenge in the field of intelligent fault diagnosis of rolling bearings, so this paper proposes a supervised manifold learning method that fuses multi-space feature information for diagnosing rolling bearing faults. First, an adaptive nearest neighbor strategy is employed to reconstruct the manifold neighbor graph. Subsequently, multiple spatial transformation techniques are introduced to acquire feature information in different spaces; notably, an innovative exponential linear kernel function and the KS-Isomap algorithm are presented to enrich the feature space with novel information. The multi-space feature information is then fused with discriminative information derived from the data labels, yielding a supervised manifold learning method for feature extraction. Finally, this method is combined with classifiers to perform fault diagnosis on rolling bearings. Experimental validation on the CWRU open dataset and on data from our laboratory-built test rig demonstrates that the proposed AN-MSDIS-Isomap algorithm outperforms traditional manifold learning methods in clustering, dimensionality reduction, and fault diagnosis. It exhibits consistently high classification accuracy across various classifiers, with the highest accuracy reaching 100%.
The proposed method addresses the challenge of dimensionality and effectively extracts significant features from the data; combined with a classifier, it performs well in fault diagnosis tasks. Future research will focus on fault diagnosis for bearing vibration signals in strong-noise environments and on optimizing the pairing of the algorithm with different classifiers. In addition, to address the high computational complexity of machine learning algorithms, methods for reducing computational cost and improving computational efficiency will be investigated.