A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification

Liang, Zexiao; Gong, Ruyi; Tan, Guoliang; Ji, Shiyin; Zhan, Ruidian

doi:10.3390/app14125342

Open AccessArticle

A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification

by

Zexiao Liang

¹

,

Ruyi Gong

²,

Guoliang Tan

²,

Shiyin Ji

³ and

Ruidian Zhan

^1,4,*

¹

School of Integrated Circuits, Guangdong University of Technology, Guangzhou 510006, China

²

School of Automation, Guangdong University of Technology, Guangzhou 510006, China

³

School of Biomedical and Phamaceutical Sicences, Guangdong University of Technology, Guangzhou 510006, China

⁴

School of Advanced Manufacturing, Guangdong University of Technology, Jieyang 515200, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(12), 5342; https://doi.org/10.3390/app14125342

Submission received: 22 April 2024 / Revised: 5 June 2024 / Accepted: 18 June 2024 / Published: 20 June 2024

Download

Browse Figures

Versions Notes

Abstract

:

With the increasing demand for high-resolution images, handling high-dimensional image data has become a key aspect of intelligence algorithms. One effective approach is to preserve the high-dimensional manifold structure of the data and find the accurate mappings in a lower-dimensional space. However, various non-sparse, high-energy occlusions in real-world images can lead to erroneous calculations of sample relationships, invalidating the existing distance-based manifold dimensionality reduction techniques. Many types of noise are difficult to capture and filter in the original domain but can be effectively separated in the frequency domain. Inspired by this idea, a novel approach is proposed in this paper, which obtains the high-dimensional manifold structure according to the correlationships between data points in the frequency domain and accurately maps it to a lower-dimensional space, named Frequency domain-based Manifold Dimensionality Reduction (FMDR). In FMDR, samples are first transformed into frequency domains. Then, interference is filtered based on the distribution in the frequency domain, thereby emphasizing discriminative features. Subsequently, an innovative kernel function is proposed for measuring the similarities between samples according to the correlationships in the frequency domain. With the assistance of these correlationships, a graph structure can be constructed and utilized to find the mapping in a low-dimensional space. To further demonstrate the effectiveness of the proposed algorithm, FMDR is employed for the semi-supervised classification problems in this paper. Experiments using public image datasets indicate that, compared to baseline algorithms and state-of-the-art methods, our approach achieves superior recognition performance. Even with very few labeled data, the advantages of FMDR are still maintained. The effectiveness of FMDR in dimensionality reduction and feature extraction of images makes it widely applicable in fields such as image processing and image recognition.

Keywords:

Manifold Learning; Dimensionality Reduction; Frequency Domain Correlationship; Semi-supervised Classification

1. Introduction

In modern intelligent image processing and application scenarios, one of the most tricky problems is dealing with high-dimensional data [1]. With the advancements in camera lenses and remote sensing technologies, obtaining high-quality images of a target object has gradually achieved breakthroughs [2]. However, high-quality images imply high-dimensional data, which not only contain a large amount of redundant information but also pose challenges for image processing algorithms due to the heavy computational resources they demand, so called “the curse of dimensionality” [3]. On the other hand, imprudently reducing the dimensionality of data may lead to the loss of valuable discriminative information, leading to the invalidation of recognition [4]. Therefore, a key points is to delicately reduce the dimensions while preserving the major details and relationships between samples.

In the fields of pattern recognition and data mining, dimensionality reduction techniques have been developed for many years and continue to achieve new advancements in various applications [5]. Jose Luis Vieira Sobrinho et al. designed a two-stage dimensionality reduction technique for social media engagement classification [6]. Mahmood et al. put forward a dimensionality reduction-based machine learning technique for tumor classification of RNA data [7]. Barkalov et al. proposed a method for constructing multiple Peano curves to preserve the object proximity information in the course of dimensionality reduction [8]. Jie Li et al. proposes a novel non-negative matrix factorization method with dual-graph regularization constrained for dimensionality reduction and clustering [9]. An algorithm that computes all rough sets constructed for dimensionality reduction was proposed by Yanir et al. [10]. Ramin Heidarian Dehkordi et al. retrieved the wheat crop traits of hyperspectral data using machine learning and dimensionality reduction algorithms [11]. Chao Yao et al. proposed a framework named ColAE for hyperspectral image feature extraction and clustering with auto-encoders and dimension reduction techniques [12]. Md Rashedul Islam et al. designed a dimension reduction method that combines feature extraction and selection for effective image classification [13].

Generally speaking, the goal of dimensionality reduction is to transform high-dimensional data from the original space into a new low-dimensional space by utilizing linear or nonlinear transformation [1]. The most widely known techniques include linear discriminant analysis (LDA) [14], principle component analysis (PCA) [15], robust principle component analysis (RPCA) [16] and manifold learning-based algorithms [17]. LDA is a supervised technique that aims to reduce the dimensionality of high-dimensional pattern samples by projecting them onto an optimal discriminant vector space [18]. This process facilitates the extraction of discriminative information for classification purposes while simultaneously reducing the dimensionality of the feature space through compression. PCA is a linear transformation that selects multiple eigenvectors for feature extraction and dimension reduction [19]. RPCA assumes that observed data are composed of low-rank clean data and sparse noise and it achieves feature extraction and dimensionality reduction by solving an optimization problem [20]. Manifold learning aims to reveal the geometric relationship between samples in a high-dimensional manifold space. ISOMAP computes the geometric distance between two data points by constructing a neighborhood graph and searching for the shortest paths [21]. Locally linear embedding (LLE) [22] reconstructs the linear weights of two neighbors and maps it to the embedded coordinates. Laplacian eigenmaps (LE) [23] define a heart-kernel function for measuring the distance between two points and finds the best low-dimensional mapping by solving a trace minimization problem.

However, the existing dimensionality reduction techniques basically treat data in the original domain. In other words, the relationships between high-dimensional data are measured based on the distances in the original domain. As is widely known, there are various types of interference that are difficult to capture and eliminate in the original domain, but that are much easier to address in the frequency domain. For instance, in the case of facial images affected by extensive occlusions, such as shadows and obstructions, the interference is not sparse and possesses high energy. This makes it challenging to separate in the original domain. PCA and RPCA might recognize the occlusions as the principle components due to their high energy. Distance-based algorithms may erroneously classify different objects that are corrupted by similar occlusion into the same category.

On the other hand, by transforming the corrupted data into the frequency domain, the key features and interferences can be effectively separated according to their coordinated frequency band, offering valuable guidance for the subsequent feature extraction and dimensionality reduction processes. Inspired by this idea, in this paper, a novel dimensionality reduction algorithm via frequency domain correlationship is proposed, named Frequency domain-based Manifold Dimensionality Reduction (FMDR), which measures the high-dimensional manifold similarities between data points by the frequency domain information. The overall workflow and processing phases are illustrated in Figure 1. Specifically, FMDR includes the following major steps. Firstly, a transformation is proposed, including transforming data into the frequency domain, extracting the discriminative details, reducing the interference and reconstructing datasets. Secondly, a novel kernel function is proposed for measuring the high-dimensional manifold relationship between samples based on frequency domain information. Finally, according to the relationship measured by the new kernel function, a graph structure is constructed by calculating the Laplacian matrix, which can be effectively utilized for finding low-dimensional mapping for the original data. Once the graph structure is obtained, the manifold relationship of the original data can be preserved, which can be conveniently adopted to solve various problems such as classification, clustering and semi-supervised classification.

Integrating frequency domain information to design machine learning models has become a new trend. FreMixer [24] is a lightweight machine learning architecture for short-term waves forecasting, which utilizes frequency domain information for data training. Stuchi, José A et al. integrated a new frequency extractor layer in the deep neural network for feature extraction [25]. However, most existing methods only treat frequency domain information as a feature or input, without effectively incorporating it into the model design and kernel function design.

In recent years, there have been several relevant studies about manifold learning and dimensionality reduction. Wenhui Song et al. introduced spectral information divergence into a neighbor manifold graph for hyperspectral image dimensionality reduction [26]. But, the distance of pairwise data points was calculated by Euclidean distance in the original domain. FMDR conducts a novel kernel function for measuring the similarity between samples based on the frequency domain correlationship, which is more precise, especially when samples are corrupted by interference. Jinghao Situ proposed a manifold learning-based contrastive framework, which applies ISOMAP to extract nonlinear structures and neural networks for comparative learning training [27]. Compared to this framework, our FMDR does not require additional neural networks or training processes and can achieve excellent dimensionality reduction results with minimal computational resource requirements.

2. Materials and Methods

In this section, several related background concepts are briefly introduced, including two-dimensional Discrete Fourier Transformation (2D DFT), high-frequency texture components and Laplacian eigenmaps.

2.1. Related Background

2.1.1. Discrete Fourier Transformation for Two-Dimensional Image

In the field of signal processing, there is a fundamental principle that every signal can be decomposed into the sum of several sine waves, a concept known as the Fourier series. In the field of image processing, a common approach involves treating data as two-dimensional discrete signals, referred to as the spatial domain signals. Consider an object image with dimensions denoted by H and W for its height and width, respectively. The pixels can be organized into a matrix

Pix

\in R^{H \times W}

. The frequency domain signal

Fre

of the

Pix

can be obtained using the two-dimensional Discrete Fourier Transform (2D DFT):

Fre (u, v) = \sum_{m = 0}^{H - 1} \sum_{n = 0}^{W - 1} Pix (m, n) e^{- j 2 π (\frac{m u}{H} + \frac{n v}{W})},

(1)

where

(u, v)

and

(m, n)

denote the coordinates in frequency domain and original domain, respectively. Equation (1) can be further expand into a matrix multiplication form, written as:

\begin{matrix} Fre & = B_{1} Pix B_{2} = \\ [\begin{matrix} e^{- j 2 π \frac{0 \times 0}{H}} & e^{- j 2 π \frac{0 \times 1}{H}} & \dots & e^{- j 2 π \frac{0 \times (H - 1)}{H}} \\ ⋮ & ⋮ & ⋮ & ⋮ \\ e^{- j 2 π \frac{(H - 1) \times 0}{H}} & e^{- j 2 π \frac{(H - 1) \times 1}{H}} & \dots & e^{- j 2 π \frac{(H - 1) \times (H - 1)}{H}} \end{matrix}] \\ \times Pix \times \\ [\begin{matrix} e^{- j 2 π \frac{0 \times 0}{W}} & e^{- j 2 π \frac{0 \times 1}{W}} & \dots & e^{- j 2 π \frac{0 \times (W - 1)}{W}} \\ ⋮ & ⋮ & ⋮ & ⋮ \\ e^{- j 2 π \frac{(W - 1) \times 0}{W}} & e^{- j 2 π \frac{(W - 1) \times 1}{W}} & \dots & e^{- j 2 π \frac{(W - 1) \times (W - 1)}{W}} \end{matrix}] . \end{matrix}

(2)

B_{1}

and

B_{2}

are the base matrices, which only related to the height and width of the image. The transformation indicates that the frequency domain signal

Fre

contains all the characteristics of the original signal while

B_{1}

and

B_{2}

are irrelevant.

2.1.2. High-Frequency Texture Component

Images are often described as a combination of structure and texture components, with each playing a vital role in the overall visual representation. The structure components primarily focus on defining the fundamental characteristics and prominent features present in an image, providing a framework for understanding its basic layout and composition. On the other hand, texture components delve into the finer details and nuances within the image, capturing the subtle variations and complex patterns that contribute to its overall appearance and aesthetic quality (see Figure 2).

By analyzing and interpreting the structure and texture components of an image, researchers and practitioners can gain deeper insights into its content, allowing for more advanced image processing tasks such as segmentation, object recognition and content-based retrieval. Understanding how these components interact and contribute to overall visual perception is essential in various fields such as computer vision, medical imaging, remote sensing and the digital arts.

High-frequency texture components are particularly valuable for identifying subjects within images, like human faces. Various techniques exist for extracting or enhancing these texture components from an image. Leveraging high-frequency texture components helps preserve essential image characteristics, facilitating image recognition and reconstruction [28].

2.2. The Proposed Method

In this section, the detailed design and implementation of FMDR are introduced, including its mathematical expressions, theoretical derivations and its application in semi-supervised classification problems.

In practice, images are usually assembled as a data matrix, where each column (or row) represents a vectorized sample. At first, a transformation will be adopted for extracting the discriminative information and reducing the interference in frequency domain. The transformation consists of several steps. Firstly, the vectorized samples are reshaped as images according to their height and width. Then, each image in the original spatial domain is converted into the frequency domain using 2D DFT. Based on the distribution of crucial morphological features, a designed filter is employed to extract the high-frequency texture components. Lastly, in order to maintain consistency in the data structure, the extracted features are then vectorized and assembled into a new matrix.

For the ith sample in the data matrix

X

, the high-frequency texture components

{HF}_{i}

of pixel matrix

{Pix}_{i}

can be written in the matrix multiplication form according to (2):

{HF}_{i} = B_{1} K B_{2} ⊙ B_{1} {Pix}_{i} B_{2} .

(3)

Base matrices

B_{1}

and

B_{2}

are introduced in 2, which are the irrelevant bases.

K

refers to the convolution kernel of the filter in the original domain. ⊙ is a Hadamard product operator. For each pixel point, the transformation can be rewritten in a more detailed form:

{HF}_{i} (u, v) = \sum_{m = 0}^{H - 1} \sum_{n = 0}^{W - 1} F {Pix}_{i} (m, n) e^{- j 2 π (\frac{m u}{H} + \frac{n v}{W})},

(4)

where

F

represents the designed filter. In practical applications, the distribution of valuable information within images is different across every dataset. There is no universally applicable filter design that can suit all scenarios. In this study, we employed the Butterworth filter, which is a high-pass filter and primarily involves two adjustable parameters: the order and the cut-off frequency. Based on experimental findings, the characteristics of high-frequency texture components are correlated with the filter’s parameters. Specifically, the quantity of features is linked to the cut-off frequency, while the texture thickness depends on the order of the filter. With lower parameters, a significant amount of irrelevant information and interference is retained. As the parameters increase, more low-frequency interference is filtered out, making the texture components more pronounced. However, excessively large parameters result in the abandonment of a substantial portion of the information. Samples from diverse datasets contain noise with varying distributions and structures. To achieve optimal performance, comprehensive consideration of filter parameter tuning is essential in practical applications. After extracting the high-frequency texture components corresponding to n samples, we can finally reassemble the frequency information into a data matrix by vectorization:

HF = [v e c ({HF}_{1}), v e c ({HF}_{2}), \dots, v e c ({HF}_{n})] .

(5)

Subsequently, a novel kernel function is designed for measuring the relationship between samples based on the frequency domain information. The key to manifold learning algorithms is to determine the distance measurement of samples in the high-dimensional manifold space. A more precise distance measure leads to better preservation of sample relationships after dimensionality reduction. With conventional graph-based methods, the distance between samples is calculated by the absolute distance. The heat kernel, proposed in Laplacian eigenmaps [23], is widely adopted for estimating manifold similarities, which are defined as:

s (x_{i}, x_{l}) = \exp (- \frac{{||x_{i} - x_{l}||}_{F}^{2}}{t}),

(6)

where

\exp

(x)

refers to

e^{(x)}

and e is the natural logarithm. t is known as the heat kernel parameter and

{| | . | |}_{F}

is the Frobenius norm. As can be seen, the similarity between samples

x_{i}

and

x_{l}

is calculated based on the absolute distance

{||x_{i} - x_{l}||}_{F}^{2}

. Approaches that rely on absolute distances face a common challenge in manifold space: samples may have close absolute distances while exhibiting significant actual differences, posing difficulties for accurate recognition. For instance, when considering two different faces both wearing the same scarf, distance-based algorithms may mistakenly recognize them as belonging to the same category.

In order to leverage the discriminative information provided by the frequency domain, we innovatively proposed a kernel function for measuring the relationships between samples. Firstly, the similarity degree between frequency domain signal

{HF}_{i}

and

{HF}_{l}

corresponding to

x_{i}

and

x_{l}

is calculated according to their correlationship, denoted as:

C o r ({HF}_{i}, {HF}_{l}) = \frac{v e c {({HF}_{i})}^{T} v e c ({HF}_{l})}{\sqrt{∥ {HF}_{i} ∥_{F}^{2} {∥ {HF}_{l} ∥}_{F}^{2}}} .

(7)

Then, the similarity

S_{i l}

is calculated by:

S_{i l} = e x p (- \frac{t}{C o r ({HF}_{i}, {HF}_{l})}) .

(8)

It is worth noting that, different from the absolute distance, the correlation coefficient between two signals will be larger when their similarity is higher. Hence, if

x_{i}

and

x_{l}

are more similar in the high-dimensional space, their discriminative frequency domain signals

{HF}_{i}

and

{HF}_{l}

should be closer to each other and the correlation

C o r ({HF}_{i}, {HF}_{l})

should be larger, resulting in a greater similarity

S_{i l}

. Conversely, when

x_{i}

and

x_{l}

are dissimilar, their similarity

S_{i l}

will be smaller.

By substituting Equation (7) into (8), we can finally get the complete definition of the similarity measurement, named the high-frequency kernel function (HFK):

S_{i l} = H F K ({HF}_{i}, {HF}_{l}) = e x p (- \frac{t \sqrt{∥ {HF}_{i} ∥_{F}^{2} {∥ {HF}_{l} ∥}_{F}^{2}}}{v e c {({HF}_{i})}^{T} v e c ({HF}_{l})}) .

(9)

The high-frequency kernel function satisfies the definition and requirements of kernel functions, guaranteed by the

Mercer

theorem and Proof 1.

Proof.

HFK satisfies the definition of the kernel function. based on the

Mercer

theorem. The following two conditions need to be verified:

The symmetry: for arbitrary pairwise input

X_{i}

and

X_{j}

, HFK calculates the distance according to the correlation (7), which satisfies

C o r (X_{i}, X_{j}) = C o r (X_{j}, X_{i})

. Therefore,

H K F (X_{i}, X_{j}) = H K F (X_{j}, X_{i})

, proving that HFK is a symmetric function.

The positive semi-definiteness: for any input

X_{i}

and

X_{j}

, the Gram matrix of

H F K (X_{i}, X_{j})

is a positive semi-definite matrix. Suppose matrix

\hat{H}

is the Gram matrix of

H F K (X_{i}, X_{j})

:

\hat{H} = [\begin{matrix} H F K (X_{1}, X_{1}) & H F K (X_{1}, X_{2}) & \dots & H F K (X_{1}, X_{n}) \\ H F K (X_{2}, X_{1}) & H F K (X_{2}, X_{2}) & \dots & H F K (X_{2}, X_{n}) \\ ⋮ & ⋮ & ⋮ & ⋮ \\ H F K (X_{n}, X_{1}) & H F K (X_{n}, X_{2}) & \dots & H F K (X_{n}, X_{n}) \end{matrix}] .

(10)

For any n dimensional column vector

a

, it should satisfy:

a^{T} \hat{H} a \geq 0 .

(11)

Expanding the above equation for each element, we have:

\sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{i} a_{j} H F K (X_{i}, X_{j}) \geq 0 .

(12)

By substituting the definition of HFK (9), we can get:

\sum_{i = 1}^{n} \sum_{j = 1}^{n} a_{i} a_{j} e x p (- \frac{t \sqrt{∥ {HF}_{i} ∥_{F}^{2} {∥ {HF}_{l} ∥}_{F}^{2}}}{v e c {({HF}_{i})}^{T} v e c ({HF}_{l})} \geq 0

(13)

Based on the property of the natural logarithm exponential function, the above inequality is clearly satisfied.

Hence, with fulfilling the above two conditions, according to the

Mercer

theorem, HFK satisfies the definition of the kernel function. □

After computing the similarity between all pairs of samples based on HFK function (9), the similarity matrix (also known as the adjacency matrix) S can be obtained. According to the matrix S, the topological graph structure of the data in manifold space can also be simultaneously constructed.

Following the guiding ideology of Laplacian eigenmaps, we need to find out the best low-dimensional mapping of the original data which maintains the manifold relationships in high-dimensional space. Given n samples in the original space

X = \{x_{i} \dots x_{n}\} \in R^{d_{o} \times n}

. Our goal is to find out the best mapping

F = \{f_{i} \dots f_{n}\} \in R^{d_{t} \times n}

in the target low-dimensional space, where

d_{t} ≪ d_{o}

. While reducing the dimensions, it is essential to preserve the relationships between data points in the original high-dimensional manifold space, therefore, an objective minimization function is constructed with the constraint of matrix S:

O b j = min \sum_{i, j}^{n} {‖ f_{i} - f_{j} ‖}_{2}^{2} S_{i j},

(14)

where

| | \cdot {| |}_{2}

is the vector norm for calculating the Euclidean distance of two vectors. The target is to minimize the value of

O b j

. Constrained by the similarity matrix S, those two sample points that are close in the original high-dimensional space must also be close in the mapping of the low-dimensional space, otherwise this will result in a larger penalty. Given the definition of Laplacian matrix

L = D - S

, where D is a diagonal matrix and its elements are defined as the sum of rows (or columns) of S, named the degree matrix. Then, the objective function (14) can be rewritten in the matrix form:

O b j = min T r (F^{T} LF), s . t . F^{T} DF = I .

(15)

I is the identity matrix. The equivalence of (14) and (15) is guaranteed by the following Proof 2.

Proof.

The equivalence of (14) and (15). Equation (14) can be further derived:

\begin{matrix} O b j & = min \sum_{i, j}^{n} {‖ f_{i} - f_{j} ‖}_{2}^{2} S_{i j} \\ = min \sum_{i, j}^{n} ({(f_{i} - f_{j})}^{T} (f_{i} - f_{j})) S_{i j} \\ = min \sum_{i, j}^{n} (f_{i}^{T} f_{i} - 2 f_{i}^{T} f_{j} + f_{j}^{T} f_{j}) S_{i j} \\ = min \sum_{i}^{n} f_{i}^{T} f_{i} D_{i i} + \sum_{j}^{n} f_{j}^{T} f_{j} D_{j j} - 2 \sum_{i j}^{n} f_{i}^{T} f_{j} S_{i j} \\ = min 2 \sum_{i j}^{n} f_{i}^{T} f_{j} (D_{i j} - S_{i j}) \\ = min 2 \sum_{i j}^{n} f_{i}^{T} f_{j} L_{i j} \end{matrix}

(16)

Noting that

f_{i}

and

f_{j}

are column vectors of matrix F, expressing the above conclusion in a matrix form yields:

O b j = min T r (F^{T} LF),

(17)

which is equivalent to Equation (15). □

It is worth noting that, in order to enhance the robustness against outliers and improve computational efficiency, non fully connected graphs are commonly constructed in practice. Generally speaking, the range of neighborhoods can be defined based on the k-nearest or

ϵ

-distance rules. The similarities of vertexes beyond the neighborhood’s range will be assigned as zero.

Once the minimization of

O b j

is achieved, the optimal low-dimensional mapping F is obtained, with a significantly reduced dimensionality while preserving the manifold relationships. The problem (15) can be converted into the eigenvalue decomposition problem and be solved effectively. The low-dimensional data F can be conveniently applied to solve various problems, such as feature selection, clustering and semi-supervised classification.

In order to further demonstrate effectiveness, FMDR is adopted for a semi-supervised classification problem. Given n data points with l labeled samples and u unlabeled samples (

1 \leq l \leq n

), where

Y_{l}

represents the label matrix such that each row

y_{i}

denotes the label vector corresponding to the ith sample. The matrix

F

is composed of two parts,

F = [F_{l}; F_{n - l}]

. The semi-supervised graph-based learning problem can be formulated as follows:

min_{F} T r (F^{T} L F) s . t . F_{l} = Y_{l},

(18)

where the Laplacian matrix

L

consists of four parts:

L = [\begin{matrix} L_{l, l} & L_{l, n - l} \\ L_{n - l, l} & L_{n - l, n - l} \end{matrix}] .

(19)

Based on the conclusion from [29], there is a harmonic solution to problem (18), which is defined as:

F_{n - l} = - {(L_{n - l, n - l})}^{- 1} L_{n - l, l} Y_{l} .

(20)

Therefore, the labels of unknown samples can be directly indicated by

F_{n - l}

. The algorithm of FMDR for semi-supervised classification is summarized in Algorithm 1.

Algorithm 1: FMDR for semi-supervised classification

One of the prominent advantages of FMDR is its minimal use of computing resources. The high-frequency texture components are extracted by 2D DFT and filtering, with a computational complexity of

O (n)

. The similarity is calculated by the HFK, which adopts matrix multiplication instead of a loop algorithm, only with the computational complexity

O (1)

. The complexities of dimensionality reduction and classification are

O (n)

and

O (3 n)

, respectively. Finally, the total computational complexity of FMDR is

O (n + 1 + n + 3 n)

, which is simplified to

O (n)

with the assumption that n will be large.

3. Experiments and Discussions

A convenient method for evaluating the effectiveness of dimensionality reduction techniques is to apply them to address specific recognition problems and compare the predictive accuracy. A higher accuracy implies more precise and effective dimensionality reduction and feature extraction. In this paper, FMDR is utilized for solving the image semi-supervised classification problem. To better demonstrate the advantages and superiority of our approach, several public facial image datasets are selected as the experimental objects. Facial images not only contain abundant morphological texture information, but also commonly suffer from occlusions such as shadows, sunglasses and scarfs, which are challenging for conventional techniques.

In this section, experiments are designed from several aspects to demonstrate the mechanism and evaluate the performance of FMDR. At first, the preparations of implementation are introduced, including the details of objective datasets, the performance indicators and their definitions and the comparison baselines and the state-of-the-art methods. Subsequently, the implementations and results of the experiments are introduced.

3.1. Preparations

This section presents the preparations for the experiments, detailing the datasets selected, the indicators for evaluating performance and the comparison algorithms.

3.1.1. Datasets

Several real-world image datasets are selected to evaluate the effectiveness of our methods. The public facial datasets are widely adopted for machine learning recognition tasks, including AT&T, ORL, AR, Yale and YaleB. AT&T [30], which is a commonly used dataset for face recognition, consisting of 400 grayscale images of 40 individuals. Each person has 10 images captured under different conditions (varying angles, lighting, facial expressions, etc.). ORL [31] is another widely used dataset for face recognition, comprising 400 grayscale images of 40 individuals. Each person has 10 images taken at different times and under different states (with glasses, without glasses, etc.). The AR (Augmented Reality) Face Database [32] contains 2600 images of 126 individuals. Each person has 20 images, with 10 unoccluded and 10 occluded (with obstructions such as glasses, scarfs, etc.). Here, we conduct a mini-batch of AR with 260 samples of 10 individuals. Yale [33] consists of 165 grayscale images of 15 individuals. Each person has 11 images captured under varying lighting conditions. YaleB [34] is an extension of the Yale Face Database, containing 2540 grayscale images of 38 individuals. Each person has 64 images covering different lighting conditions and facial expressions. The partial samples are presented in Figure 3.

3.1.2. The Filter and Parameters

In FMDR, the primary concept revolves around describing the relationships between samples by leveraging the advantage of separating features and noise in the frequency domain. The selection and design of filters offer considerable flexibility, allowing for customization based on specific applications and research objectives. The Butterworth filter is selected in this study mainly because of its suitability for extracting morphological features from facial images. Its adjustable cut-off frequency and order parameters allow for tailored adjustments, making it particularly suited for this classification task.

In practice, the distribution of valuable information in facial images varies across different datasets. There is no universally applicable filter design that can be suitable for all situations. Based on the experimental experience, the features of high-frequency texture components are relevant to the parameters of the filter. Specifically, the quantity of features is related to the cut-off frequency and the texture thickness is dependent on the order of the filter. When parameters are small, most of the useless information and interference are retained. As parameters increase, more low-frequency interference is filtered out and the texture components become more obvious. However, when parameters are set to be too large, most of the information is abandoned. Therefore, in practical applications, for optimal recognition performance, if there is significant interference in the image, it is advisable to increase the parameters appropriately to filter out more noise. Conversely, reducing the parameters would be preferable to retain more information. The different results extracted by different parameters setting are presented in Figure 4.

Samples from diverse datasets contain noise with different distributions and structures. In order to achieve the best clustering performance, the tuning of the filter’s parameters should be comprehensively considered in practice.

3.1.3. Comparison Methods and Performance Indicators

To comprehensively evaluate the effectiveness of the proposed method, different types of algorithms are selected for comparison. Firstly, the conventional techniques are selected as the baselines, including the semi-supervised k-nearest neighbors (

{KNN}_{semi}

) [35] and the semi-supervised k-means (

{KMeans}_{semi}

) [36], both based on the absolute distance. Subsequently, the dimensionality reduction technique is utilized for semi-supervised problem in the same experimental condition. Robust principal component analysis (RPCA) [37] and non-negative matrix factorization (NMF) [38] are two widely adopted techniques for dimension reduction. RPCA assumes that the observed data consists of the clean low-rank component and a sparse noise part and finds the low-rank component of the data by solving a rank-minimization objective function. NMF model decomposes the data matrix into two non-negative matrices to search for the low-dimensional representation. Laplacian eigenmaps (LE) [23] search for the best low-dimensional mapping according to the manifold Gaussian kernel in the data. At last, two state-of-the-art algorithms are also employed. ABNMTF is a semi-supervised non-negative matrix tri-factorization model that learns the similarity graph via adaptive k-nearest neighbors [39]. Deep Autoencoder-like Nonnegative Matrix Factorization (DANMF) consists of an encoder and decoder to learn the hierarchical mapping of the original data [40].

There are several evaluation indicators designed for evaluating the performance of semi-supervised classification algorithms. Four common indicators, namely accuracy (Acc), precision (Pre), recall (Rec) and F1 score, are chosen in this paper. In the pattern recognition task, the prediction of a single sample can fall into one of four situations: true positive (

T P

), false negative (

F N

), false positive (

F P

) and true negative (

T N

). Accuracy is calculated as the ratio of correctly classified samples to the total number of samples, defined as follows:

\begin{matrix} A c c u r a c y = \frac{T P + T N}{T P + F N + F P + T N} . \end{matrix}

(21)

Precision indicates the proportion of correctly classified positive samples to the total number of samples determined as positive by the algorithm. It is defined as follows:

\begin{matrix} P r e c i s i o n = \frac{T P}{T P + F P} . \end{matrix}

(22)

Recall refers to the proportion of correctly classified positive samples to the total number of positive samples in the dataset. It is defined as follows:

\begin{matrix} R e c a l l = \frac{T P}{T P + F N} . \end{matrix}

(23)

The F1 score is the harmonic mean of precision and recall. It is calculated using the following formula:

\begin{matrix} F 1 = \frac{2 \times P r e c i s i o n \times R e c a l l}{P r e c i s i o n + R e c a l l} . \end{matrix}

(24)

The values of these indicators are recorded as percentages. In general, a higher value indicates a better performance in clustering tasks.

3.2. Visualization of Dimensionality Reduction

One direct way to evaluate the effectiveness of the dimensionality reduction technique is to visualize the low-dimensional data. As is widely known, the original data points are distributed in a high-dimensional manifold space, which is difficult to visualize and observe. One of the primary goals of dimensionality reduction algorithms is to identify the most significant feature dimensions within the data and map the samples according to these dimensions, so called principle components. Once the dimensionality of the data points is reduced, the retained features represent the principal components of the dataset. By examining the distribution of samples along these principal component dimensions, we can clearly observe the variations in the key discriminated features of the samples.

Therefore, FMDR is employed to reduce the dimensionality of the image dataset, compressing it into two dimensions and visualizing the results. As shown in Figure 5, when the samples of the ORL dataset are compressed to two dimensions by FMDR, “facial orientation angle” and “eye closure degree” are recognized as the two major feature dimensions. Samples can be effectively distinguished and categorized along these two dimensions.

Similarly, we utilize FMDR to compress the dimensions of the YaleB dataset to two dimensions as illustrated in Figure 6. It can be observed that, “size of shadow coverage area” and “angle of the shadow” are considered to be the most discriminative major components that effectively identify every sample.

In summary, FMDR effectively identifies the principal component features of data while reducing the dimensionality of samples, thus providing an accurate low-dimensional representation.

3.3. Semi-Supervised Classification for Facial Images

In this section, we utilize the FMDR for facial image semi-supervised classification. A conventional way to evaluate the effectiveness of the dimensionality reduction is to classify the data in the low-dimensional space. As the effectiveness of dimensionality reduction improves, the discriminability of the resulting low-dimensional data becomes stronger, thereby leading to higher classification accuracy. In this experiment, to smooth the bias, we conducted cross-validation that contains 100 repeated trials, randomly partitioning 20% of the data as labeled samples. The labeled and unlabeled samples are assembled into a unified data matrix, serving as input for all semi-supervised algorithms. The performances of semi-supervised classification for facial images is presented in Table 1. The highest values are indicated in bold font.

The experimental results indicate that FMDR achieves the best performance in the majority of cases. Specifically, FMDR gains the highest values in AR, with 97.14% Acc, 97.87% Pre, 97.29% Rec and 53.91% F1 score. Also, for YaleB, FMDR achieves 76.18% Acc, 77.32% Pre, 78.29% Rec and 39.68% F1 score. The samples of those two datasets exhibit significant large area occlusions, such as masks, sunglasses, scarves and shadows, which are difficult to handle in the original domain and invalidate the existing algorithms. The superiority of our algorithm on these datasets demonstrates its ability to accurately extract key information from samples in the frequency domain, thereby mitigating the interference of occlusions and improving recognition.

Samples from the AT&T and ORL datasets exhibit variations in multiple angles, leading to challenges in accurate categorization. As can be seen, the conventional dimensionality reduction techniques fail to recognize the same object at different angles, resulting in very low accuracy.

On the other hand, compared to RPCA and LE, FMDR achieves superiority of 46.79% and 45.46% in Acc, 54.5% and 51.89% in Pre, respectively. For the ORL dataset, FMDR gains 64.44% Acc and 62.76% Pre, which are 32.11% and 16.05% higher than RPCA, respectively. This phenomenon indicates that FMDR can accurately describe the manifold relationship between samples according to the high-frequency kernel function, which leads to increased similarity between different angle samples of the same object, enabling correct classification.

Furthermore, the standard deviations of 100 repeated trials in th AR dataset are recorded in Table 2. The units are the same as the indicators, presented as percentages, with bold highlighting indicating optimal performance. As the results indicate, compared to the baseline method and SOTA method, the FMDR algorithm achieves the lowest standard deviation of results across multiple iterations, indicating the higher stability of the algorithm and better performance obtained in repeated experiments.

3.4. Algorithm Performance with Changes in Labeled Data Proportion

For semi-supervised algorithms, one of the best evaluation perspectives is the ability of the algorithm to maintain its effectiveness as the proportion of labeled data varies. According to our analysis, the proposed approach effectively captures the intrinsic manifold relationship within the original data in the frequency domain, thereby achieving accurate mapping in low-dimensional space. Different from the supervised classification techniques, despite relying on only a small amount of labeled data, the algorithm can still accurately classify samples based on precise low-dimensional mapping relationships. In this section, experiments are conducted to compare the performance variation of our FMDR with comparison algorithms under varying proportions of labeled data. Specifically, six levels of labeled data proportions are set from small to large: 5%, 10%, 15%, 20%, 25% and 30%. In the semi-supervised classification experiments conducted on four datasets, the accuracy variations of FMDR and the comparative algorithms are recorded in Table 3.

Achieving high accuracy with very few labeled datasets presents a challenge for all semi-supervised algorithms. All accuracies decrease due to the insufficient information, with some algorithms even becoming ineffective. However, FMDR demonstrates an advantage in accuracy even with a smaller proportion of labeled data. Despite having only 5% of labeled data, FMDR achieves the highest accuracy in four datasets, with values of 52.50%, 74.80%, 52.00% and 53.93%, respectively. As the proportion of labeled data increases, the accuracy of FMDR also improves. In the AR dataset, with labeled data comprising 10%, FMDR achieved an accuracy of 92.61%, marking a significant increase of 17.84% compared to when only 5% of the data were labeled.

The performances of all algorithms are improved when the labeled data increase. FMDR still maintains its superiority with a higher improvement effect. To visually observe the trends of each algorithm’s changes, a set of line charts are plotted in Figure 7. The red solid line represents FMDR, while other algorithms are represented by different colors of dashed lines. As the figure shows, FMDR consistently maintains its leading advantage at all four indicators as the proportion of labeled data varies.

In summary, FMDR accurately describes the manifold relationships between samples through frequency domain correlationships, enabling precise recognition and classification of samples. Moreover, FMDR demonstrates robustness to labeled data, achieving optimal classification accuracy both when labeled data are scarce and when they are sufficient.

4. Conclusions

In this paper, a novel dimensionality reduction approach is proposed, named FMDR. The existing manifold learning dimensionality reduction techniques mainly treat data in the original space and estimate the similarity between samples by distance such as absolute distance and Gaussian distance. When samples are corrupted by high-energy or non-sparse interference, such as shadows and occlusions, the distance calculated in the original space cannot accurately estimate the manifold relationships between samples. Although such noise is challenging to handle in the original space, it can be efficiently captured and separated in the frequency domain. Inspired by this idea, FMDR transforms the data into the frequency domain for noise filtering and calculates the manifold relationships between samples via a novel kernel function. Then, the graph structure is constructed based on the correlationship in the frequency domain and can be conveniently utilized to find the accurate low-dimensional representation and solve recognition tasks.

In order to evaluate the effectiveness of FMDR comprehensively, several experiments are conducted in the public image datasets. Firstly, dimensions of samples from ORL and YaleB are compressed into two dimensions by FMDR and visualized by the feature dimensions. The illustrations indicate that FMDR is able to recognize the major components for distinguishing samples and dimensionality reduction. Even with the interference such as shadows, angles and various expression, samples can be effectively distinguished in the low-dimensional space. Further, FMDR is employed to solve the semi-supervised classification problems in five datasets. According to four indicators, our approach achieves the best classification performances in the most cases compared to the baselines and the state-of-the-art algorithms. At last, by setting up labeled datasets with varying proportions, FMDR demonstrates its superiority and stability of the recognition performance as the proportion of labeled data increases from low to high.

In conclusion, FMDR leverages the correlationship in the frequency domain of samples to obtain the intrinsic high-dimensional manifold relationship, which enables FMDR to find an optimal low-dimensional representation of the original data. According to its outstanding dimensionality reduction performance and operational efficiency, FMDR can be conveniently applied across multiple domains. It can be utilized for preprocessing tasks such as feature extraction and compressed sensing for high-dimensional data, as well as for recognition and clustering of multi-type data. This facilitates the development of subsequent intelligent algorithms and applications.

Author Contributions

Conceptualization, Z.L.; methodology, Z.L.; software, R.G. and G.T.; validation, G.T. and S.J.; formal analysis, G.T. and R.Z.; investigation, Z.L.; resources, R.Z.; writing—original draft preparation, Z.L.; writing—review and editing, R.Z.; visualization, S.J.; supervision, R.Z.; project administration, R.Z.; funding acquisition, R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Key-Area Research and Development Program of Guangdong Province grant number 2022B0701180001.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Espadoto, M.; Martins, R.M.; Kerren, A.; Hirata, N.S.T.; Telea, A.C. Toward a Quantitative Survey of Dimension Reduction Techniques. IEEE Trans. Vis. Comput. Graph. 2021, 27, 2153–2173. [Google Scholar] [CrossRef] [PubMed]
Li, K.; Wan, G.; Cheng, G.; Meng, L.; Han, J. Object detection in optical remote sensing images: A survey and a new benchmark. ISPRS J. Photogramm. Remote Sens. 2020, 159, 296–307. [Google Scholar] [CrossRef]
Köppen, M. The curse of dimensionality. In Proceedings of the 5th Online World Conference on Soft Computing in Industrial Applications (WSC5), Online, 4–18 September 2000; Volume 1, pp. 4–8. [Google Scholar]
Wang, Z.; Zhang, G.; Xing, X.; Xu, X.; Sun, T. Comparison of dimensionality reduction techniques for multi-variable spatiotemporal flow fields. Ocean. Eng. 2024, 291, 116421. [Google Scholar] [CrossRef]
Zeng, C.; Xia, S.; Wang, Z.; Wan, X. Multi-Channel Representation Learning Enhanced Unfolding Multi-Scale Compressed Sensing Network for High Quality Image Reconstruction. Entropy 2023, 25, 1579. [Google Scholar] [CrossRef] [PubMed]
Vieira Sobrinho, J.L.; Teles Vieira, F.H.; Assis Cardoso, A. Two-Stage Dimensionality Reduction for Social Media Engagement Classification. Appl. Sci. 2024, 14, 1269. [Google Scholar] [CrossRef]
Al-khassaweneh, M.; Bronakowski, M.; Al-Sharoa, E. Multivariate and Dimensionality-Reduction-Based Machine Learning Techniques for Tumor Classification of RNA-Seq Data. Appl. Sci. 2023, 13, 12801. [Google Scholar] [CrossRef]
Barkalov, K.; Shtanyuk, A.; Sysoyev, A. A Fast kNN Algorithm Using Multiple Space-Filling Curves. Entropy 2022, 24, 767. [Google Scholar] [CrossRef] [PubMed]
Li, J.; Li, Y.; Li, C. Dual-Graph-Regularization Constrained Nonnegative Matrix Factorization with Label Discrimination for Data Clustering. Mathematics 2024, 12, 96. [Google Scholar] [CrossRef]
González-Díaz, Y.; Martínez-Trinidad, J.F.; Carrasco-Ochoa, J.A.; Lazo-Cortés, M.S. An Algorithm for Computing All Rough Set Constructs for Dimensionality Reduction. Mathematics 2024, 12, 90. [Google Scholar] [CrossRef]
Heidarian Dehkordi, R.; Candiani, G.; Nutini, F.; Carotenuto, F.; Gioli, B.; Cesaraccio, C.; Boschetti, M. Towards an Improved High-Throughput Phenotyping Approach: Utilizing MLRA and Dimensionality Reduction Techniques for Transferring Hyperspectral Proximal-Based Model to Airborne Images. Remote Sens. 2024, 16, 492. [Google Scholar] [CrossRef]
Yao, C.; Zheng, L.; Feng, L.; Yang, F.; Guo, Z.; Ma, M. A Collaborative Superpixelwise Autoencoder for Unsupervised Dimension Reduction in Hyperspectral Images. Remote Sens. 2023, 15, 4211. [Google Scholar] [CrossRef]
Islam, M.R.; Siddiqa, A.; Ibn Afjal, M.; Uddin, M.P.; Ulhaq, A. Hyperspectral Image Classification via Information Theoretic Dimension Reduction. Remote Sens. 2023, 15, 1147. [Google Scholar] [CrossRef]
Tharwat, A.; Gaber, T.; Ibrahim, A.; Hassanien, A.E. Linear discriminant analysis: A detailed tutorial. AI Commun. 2017, 30, 169–190. [Google Scholar] [CrossRef]
Greenacre, M.; Groenen, P.J.; Hastie, T.; d’Enza, A.I.; Markos, A.; Tuzhilina, E. Principal component analysis. Nat. Rev. Methods Prim. 2022, 2, 100. [Google Scholar] [CrossRef]
Gao, Z.; Cheong, L.F.; Wang, Y.X. Block-sparse RPCA for salient motion detection. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 1975–1987. [Google Scholar] [CrossRef]
Shi, S.; Xu, Y.; Xu, X.; Mo, X.; Ding, J. A Preprocessing Manifold Learning Strategy Based on t-Distributed Stochastic Neighbor Embedding. Entropy 2023, 25, 1065. [Google Scholar] [CrossRef]
Li, H.; Cui, J.; Zhang, X.; Han, Y.; Cao, L. Dimensionality Reduction and Classification of Hyperspectral Remote Sensing Image Feature Extraction. Remote Sens. 2022, 14, 4579. [Google Scholar] [CrossRef]
Das, S.; Routray, A.; Deb, A.K. Fast Semi-Supervised Unmixing of Hyperspectral Image by Mutual Coherence Reduction and Recursive PCA. Remote Sens. 2018, 10, 1106. [Google Scholar] [CrossRef]
Wright, J.; Ganesh, A.; Rao, S.; Peng, Y.; Ma, Y. Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Advances in Neural Information Processing Systems; Neural Information Processing Systems: La Jolla, CA, USA, 2009; Volume 22. [Google Scholar]
Tenenbaum, J.B.; de Silva, V.; Langford, J.C. A global geometric framework for nonlinear dimensionality reduction. Science 2000, 290, 2319–2323. [Google Scholar] [CrossRef]
Roweis, S.T.; Saul, L.K. Nonlinear dimensionality reduction by locally linear embedding. Science 2000, 290, 2323–2326. [Google Scholar] [CrossRef]
Belkin, M.; Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 2003, 15, 1373–1396. [Google Scholar] [CrossRef]
Zhan, K.; Li, C.; Zhu, R. A frequency domain-based machine learning architecture for short-term wave height forecasting. Ocean. Eng. 2023, 287, 115844. [Google Scholar] [CrossRef]
Stuchi, J.A.; Angeloni, M.A.; Pereira, R.F.; Boccato, L.; Folego, G.; Prado, P.V.S.; Attux, R.R.F. Improving image classification with frequency domain layers for feature extraction. In Proceedings of the 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), Tokyo, Japan, 25–28 September 2017; pp. 1–6. [Google Scholar] [CrossRef]
Song, W.; Zhang, X.; Chen, Y.; Xu, H.; Wang, L.; Wang, Y. Dimensionality Reduction and Research of Hyperspectral Remote Sensing Images Based on Manifold Learning. Preprints 2024, 2024011274. [Google Scholar] [CrossRef]
Situ, J. Contrastive Learning Dimensionality Reduction Method Based on Manifold Learning. Adv. Eng. Technol. Res. 2024, 9, 522. [Google Scholar] [CrossRef]
Sun, Y.; Chen, J.; Liu, Q.; Liu, B.; Guo, G. Dual-Path Attention Network for Compressed Sensing Image Reconstruction. IEEE Trans. Image Process. 2020, 29, 9482–9495. [Google Scholar] [CrossRef] [PubMed]
Xiaojin, Z. Semi-supervised learning using gaussian fields and harmonic functions. In Proceedings of the International Conference on Machine Learning, Los Angeles, CA, USA, 23–24 June 2003; Volume 3, p. 912. [Google Scholar]
Zheng, J. Targeted Image Reconstruction by Sampling Pre-trained Diffusion Model. In Intelligent Systems and Applications, Proceedings of the Intelligent Systems Conference, Amsterdam, The Netherlands, 7–8 September 2023; Springer: Cham, Switzerland, 2023; pp. 552–560. [Google Scholar]
Ji, P.; Reid, I.; Garg, R.; Li, H.; Salzmann, M. Adaptive low-rank kernel subspace clustering. arXiv 2017, arXiv:1707.04974. [Google Scholar]
Martinez, A.; Benavente, R. The AR Face Database: CVC Technical Report, 24; CVC: Luxembourg, 1998. [Google Scholar]
Belhumeur, P.N.; Hespanha, J.P.; Kriegman, D.J. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 711–720. [Google Scholar] [CrossRef]
Georghiades, A.S.; Belhumeur, P.N.; Kriegman, D.J. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 643–660. [Google Scholar] [CrossRef]
Chang, Y.; Liu, H. Semi-supervised classification algorithm based on the KNN. In Proceedings of the 2011 IEEE 3rd International Conference on Communication Software and Networks, Xi’an, China, 27–29 May 2011; pp. 9–12. [Google Scholar]
Yoder, J.; Priebe, C.E. Semi-supervised k-means++. J. Stat. Comput. Simul. 2017, 87, 2597–2608. [Google Scholar] [CrossRef]
Shahid, N.; Kalofolias, V.; Bresson, X.; Bronstein, M.; Vandergheynst, P. Robust principal component analysis on graphs. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 2812–2820. [Google Scholar]
Lee, D.; Seung, H.S. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems; Neural Information Processing Systems: La Jolla, CA, USA, 2000; Volume 13. [Google Scholar]
Li, S.; Li, W.; Lu, H.; Li, Y. Semi-supervised non-negative matrix tri-factorization with adaptive neighbors and block-diagonal learning. Eng. Appl. Artif. Intell. 2023, 121, 106043. [Google Scholar] [CrossRef]
Ye, F.; Chen, C.; Zheng, Z. Deep autoencoder-like nonnegative matrix factorization for community detection. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, Turin, Italy, 22–26 October 2018; pp. 1393–1402. [Google Scholar]

Figure 1. Demonstration of the FMDR. Data points of the same color belong to the same category. The goal of dimensionality reduction is to search for an accurate low-dimensional mapping that reveals the intrinsic relationships of the original data. Given a set of images, Samples (a,b), (c,d) belong to two different categories, respectively. Due to the high energy of the occlusions (such as scarfs), samples could be closed to each other in the high-dimensional manifold space, such as (b,c). FMDR consists of several key steps. (1) Frequency domain transformation and feature extraction. Firstly, the samples are transformed into the frequency domain. According to the frequency bands of noise distribution, interference is filtered out and discriminative details are emphasized. (2) Assisted by a novel kernel function, the similarities between samples are measured based on the correlationship of the frequency domain features. Due to effective extraction and the kernel function, samples of the same type have higher correlations, while the differences between samples of different types are also greater. (3) Graph structure construction. Constructing a manifold topological graph by treating sample points as vertexes and linking pairs of points with edge lengths based on their similarity. Finally, the graph structure can be conveniently applied for dimensionality reduction and recognition tasks.

Figure 2. The effectiveness of extracting high-frequency texture components from images of different objects. (a) Images of various objects, including a mountain, human face, aerial photo and wood. (b) Extraction results of high-frequency texture components.

Figure 3. The partial samples of the five real-world facial datasets.

Figure 4. Subfigure (a) presents the original image and the corresponding frequency spectrum. Subfigures (b–d) present the differences between the extracted texture components with the corresponding frequency response curve of the filter. With a bigger parameters setting, more low-frequency information is filtered out and the high-frequency texture component becomes more obvious. When the parameters are set to be too large, most of the information is abandoned, which is not sufficient for recognition. (a) The original image and the corresponding frequency spectrum. (b) Cut-off frequency is too small. (c) Appropriate setting. (d) Cut-off frequency is too large.

Figure 5. The dimensionality of the samples from the ORL dataset is reduced to two dimensions using FMDR. The original image of the selected sample circled in red is presented nearby. Two feature dimensions identified by FMDR are “facial orientation angle” and “degree of eye closure”. Along the x-axis, the degree of eye closure of the samples increases gradually from right to left. From bottom to top on the y-axis, facial orientation changes from right to left.

Figure 6. The dimensionality of the samples from the YaleB dataset is reduced to two dimensions using FMDR. The original image of the selected sample circled in red is presented nearby. Two feature dimensions identified by FMDR are “size of shadow coverage area” and “angle of the shadow”. Along the x-axis from left to right, the shadow coverage area decreases from large to small. Along the y-axis from bottom to top, the angle of the shadow changes from right to left.

Figure 7. The Variation in algorithm performance with changing proportions of labeled data. (a) The accuracy varies with the change in labeled data in ORL. (b) The accuracy varies with the change in labeled data in AR. (c) The accuracy varies with the change in labeled data in Yale. (d) The accuracy varies with the change in labeled data in YaleB.

Table 1. The performances of semi-supervised classification for facial images.

Dataset	Acc
Dataset	FMDR	${KNN}_{semi}$	${KMeans}_{semi}$	RPCA	NMF	LE	ABNMTF	DANMF
AT&T	79.81	73.89	57.00	33.02	50.31	34.35	78.35	51.95
ORL	64.44	45.59	42.25	32.33	31.85	31.85	41.30	33.60
AR	97.14	77.00	38.46	50.23	52.23	50.23	67.62	61.27
Yale	64.44	45.59	28.48	32.85	31.85	31.85	41.30	33.60
YaleB	76.18	43.30	32.14	31.38	32.63	28.07	45.56	29.05
Dataset	Pre
Dataset	FMDR	${KNN}_{semi}$	${KMeans}_{semi}$	RPCA	NMF	LE	ABNMTF	DANMF
AT&T	82.81	75.80	71.23	28.31	45.00	30.92	81.56	88.49
ORL	62.76	58.24	60.05	46.71	31.87	40.99	55.79	46.54
AR	97.87	75.54	42.62	48.58	50.34	53.58	70.15	63.11
Yale	62.76	58.24	35.09	46.71	31.87	40.99	55.79	46.54
YaleB	77.32	41.46	48.30	32.96	31.24	31.99	57.47	29.77
Dataset	Rec
Dataset	FMDR	${KNN}_{semi}$	${KMeans}_{semi}$	RPCA	NMF	LE	ABNMTF	DANMF
AT&T	79.12	79.12	62.64	31.58	48.17	35.97	78.47	55.78
ORL	62.11	52.20	46.63	34.36	29.61	38.33	47.61	31.84
AR	97.29	78.43	42.76	49.63	48.41	51.04	67.77	63.40
Yale	62.11	52.20	30.82	34.36	29.61	38.33	47.61	31.84
YaleB	78.29	43.41	33.12	29.84	31.45	28.84	46.04	29.77
Dataset	F1 score
Dataset	FMDR	${KNN}_{semi}$	${KMeans}_{semi}$	RPCA	NMF	LE	ABNMTF	DANMF
AT&T	40.19	39.35	34.01	14.54	23.13	16.73	41.23	35.77
ORL	32.37	29.28	24.10	22.05	15.99	22.34	25.23	18.45
AR	53.91	43.07	20.59	26.71	26.15	28.84	47.39	42.81
Yale	32.37	29.28	17.86	22.05	15.99	22.34	25.23	18.45
YaleB	39.68	21.50	19.23	15.68	16.42	16.06	34.01	23.38

The highest values are emphasized in bold font.

Table 2. Standard deviations of 100 repeated trials in the AR dataset.

Algorithm	Acc	Pre	Rec	F1
FMDR	2.34	3.27	3.43	2.87
KNN	4.24	5.63	6.02	3.50
Kmeans	3.62	3.53	3.69	3.77
ATNMTF	5.22	5.89	7.34	5.35
LE	3.55	4.17	4.53	2.95
NMF	4.32	5.37	5.13	3.06
RPCA	3.94	4.55	4.58	3.26
DANMF	3.02	4.21	4.88	3.23

The highest values are emphasized in bold font.

Table 3. The accuracy with different proportions of labeled data.

Dataset	Ratio	Acc
Dataset	Ratio	FMDR	${KNN}_{semi}$	${KMeans}_{semi}$	RPCA	NMF	LE	ABNMTF	DANMF
ORL	5%	52.50	48.89	22.50	35.83	45.56	35.28	51.39	52.22
	10%	55.83	46.11	33.00	35.83	46.11	36.11	52.22	53.61
	15%	63.86	53.87	32.75	46.69	60.76	49.07	58.33	58.77
	20%	71.91	66.77	42.25	46.46	51.23	30.88	57.99	59.63
	25%	68.55	68.55	45.50	59.45	64.60	60.48	61.19	62.66
	30%	70.44	64.56	45.75	58.48	64.97	60.63	62.77	63.84
AR	5%	74.80	53.20	22.31	30.00	30.40	30.40	47.60	44.00
	10%	92.61	74.03	25.77	46.98	41.30	43.53	53.04	49.00
	15%	94.57	75.11	27.31	49.33	48.87	50.45	62.53	51.67
	20%	97.14	77.00	38.46	50.23	52.80	50.23	67.62	61.27
	25%	96.94	81.63	38.85	57.25	57.95	57.00	73.85	56.36
	30%	96.26	81.05	40.38	60.21	64.25	58.51	72.13	75.44
Yale	5%	52.00	41.33	16.67	25.33	25.33	25.33	32.00	20.67
	10%	56.00	45.33	22.03	24.00	24.67	24.67	35.33	25.67
	15%	58.52	45.26	25.39	38.52	33.58	35.04	39.71	28.53
	20%	64.44	45.26	28.48	32.85	31.85	31.85	41.30	33.60
	25%	57.02	48.36	28.27	38.52	45.60	40.32	44.63	41.20
	30%	62.50	51.22	30.60	38.21	37.90	38.02	41.94	42.80
YaleB	5%	53.93	24.29	13.67	22.70	27.59	22.02	19.95	22.53
	10%	60.09	32.42	21.13	26.72	29.17	26.16	26.35	25.76
	15%	68.45	41.03	24.36	31.91	34.47	33.11	42.40	28.08
	20%	76.18	43.30	32.14	31.38	32.63	28.07	45.56	29.05
	25%	77.08	50.11	37.30	37.38	41.03	37.90	49.02	31.50
	30%	77.32	52.86	40.36	40.16	43.19	40.20	53.72	32.70

The highest values are emphasized in bold font.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liang, Z.; Gong, R.; Tan, G.; Ji, S.; Zhan, R. A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification. Appl. Sci. 2024, 14, 5342. https://doi.org/10.3390/app14125342

AMA Style

Liang Z, Gong R, Tan G, Ji S, Zhan R. A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification. Applied Sciences. 2024; 14(12):5342. https://doi.org/10.3390/app14125342

Chicago/Turabian Style

Liang, Zexiao, Ruyi Gong, Guoliang Tan, Shiyin Ji, and Ruidian Zhan. 2024. "A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification" Applied Sciences 14, no. 12: 5342. https://doi.org/10.3390/app14125342

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Frequency Domain Kernel Function-Based Manifold Dimensionality Reduction and Its Application for Graph-Based Semi-Supervised Classification

Abstract

1. Introduction

2. Materials and Methods

2.1. Related Background

2.1.1. Discrete Fourier Transformation for Two-Dimensional Image

2.1.2. High-Frequency Texture Component

2.2. The Proposed Method

3. Experiments and Discussions

3.1. Preparations

3.1.1. Datasets

3.1.2. The Filter and Parameters

3.1.3. Comparison Methods and Performance Indicators

3.2. Visualization of Dimensionality Reduction

3.3. Semi-Supervised Classification for Facial Images

3.4. Algorithm Performance with Changes in Labeled Data Proportion

4. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI