1. Introduction
A hyperspectral image [
1,
2,
3,
4] contains hundreds of continuous narrow spectral bands, spanning the visible to infrared spectrum. Hyperspectral sensors have attracted much interest in remote sensing for providing abundant and valuable information over the last few decades. With the useful information, HSI has played a vital role in many applications, among which classification [
5,
6,
7] is one of the crucial processing steps that has received enormous attention. The foremost task in hyperspectral classification is to train an effective classifier with the given training set from each class. Therefore, sufficient training samples are crucial to train a reliable classifier. However, in reality, it is time-consuming and expensive to obtain a large number of samples with class labels. This difficulty will result in the curse of dimensionality (i.e., Hughes phenomenon) and will induce the risk of overfitting.
Much work has been carried out to design suitable classifiers to deal with the above-mentioned problems in the last decades. In general, those methods can be categorized into three types, i.e., unsupervised, supervised and semi-supervised methods. Unsupervised methods focus on training models from large unlabeled samples. Since no labeled samples are required, the unsupervised methods can be easily applied in the hyperspectral processing area. Many unsupervised methods, such as fuzzy clustering [
8], fuzzy C-Means method [
9], artificial immune algorithm [
10], graph-based method [
11], have demonstrated impressive results in hyperspectral classification. However, one cannot ensure the relationship between clusters and classes with too little priori knowledge.
Supervised classifiers, which are widely used in hyperspectral classification, can yield improved performance by utilizing the priori information of the class labels. Typical supervised classifiers include the support vector machine (SVM) [
12,
13], artificial neural networks (ANN) [
14] and sparse representation-based classification (SRC) [
15,
16], etc. SVM is a kind of kernel-based method that aims at exploring the optimal separating hyperplane between different classes, ANN is motivated by the biological learning process of human brain, while SRC stems from the rapid development of compressed sensing in recent years. Versatile as the supervised classifiers are, their performance heavily depends on the number of labeled samples. In contrast to the urgent needs of labeled samples, they ignore the large number of unlabeled samples to assist classification.
Semi-supervised learning is designed to alleviate the “small-sample problem” by utilizing both the limited labeled samples and the wealth of unlabeled samples that can be easily obtained without significant cost. The semi-supervised methods can be roughly divided into four types: (1) generative models [
17,
18], which estimate the conditional density to obtain the labels of unlabeled samples. (2) Low density separation, which aims to place boundaries in regions where few samples (labeled or unlabeled) existed. One of the state-of-the-art algorithms is the transductive support vector machine (TSVM) [
19,
20,
21]. (3) Graph-based methods [
22,
23,
24,
25,
26] that utilize labeled and unlabeled samples to construct graphs and minimize the energy function, and thus, assigning labels to unlabeled samples. (4) Wrapper-based methods, which apply a supervised learning method iteratively and a certain amount of unlabeled samples are labeled in each iteration. The self-training [
27,
28] and co-training [
29,
30] algorithms are commonly-used wrapper-based methods.
Notably that the samples within a small neighborhood are likely to belong to the same class and thus, the spatial correlation between neighboring samples can be incorporated into the classification to further improve the performance of the classifiers. For instance, the spatial contextual information [
31,
32,
33,
34,
35,
36,
37,
38,
39] can be extracted by various spatial filters. Segmentation methods (e.g., watershed segmentation [
40] and superpixel segmentation [
41,
42]) can also be adopted to exploit the spatial homogeneity of the HSI. One can also use the spatial similarity of neighboring samples [
43,
44,
45] in the classification stages. Regularizations [
15,
46,
47,
48,
49,
50,
51,
52,
53] can be added in the classifiers to refine the classification performance. Different from the above-mentioned vector/matrix-based methods, there are some three-dimension (3D)/tensor-based methods [
34,
54,
55,
56,
57] that respect the 3D nature of the HSI and process the 3D cube as a whole entity. The 3D/tensor-based methods have demonstrated considerable improvement since the joint spectral-spatial structure information is effectively exploited.
However, most of the aforementioned methods can only extract features of the original HSI dataset in a shallow manner. Deep learning [
58], which can hierarchically obtain the high-level abstract representation, has recently become a hotspot in the image processing area, especially in hyperspectral classification. Typical deep architectures involve the stacked autoencoder (SAE) [
59], deep brief network (DBN) [
60] and convolutional neural networks (CNN) [
61]. The above-mentioned classification frameworks are supervised, which require a large number of labeled samples for training. Recently, a semi-supervised classifier based on multi-decision labeling and contextual deep learning (i.e., CDL-MD-L) is proposed by [
62], which has demonstrated promising results in hyperspectral classification.
In this paper, a generative adversarial networks (GANs)-based semi-supervised method is proposed for hyperspectral classification. To extract the spectral-spatial features, we extend the existing two-dimensional bilateral filter (2DBF) [
36,
63,
64] into its three-dimensional version (i.e., 3DBF), which is a non-iterative method for nonlinear and edge-preserving smoothing. The 3DBF is suitable for spectral-spatial feature extraction since it respects the 3D nature of the HSI cube. Subsequently, the outputs of the previous step can be utilized to train GANs [
65,
66], which are promising neural networks that have been the focus of attention in recent years. In this paper, the GANs are trained for semi-supervised classification of HSI to use the limited labeled samples and vast of unlabeled samples. The semi-supervised learning is performed by adding samples from the generators to the extracted features and increasing the dimension of the classifier output.
Compared to the existing literature, the contribution of this paper lies in two aspects:
We extract the spectral-spatial features by the 3DBF. Compared to the vector/matrix-based methods, the structural features extracted by the 3DBF can effectively preserve the spectral-spatial information by naturally obeying the 3D form of the HSI and treating the 3D cube as a whole entity.
We classify the HSI in a semi-supervised manner by the GANs. Compared to the supervised methods, the GANs can utilize both limited training samples and abundant of unlabeled samples. Compared to the non-adversarial networks, the GANs take advantage of the discriminative models to train the generative network based on game theory.
The remaining part of this paper is organized as follows.
Section 2 describes the proposed semi-supervised classification method in detail.
Section 3 reports the experimental results and analyses on three benchmark HSI datasets. Finally, discussions and conclusions are drawn in
Section 4 and
Section 5.
2. Proposed Semi-Supervised Method
The conceptual framework of the proposed method is shown in
Figure 1, which is composed of two parts: (1) feature extraction; (2) semi-supervised learning. The spectral-spatial features of the original HSI cube
can be extracted by the 3DBF, which is a 3D filter that can obey the 3D nature of the HSI and extract the spectral-spatial features simultaneously. Subsequently, GANs are utilized in the feature space for semi-supervised classification by taking full advantage of both the limited labeled samples and the sufficient unlabeled samples. The classification map can be achieved by visualizing the classification results of different samples.
It is noteworthy that both 3DBF and GANs are of great importance for semi-supervised learning of HSI classification. On the one hand, 3DBF is adopted for extracting the spectral-spatial features of the HSI. As emphasized in
Section 1, incorporating spatial information into the hyperspectral classification helps to improve the performance the classifiers, and thus, exploring spectral-spatial feature extraction methods has become an important research topic in the hyperspectral community. In addition, since the HSI data is naturally a 3D cube, the 3D/tensor-based methods are more effective to extract the joint spectral-spatial structure information than the vector/matrix-based methods. As will be shown in
Section 3.3, the GANs with the original spectral features (i.e., Spec-GANs) provide much worse performance than the GANs with 3DBF features (i.e., 3DBF-GANs), which further highlights the significance of the 3DBF. On the other hand, GANs are utilized for semi-supervised classification of the HSI. The recent development of deep learning has opened up new opportunities for hyperspectral classification. GANs, which are newly proposed deep architectures for training deep generative models by a minimax game, have shown promising results in unsupervised/semi-supervised learning. Although the GANs have been successfully employed in various areas and demonstrated remarkable success, the application of GANs in semi-supervised hyperspectral classification has never been addressed in the literature to the best of our knowledge. Therefore, it is valuable for us to represent the first attempt to develop a semi-supervised hyperspectral classification framework based on GANs. In this section, we introduce the detailed procedure of the proposed semi-supervised classification method, elaborating on the spectral-spatial feature extraction based on 3DBF and semi-supervised classification of HSI by GANs.
2.1. Spectral-Spatial Features Extracted by 3D Bilateral Filter
The bilateral filter was originally introduced by [
63] under the name “SUSAN”. It was then rediscovered by [
67] termed as “bilateral filter”, which is now the widely used name in the literature. Over the past few years, the bilateral filter has emerged as a powerful tool for several applications, such as image denoising [
64] and hyperspectral classification [
36]. The great success of the bilateral filter stems from several properties. It is a local, non-iterative and simple filter, which smooths images while preserving edges in terms of a nonlinear combination of the neighboring pixels. Although the bilateral filter has announced impressive results in hyperspectral classification, it is performed in each two-dimensional probability map, and thus ignoring the 3D nature of the HSI cube.
In this paper, we extend the bilateral filter to 3DBF for spectral-spatial feature extraction of the HSI volumetric data. Suppose the original HSI cube can be represented as
, where
and
b indicate the number of rows, columns and spectral bands, respectively, the result
of the 3DBF, which replaces each pixel in the
by a weighted average of its neighbors, can be defined by
with
where
refers to the coordinate of the HSI cube
, i.e.,
,
indicates the index of the neighborhoods centered at
,
denotes the normalizing term of the neighborhood pixels
,
and
are the Gaussian filters measuring the distance in the 3D image domain (i.e., the spectral-spatial domain
) and the distance on the intensity axis (i.e., the range domain
), respectively.
To speed up the implementation, we decompose the 3DBF into a convolution followed by two nonlinearities based on signal processing grounds. Note that the nonlinearity of the 3DBF (see Equation (
1)) originates from the division by
and the dependency on the intensities by
, we study each point separately and isolate them during computation. Multiplying both sides of Equation (
1) by
, Equations (
1) and (
2) can be rewritten as
We then define a function
whose value is 1 everywhere (
is a function whose value is 1 everywhere, i.e.,
. Therefore, the size of
is the same as that of the original HSI cube) to maintain the weighted mean property of the 3DBF and represent Equation (
3) as
The above-mentioned Equation (
4) can be equivalently expressed as
where
denotes the intensity interval,
is the Kronecker symbol with
if
, and
otherwise. Specifically,
if and only if
. The sum in Equation (
5) is over the product space
, on which we express the functions by lowercases. That means,
represents a Gaussian kernel given by
Based on Equation (
5), two functions
i and
w can be build on
by
and
Observed from the definitions of
i and
w in Equations (
7) and (
8), we have
Let the input of
be
, the input of
i and
w be
, Equation (
5) becomes
where “⊗” indicates the convolution operator.
Therefore, the 3DBF can be modeled by
where the functions
and
are defined as
.
In hyperspectral analysis, the 3D image domain (i.e., the spectral-spatial domain
) is a
volume and the range domain
is a simple axis labelled
. As described in Equation (
13), the 3DBF can be achieved by the following three steps:
Convolve and w with a Gaussian defined on . In this step, and w are “blurred” into and , respectively.
Obtain by dividing by ;
Compute the value of at to get the filtered result .
Moreover, the 3DBF can be accelerated by downsample and upsample without changing the major steps of the implementation. That is, we downsample
to obtain
, perform the convolution to generate
, followed by upsample
to get
. The remaining steps are the same as the above-mentioned steps 2 and 3. To sum up, the schematic diagram of the 3DBF can be depicted in
Figure 2, by which the original HSI cube
is filtered and the spectral-spatial feature cube
is obtained. It is worth underlining that the dimension of the 3DBF cube
is the same as that of the original HSI cube, i.e.,
. As will be shown in Figures 9 and 10, the spectral and spatial profiles of the 3DBF smooth the original data while still preserving edges.
2.2. Semi-Supervised Classification of HSI by Generative Adversarial Networks
2.2.1. Brief of Generative Adversarial Networks
GANs are newly proposed deep architectures based on adversarial nets to train the model in an adversarial fashion to generate data mimicking certain distributions. Unlike the other deep learning methods, a GAN is an architecture around two functions (see
Figure 3), i.e., a generator
G, which can map a sample from a random uniform distribution to the data distribution, and a discriminator
D, which is trained to distinguish whether a sample belongs to the real data distribution. In GANs, the generator and discriminator are learned jointly based on game theory. The generator
G and the discriminator
D can be trained in an alternating manner. In each step,
G produces a sample from the random noise
that may fool
D, and
D is then presented the real data samples as well as the samples generated by
G to classify the samples as “real” or “fake”. Subsequently,
G is rewarded for producing samples that can “fool”
D and
D for correct classification. Both functions are updated and the iteration stops until a Nash equilibrium is achieved. In greater detail, let
be the probability that
comes from the real data rather than the generator,
G and
D play a minimax game with the following value function
Much work has been carried out to improve the GAN since it was pioneered by Goodfellow et al. [
65] in 2014. Two remarkable aspects can be highlighted: theory and application. On the one hand, several improved versions of GANs in aspects of stability of training, perceptual quality, etc., have been proposed in recent literature, including the well-known deep convolutional GAN (DC-GAN) [
68], conditional GAN (C-GAN) [
69], Laplacian pyramid GAN (LAP-GAN) [
70], information-theoretic extension to the GAN (Info-GAN) [
71], unrolled GAN [
72] and Wasserstein GAN (W-GAN) [
73]. On the other hand, recent work has also shown that GANs can provide very successful results in image generation [
74], image super resolution [
75], image inpainting [
76] and semi-supervised learning [
77].
2.2.2. Generative Adversarial Networks for Classification
GANs, which can train deep generative models with a minimax game, have recently emerged as powerful tools for unsupervised and semi-supervised classification. Several unsupervised/ semi-supervised techniques motivated by the GANs have sprung up over the past few years to overcome the difficulties of labeling large amounts of training samples. For instance, DC-GAN is proposed in [
68] to bridge the gap between the success of the CNN for supervised and unsupervised learning. Several constraints are evaluated to make the convolutional GANs stable to train, and the trained discriminators are applied for image classification tasks, resulting in competitive performance with other unsupervised methods. Info-GAN is proposed in [
71] to learn disentangled representations in a completely unsupervised manner. As an information-theoretic extension to the GAN, the Info-GAN maximizes the mutual information between a small subset of the latent variables and the observation, and therefore, interpretable and disentangled representations can be learned. Categorical GAN (CatGAN) [
77], which is a framework for robust unsupervised and semi-supervised learning, combines ANN classifiers with an adversarial generative model that regularizes a discriminatively trained classifier. By heuristically understanding the non-convergence problem, an improved semi-supervised learning method is proposed in [
66], which can be regarded as a continuation and refinement of the effort in [
77]. Moreover, Premachandran and Yuille [
78] learns a deep network by generative adversarial training. Features learned by adversarial training is fused with a traditional unsupervised classification approach, i.e., k-means clustering, and the combination produces better results than direct prediction. In situation of semi-supervised classification, the adversarial training has the potential to outperform supervised learning.
Note that different versions of GANs have different objective functions and procedures, it is hard to obtain a unified architecture for describing the unsupervised/semi-supervised techniques. In this section, we try to give a schematic illustration of the procedure for unsupervised/semi-supervised learning in
Figure 4, which contains the main steps in most of the scenarios but not all of them. It is noteworthy that the logistic regression classifier based on the soft-max function is employed to discriminate different classes in
Figure 4. That means, by applying the soft-max function, the class probabilities of
can be expressed as
and the class label of
can be determined by
In addition, despite remarkable success of GANs, their applications in semi-supervised classification of HSI are surprisingly unstudied to the best of our knowledge. Therefore, this study represents the first attempt to develop a semi-supervised classification framework for the HSI.
2.2.3. Hyperspectral Classification Framework Using Generative Adversarial Networks
In hyperspectral classification, a standard classifier assigns each sample
to one of the
C possible classes based on the training samples available for each class. For instance, a logistic regression classifier takes
as input and outputs a
C-dimensional vector, which can be turned into the class probabilities by soft-max
. Classifiers like this usually have a cross-entropy objective function in supervised scenario. That means, a discriminative model can be trained by minimizing the objective function between observed labels and the model predictive distribution
. However, the supervised learning usually needs enough labeled training samples to guarantee the representativeness and prevent the classifier from overfitting, especially for a deep discriminative model with huge parameter volume such as CNN. The strong demand for abundant training samples conflicts with the fact that the labels of the samples are extremely difficult and expensive to identify. At the same time, there are vast of unlabeled samples in the HSI. Therefore, we propose a GANs-based classification method [
65,
66] to simultaneously utilize both the limited labeled samples and the sufficient unlabeled samples in a semi-supervised fashion.
To establish a new semi-supervised hyperspectral classification framework based on GANs, we add the generated samples to the HSI dataset and denote them as the
th class. The dimension of the classifier output is correspondingly increased from
C to
. The probability when
comes from
G can be represented as
, which is a substitution of
in the objective function
of the original GANs [
65]. Note that the unlabeled training samples belong to the former
C classes, we can learn from those unlabeled samples to improve the classification performance by maximizing
.
Without loss of generality, assuming half of the dataset consists of real data and half is the generated data, the loss function
L of the classifier yields
where
represents the negative log probability of the label with the data is from the real HSI features,
equals the standard GAN game-value function in case we substitute
into Equation (
19)
According to the Output Distribution Matching (ODM) cost theory of [
79], if we have
and
for some undetermined scaling function
, the unsupervised loss will be consistent with the supervised loss. As such, by combining
and
, we can get the total cross entropy loss
L, whose optimal solution can be estimated by minimizing both loss functions jointly.
Moreover, to address the instability of the unsupervised optimization part related to the GANs, we adopt a strategy called feature matching to substitute the traditional way of training the generator
G by requiring it to match the statistics characteristics of the real data. In greater detail, the generator
G is trained to match the expected value of the output
on an intermediate layer in the discriminator
D. By optimizing an alternative objective function defined as
, we obtain a fixed point where
G matches the distribution of training data. Based on the above analysis, we show a visual illustration of the semi-supervised hyperspectral classification method by GANs in
Figure 5. The network parameters of the generator
G and the discriminator
D in
Figure 5 are trained by optimizing the loss function in Equation (
17). The unlabeled data is taken as the true data
in Equation (
19) to train both generator
G and discriminator
D. Moreover, the latent space of the generator
G is chosen from the unlabeled data (To be exact, the latent space can also be chosen from the labeled data by ignoring the class labels), the noise follows the uniform distribution, and the output of the generator
G is the fake data. By jointly minimizing the loss functions in Equation (
17), the parameters of the generator
G are updated to fool the discriminator
D, and the fake examples are generated accordingly. The logistic regression classifier based on the soft-max function is adopted to perform the multi-class classification in the GANs. It is notable that the actual differences between the traditional GANs and the modified GANs used in this paper lie in threefold: (1) the objective functions are changed to make full use of both labeled and unlabeled samples; (2) the output layer of the discriminator is modified from binary classification to multi-class semi-supervised learning; (3) feature matching is adopted to improve the stability of the traditional GANs.
3. Experimental Section
In this section, we investigate the performance of the proposed method (abbreviated as 3DBF-GANs for simplicity) on three benchmark HSI datasets. A series of experiments are conducted to perform a comprehensive comparison with other state-of-the-art methods, including 2DBF [
64], SVM [
12], Laplacian SVM (LapSVM) [
22,
24] and CDL-MD-L [
62]. 2DBF and 3DBF are feature extraction methods, SVM is a widely-used supervised classifier, while LapSVM, GANs and CDL-MD-L are classifiers based on semi-supervised learning. Moreover, the original spectral features are also considered as a baseline for comparison.
3.1. Dataset Description
In the experiments, three publicly available hyperspectral datasets (i.e., Indian Pines data, University of Pavia data and Salinas data) are employed as benchmark datasets. What follows are details of the three hyperspectral datasets.
Indian Pines data: the first dataset was captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor over the agricultural Indian Pines test site in the Northwestern Indiana, USA, on 12 June 1992. The original image contains 224 spectral bands. After removing 4 bands full of zero and 20 bands affected by noise and water-vapor absorption, 200 bands are left for experiments. It consists of
pixels with a spatial resolution of 20 m per pixel, and the spectral coverage ranging from
to
m.
Figure 6 depicts the color composite of the image as well as the ground truth map. There are 16 classes of interest and the number of samples in each class is displayed in
Table 1, whose background color denotes different classes of land-covers. Since the number of samples is unbalanced and the spatial resolution is relatively low, it poses a big challenging to the classification task.
University of Pavia data: the second dataset was acquired by the Reflective Optics System Imaging Spectrometer (ROSIS) sensor over an urban area surrounding the University of Pavia, northern Italy, on 8 July 2002. The original data contains 115 spectral bands ranging from
to
m and the size of each band is
with a spatial resolution of
m per pixel. After removing 12 noisiest channels, 103 bands remained for experiments. The dataset contains 9 classes with various types of land-covers. The color composite image together with the ground truth data are shown in
Figure 7. The detailed number of samples in each class is listed in
Table 2, whose background color also corresponds to the color in
Figure 7.
Salinas data: the third dataset was collected by the AVIRIS sensor over the Salinas Valley, Southern California, USA, on 8 October 1998. The original dataset contains 224 spectral bands covering from the visible to short-wave infrared light. After discording 20 water absorption bands, 204 bands are preserved for experiments. This dataset consists of
pixels with a spatial resolution of
m per pixel. The color composite of the image and the ground truth are plotted in
Figure 8, which contains 16 classes of interest. The detailed number of classes in each class is shown in
Table 3, whose background color represents different classes of land-covers.
3.2. Experimental Setup
In order to evaluate the performance of the proposed 3DBF-GANs method, we compare it with some other algorithms, i.e., 2DBF, SVM, LapSVM, and CDL-MD-L. The original spectral features (abbreviated as “Spec”) are also considered in the experiments. Specifically, the “Spec”, 2DBF and 3DBF are feature extraction methods, while SVM, LapSVM and CDL-MD-L are supervised/semi-supervised classifiers. The LapSVM, which is a graph-based semi-supervised learning method, introduces an additional manifold regularizer on the geometry of both unlabeled and labeled data in terms of the Graph Laplacian. It has been applied to hyperspectral classification and the results have demonstrated the advantage of this graph-based method in semi-supervised classification of the HSI. As to the GANs, the standard framework is used, except for adding a softmax classifier in the output of the discriminator and adopting feature matching to improve the stability of the original GANs. By combining the feature extraction and classification methods in pairs, 12 methods (i.e., Spec-SVM, Spec-LapSVM, Spec-GANs, Spec-CDL-MD-L, 2DBF-SVM, 2DBF-LapSVM, 2DBF-CDL-MD-L, 2DBF-GANs, 3DBF-SVM, 3DBF-LapSVM, 3DBF-CDL-MD-L and 3DBF-GANs) are obtained for comparison. Since the spectral-spatial information is used in the original CDL-MD-L, Spec-CDL-MD-L and 3DBF-CDL-MD-L denote the input of the CDL-MD-L is the original HSI and the dataset given by 3DBF, respectively.
In the experiments, a training/test sample is a single pixel, whose size is . Each pixel can be taken as the feature of a certain class and classified by the discriminator of the GANs or other classifiers. Each pixel corresponds to a unique label. The whole cube contains many pixels and therefore, has lots of labels. All the HSI datasets are normalized between zero and one at the beginning of the experiments. All the experiments are implemented on the normalized hyperspectral datasets, whose available data is randomly divided into two parts, i.e., about 60% for training and the rest for testing. In all the datasets, very limited labeled samples, i.e., 5 samples per class, are randomly selected from the training samples as labeled samples, and the remaining ones are used as unlabeled samples. The experiments are repeated ten times using random selection of training and test sets, and the average accuracies are reported. To assess the experimental results quantitatively, we compare the aforementioned methods by three popular indexes, i.e., overall accuracy (OA), average accuracy (AA) and kappa coefficient (). Moreover, the F-Measure of various methods is also compared.
For the parameter settings, since the number of labeled samples is limited, leave one out cross validation is adopted in this paper. The range of the filtering size
and blur degree
in the 2DBF are selected in the range of
and
, respectively, whereas both
and
in the 3DBF are chosen from
. In the SVM and LapSVM, radial basis function (RBF) kernels are adopted. The RBF parameter
is obtained from the range
and the penalty term is set to 60. 4 spectral neighbors are adopted to calculate the Laplacian graph in the LapSVM. Three layers are used in the CDL-MD-L, whose window size and the number of hidden units are set to the same as [
62]. The generator in the GANs has two hidden layers, and the number of units is set to 500 and 300, respectively. In the discriminator, three hidden layers are adopted, and the number of units is set to
and 150, respectively. Gaussian noise is added to the output of each layer of the discriminator. Moreover, the learning rate and training epoch are set to 0.001 and 100, respectively.
3.3. Experimental Results
To demonstrate the effectiveness of the 3DBF for spectral-spatial feature extraction, we compare the spectral profiles of the pixel (18,6) from the original Indian Pines data, and the features obtained by the 2DBF and the 3DBF in
Figure 9. Moreover, the spatial scenes of the 4th, 22nd, 34th bands are compared in
Figure 10. As can be seen, the profiles of 3DBF preserve the trend of the original data while provide smoother features in both spectral and spatial domains.
The qualitative evaluations of various methods are shown in
Table 4,
Table 5 and
Table 6, and the classification maps are also visually compared in
Figure 11,
Figure 12 and
Figure 13. Based on the above-mentioned experimental results, a few observations and discussions can be highlighted. It can be first seen that, the methods (i.e., Spec-SVM, 2DBF-SVM, and 3DBF-SVM) using only the limited labeled training samples provide worse classification performance than the semi-supervised methods that take the unlabeled training samples into consideration. This stresses yet again the importance of unlabeled samples for HSI classification. For instance, it is observed from
Table 4 that the SVM leads to lower classification accuracies than other classifiers (i.e., LapSVM, CDL-MD-L and GANs). Taking the same original “Spec” features as inputs, the OA of SVM is 2.15%, 23.28% and 9.49% lower than those of the LapSVM, CDL-MD-L and GANs, respectively. Similar properties can also be found in
Table 5 and
Table 6. The above-mentioned phenomena demonstrate the effectiveness of utilizing the abundant unlabeled samples for the HSI data.
Second, the “Spec”-based features provide higher classification errors than the 2DBF/3DBF-based features. As shown in
Table 5, the OA, AA,
and F-Measure of Spec-SVM are lower than those of the 2DBF-SVM and 3DBF-SVM. Similarly, the OA, AA,
and F-Measure of Spec-LapSVM/ CDL-MD-L/GANs are also lower than 2DBF-LapSVM/CDL-MD-L/GANs and 3DBF-LapSVM/ CDL-MD-L/GANs. It is also clearly visible that more scattered noise is generated in
Figure 12a than in
Figure 12e,i. This is due to the fact that the “Spec” features based only on spectral characteristics, while 2DBF and 3DBF methods can effectively incorporate the spatial information. Since the CDL-MD-L can make use of both spectral and spatial information in the classification process, the classification accuracies of Spec-CDL-MD-L are much higher than those of the Spec-SVM, Spec-LapSVM and Spec-GANs. As shown in
Table 5, the OA of Spec-CDL-MD-L is at least 8% higher than other classifiers. Moreover, with the same classifiers, the 3DBF performs much better than 2DBF. For instance, the OA of 3DBF-GANs in
Table 5 is about 4% higher than that of the 2DBF-GANs. The reason for good results of 3DBF is that it exploits the spectral-spatial features by obeying the 3D nature of the HSI cube.
Finally, as to different classifiers, the GANs with 2DBF or 3DBF features provides better or comparable classification results as compared with SVM, LapSVM and CDL-MD-L. It is observed from
Table 4 that the OA of 2DBF-GANs is 19.65%, 17.02% and 0.25% higher than those of the 2DBF-SVM, 2DBF-LapSVM and 2DBF-CDL-MD-L, respectively, the OA of 3DBF-GANs is also much higher than 3DBF-SVM and 3DBF-LapSVM, and slightly higher than 3DBF-CDL-MD-L. Classification results of the University of Pavia data (see
Table 5) and the Salinas data (see
Table 6) also yield similar properties. Specifically, it is noteworthy that the “meadows” (i.e., class 2) and “bare soil” (i.e., class 6) in the University of Pavia are difficult to be separated, and the classification accuracies of those two classes obtained by the 3DBF-GANs outperform other methods (see
Table 5). Moreover, the GANs with the original spectral features are much inferior to the CDL-MD-L. As shown in
Table 4, the OA of Spec-GANs is 13.79% less than that of the Spec-CDL-MD-L. In
Table 5 (or
Table 6), the OA of Spec-GANs is also 8.17% (or 5.55%) lower than the Spec-CDL-MD-L. The main reason why Spec-GANs obtains poor results is the ignorance of spatial information. In a nutshell, the afore-mentioned analysis validates the effectiveness of the proposed 3DBF-GANs method in semi-supervised hyperspectral classification.