Article

How to Learn More? Exploring Kolmogorov–Arnold Networks for Hyperspectral Image Classification

by Ali Jamali 1,*, Swalpa Kumar Roy 2, Danfeng Hong 3,4, Bing Lu 1 and Pedram Ghamisi 5,6

1 Department of Geography, Simon Fraser University, 8888 University Dr, Burnaby, BC V5A 1S6, Canada
2 Department of Computer Science and Engineering, Alipurduar Government Engineering and Management College, Bakla 736206, India
3 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
4 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
5 Machine Learning Group, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Helmholtz Institute Freiberg for Resource Technology, 09599 Freiberg, Germany
6 Lancaster University, Lancaster LA1 4YR, UK
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(21), 4015; https://doi.org/10.3390/rs16214015
Submission received: 28 August 2024 / Revised: 23 October 2024 / Accepted: 24 October 2024 / Published: 29 October 2024

Abstract

Convolutional neural networks (CNNs) and vision transformers (ViTs) have shown excellent capability in complex hyperspectral image (HSI) classification. However, these models require a significant amount of training data and substantial computational resources. On the other hand, modern Multi-Layer Perceptrons (MLPs) have demonstrated great classification capability. These modern MLP-based models require significantly less training data compared with CNNs and ViTs, achieving state-of-the-art classification accuracy. Recently, Kolmogorov–Arnold networks (KANs) were proposed as viable alternatives for MLPs. Because of their internal similarity to splines and their external similarity to MLPs, KANs are able to optimize learned features with remarkable accuracy, in addition to being able to learn new features. Thus, in this study, we assessed the effectiveness of KANs for complex HSI data classification. Moreover, to enhance the HSI classification accuracy obtained by the KANs, we developed and proposed a hybrid architecture utilizing 1D, 2D, and 3D KANs. To demonstrate the effectiveness of the proposed KAN architecture, we conducted extensive experiments on three newly created HSI benchmark datasets: QUH-Pingan, QUH-Tangdaowan, and QUH-Qingyun. The results underscored the competitive or better capability of the developed hybrid KAN-based model across these benchmark datasets over several other CNN- and ViT-based algorithms, including 1D-CNN, 2D-CNN, 3D-CNN, VGG-16, ResNet-50, EfficientNet, RNN, and ViT.

1. Introduction

Hyperspectral remote sensing has drawn a lot of interest lately for a variety of Earth observation applications [1,2,3,4]. Mapping the physical, biological, or geographical dimensions of ecosystems is necessary to monitor the temporal and spatial patterns of Earth surface activities and comprehend how they work. Because each pixel contains a wealth of spectral information, hyperspectral imaging (HSI) has been applied extensively in a variety of real-world applications, including precision agriculture [5], military object detection [6], and land use land cover mapping [7,8]. Because it offers precise and detailed information about the physical and chemical properties of the objects being imaged, HSI has grown to be an essential tool in the industry. Notably, these detailed features enable effective classification in cases that are too intricate for conventional methods, e.g., where there is a nonlinear correlation between the acquired spectra and the corresponding objects, such as buildings [3].
As opposed to the standard panchromatic and multi-spectral imagery captured by satellites, HSI supplies hundreds of contiguous narrow spectral bands, offering a more detailed and accurate means of discerning Earth objects [9]. HSI is especially useful for finer-grained classification because of its capacity to identify subtle spectral characteristics that standard imagery is unable to detect [10]. The majority of techniques used in the early stages of HSI classification research concentrated on handcrafted feature extraction, such as extended morphological profiles (EMPs) [11] and extended extinction profiles (EEPs). However, these conventional classification techniques are limited in their ability to retrieve high-level image characteristics, and they are associated with “shallow” models. As a result, these techniques typically fall short of achieving greater accuracy. Recently, deep learning (DL) has been established as a powerful feature extractor that effectively handles the nonlinear problems arising in a variety of computer vision tasks. This has led to encouraging outcomes when applying DL to HSI data classification [12,13,14].
Convolutional neural networks (CNNs), because of their superior local contextual modeling capabilities, are widely used in spectral–spatial HSI data classification. While CNN-based methods are advantageous for spatial–contextual identification, they struggle with spectral sequential data because long-range dependencies are often difficult for CNNs to capture correctly [15]. While the existing CNN-based techniques have shown promising results [16], they continue to encounter several difficulties. For instance, the receptive field is constrained, data are lost during the downsampling phase, and deep networks require a large amount of processing power [17]. On the other hand, in the field of computer vision, vision transformers (ViTs) have recently demonstrated significant promise [18,19,20,21,22]. By incorporating a multi-layer perceptron (MLP) and a multi-headed self-attention (MHSA) module, ViTs are capable of capturing global long-range interactions in the input sequential data. Because of this capability, the application of transformers to the classification of HSI data is expanding rapidly [23,24,25,26].
However, due to their quadratic computational complexity, transformers need a substantially higher amount of training data than CNNs and have a relatively high computational cost [27]. ViTs and CNNs have been surpassed in image classification tasks by modern MLP algorithms, such as MLP-Mixer [27] and ResMLP [28], which have demonstrated an excellent classification capability. These modern MLP models require significantly less training data compared to CNNs and ViTs, achieving state-of-the-art classification accuracy [29]. In addition, SpectralMamba [30] has been proposed for hyperspectral image classification to further reduce computational complexity while effectively improving the classification performance. This work is notable as the first to introduce the Mamba framework into the hyperspectral remote sensing field.
In the past few months, Kolmogorov–Arnold Networks (KANs), which are inspired by the Kolmogorov–Arnold representation theorem, were proposed as viable alternatives to MLPs [31]. KANs employ learnable activation functions on edges, or “weights”, whereas MLPs have fixed activation functions on nodes, or “neurons”. KANs do not use any linear weights at all; instead, a univariate function with spline parameterization serves as a substitute for each weight parameter. Thus, in this research, we assess and evaluate the capability and effectiveness of KAN models for complex HSI data classification against several other CNN- and ViT-based models. The contributions of this paper can be summarized as follows:
  • We introduce a hybrid architecture based on KANs, a technique that achieves competitive or better HSI classification accuracy over several well-known CNN- and ViT-based algorithms.
  • We incorporate 1D, 2D, and 3D KAN modules to enhance the ability of linear KANs in image classification tasks. This hybrid architecture increases the discriminative capability of the KAN architecture.
  • We conduct extensive experiments on a brand new, complex HSI dataset called Qingdao UAV-borne HSI (QUH), including QUH-Tangdaowan, QUH-Qingyun, and QUH-Pingan [32]. These experiments prove the effectiveness of the proposed KAN architecture.
The remainder of the paper is structured as follows. In Section 2, we examine the structure and various modules developed in the proposed KAN-model-based architecture. Section 3 describes the HSI benchmark datasets and the experimental settings. Subsequently, we conduct comprehensive experiments, including a thorough discussion of the obtained HSI data classification results, as detailed in Section 4. The paper concludes with a summary provided in Section 5.

2. Proposed Methodology

Multilayer perceptrons (MLPs) are the foundation of many modern deep learning models. KANs were recently presented as an alternative to MLPs [31]. KANs are motivated by the Kolmogorov–Arnold representation theorem [33], whereas MLPs are inspired by the universal approximation theorem. Similar to MLPs, KANs have fully-connected structures. But MLPs employ fixed activation functions on nodes (referred to as “neurons”), while KANs place learnable activation functions on edges (referred to as “weights”). Instead of using linear weight matrices, KANs use a learnable 1D function parametrized as a spline for each weight parameter. Nodes in KANs do nothing more than add up incoming signals without using any non-linearities. The straightforward modification of KANs to use an activation function on the edges allows them to surpass MLPs in terms of accuracy as well as interpretability on small-scale machine learning challenges. In function-fitting tasks, smaller KANs can attain accuracy levels that are comparable to or higher than larger MLPs. KANs are known to have faster neural scaling laws than MLPs, both in theory and in practice [31]. Splines can be easily adjusted locally, are precise for low-dimensional functions, and can transition between different resolutions. However, due to their limited ability to take advantage of compositional structures, splines suffer greatly from the curse of dimensionality (COD). In contrast, MLPs are less prone to COD because of their feature learning capabilities. However, in low dimensions, their accuracy is inferior to splines due to their incapacity to optimize univariate functions. It should be noted that KANs are just combinations of splines and MLPs, utilizing their respective advantages and avoiding their respective disadvantages, despite their sophisticated mathematical interpretation. In order to correctly learn a function, the model must be able to approximate the univariate functions (internal degrees of freedom) as well as learn the compositional structure (external degrees of freedom). Because of their internal similarity to splines and their external similarity to MLPs, KANs are able to optimize learned features with remarkable accuracy in addition to being able to learn new features.
Are KANs similar to MLPs? An MLP can be expressed as a stack of $N$ layers, where each layer is a linear transformation by a weight matrix $W$ followed by a non-linear operation $\delta$, applied to the input $x \in \mathbb{R}^{p_{in}}$:
\[ \mathrm{MLP}(x) = \left(W_{N-1} \circ \delta \circ W_{N-2} \circ \cdots \circ W_{1} \circ \delta \circ W_{0}\right) x \quad (1) \]
On the other hand, a general KAN model consists of nesting N layers and the output map can be defined as follows:
\[ \mathrm{KAN}(x) = \left(\Phi_{N-1} \circ \Phi_{N-2} \circ \cdots \circ \Phi_{1} \circ \Phi_{0}\right) x \quad (2) \]
where $\Phi_i$ represents the $i$-th layer of the entire KAN model. Let $p_{in}$ and $p_{out}$ be the input and output dimensions of each KAN layer; then, $\Phi$ consists of $p_{in} \times p_{out}$ 1D learnable activation functions $\phi$:
\[ \Phi = \{\phi_{i,j}\}, \quad i = 1, 2, \ldots, p_{in}, \;\; j = 1, 2, \ldots, p_{out} \quad (3) \]
The computation of a KAN model from layer $n$ to layer $n+1$ may be shown in matrix form as follows:
\[
X_{n+1} =
\underbrace{\begin{pmatrix}
\phi_{n,1,1}(\cdot) & \phi_{n,1,2}(\cdot) & \cdots & \phi_{n,1,p_n}(\cdot) \\
\phi_{n,2,1}(\cdot) & \phi_{n,2,2}(\cdot) & \cdots & \phi_{n,2,p_n}(\cdot) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_{n,p_{n+1},1}(\cdot) & \phi_{n,p_{n+1},2}(\cdot) & \cdots & \phi_{n,p_{n+1},p_n}(\cdot)
\end{pmatrix}}_{\Phi_n} X_n \quad (4)
\]
It is evident that KANs treat non-linearities and linear transformations jointly in $\Phi$, whereas MLPs treat them separately as $W$ and $\delta$. To ensure the representational power of $\phi_{i,j}$ and $\Phi_i$, as shown in Figure 1, the KAN models include a basis function $b(x)$ (similar in role to a residual connection), such that the activation function $\phi(x)$ is the sum of the spline function and the basis function $b(x)$, defined by:
\[ \phi(x) = w_b\, b(x) + w_s\, \mathrm{spline}(x) \quad (5) \]
where $b(x) = \mathrm{silu}(x) = x/(1 + e^{-x})$, $\mathrm{spline}(x) = \sum_i c_i B_i(x)$, and the coefficients $c_i$ are trainable. For more details, refer to Liu et al. [31].
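To make the edge-activation idea concrete, the snippet below is a minimal PyTorch sketch of a KAN-style layer in which every input–output edge carries its own learnable function $\phi(x) = w_b\,\mathrm{silu}(x) + w_s\,\mathrm{spline}(x)$. For brevity, the B-spline term is approximated by Gaussian bumps on a fixed grid with learnable coefficients; this is an illustrative simplification under stated assumptions, not the implementation of Liu et al. [31].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleKANLayer(nn.Module):
    """Minimal KAN-style layer: each input-output edge has its own learnable
    activation phi(x) = w_b * silu(x) + w_s * sum_i c_i * B_i(x). The B-spline
    basis B_i is replaced by Gaussian bumps on a fixed grid to keep the sketch
    short; nodes only sum the incoming edge outputs (no extra non-linearity)."""

    def __init__(self, in_dim, out_dim, num_basis=8, grid_range=(-2.0, 2.0)):
        super().__init__()
        grid = torch.linspace(grid_range[0], grid_range[1], num_basis)
        self.register_buffer("grid", grid)                    # basis centres
        self.h = (grid_range[1] - grid_range[0]) / (num_basis - 1)
        self.w_b = nn.Parameter(torch.ones(out_dim, in_dim))  # weight of the silu branch
        self.w_s = nn.Parameter(torch.ones(out_dim, in_dim))  # weight of the spline branch
        self.coef = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, num_basis))

    def forward(self, x):                                     # x: (batch, in_dim)
        # basis values standing in for B_i(x): (batch, in_dim, num_basis)
        basis = torch.exp(-((x.unsqueeze(-1) - self.grid) / self.h) ** 2)
        spline = torch.einsum("bik,oik->boi", basis, self.coef)  # spline(x) per edge
        base = F.silu(x).unsqueeze(1)                             # b(x), broadcast over outputs
        phi = self.w_b * base + self.w_s * spline                 # (batch, out_dim, in_dim)
        return phi.sum(dim=-1)                                    # node = sum of incoming edges

# usage: two stacked KAN-style layers as a small classifier head
head = nn.Sequential(SimpleKANLayer(64, 32), SimpleKANLayer(32, 18))
print(head(torch.randn(4, 64)).shape)   # torch.Size([4, 18])
```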
Classical vs. KAN Convolution: KAN convolutions are similar to the traditional convolution operation, except that, instead of computing the dot product between the kernel and the corresponding pixels of the image patch, each kernel element applies its own learnable non-linear activation function to its pixel and the results are summed. The kernel of the KAN convolution is equivalent to a KAN linear layer with 9 inputs and 1 output neuron (shown in Figure 2). The output pixel of that convolution step is the sum of $\phi_i(x_i)$ over each input $i$, to which the learnable function $\phi_i$ has been applied. To visualize the difference between the classical and the KAN convolution, consider the input image patch $X \in \mathbb{R}^{H \times W}$, the output $O \in \mathbb{R}^{H \times W}$, the classical kernel $K$, and the KAN kernel $\Phi$. The input patch is
\[
X = \begin{pmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1w} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2w} \\
x_{31} & x_{32} & x_{33} & \cdots & x_{3w} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_{h1} & x_{h2} & x_{h3} & \cdots & x_{hw}
\end{pmatrix}_{H \times W} \quad (6)
\]
and the classical kernel $K$ and the KAN kernel $\Phi$ are defined in Equation (7), respectively:
\[
K = \begin{pmatrix}
k_{11} & k_{12} & k_{13} \\
k_{21} & k_{22} & k_{23} \\
k_{31} & k_{32} & k_{33}
\end{pmatrix}
\quad \text{and} \quad
\Phi = \begin{pmatrix}
\phi_{11}(\cdot) & \phi_{12}(\cdot) & \phi_{13}(\cdot) \\
\phi_{21}(\cdot) & \phi_{22}(\cdot) & \phi_{23}(\cdot) \\
\phi_{31}(\cdot) & \phi_{32}(\cdot) & \phi_{33}(\cdot)
\end{pmatrix} \quad (7)
\]
The output of the classical convolutional operation (*) can be obtained as follows:
\[ o_{i,j} = \sum_{m,n=0}^{K-1} x_{i+m,\,j+n}\, K_{m,n} \quad (8) \]
In the case of KAN convolution, the inner function $\phi(\cdot)$ can be represented as a matrix of activation functions, as shown in Equation (7). The input matrix $X$ is passed through each of these activation functions, which are parameterized as B-splines; note that $\phi(\cdot)$ here denotes an activation function rather than a weight. Summing these functions, which are simply piecewise polynomial curves evaluated at the corresponding input values, gives the output of the KAN convolutional operation (∘) as follows:
\[ o_{i,j} = \sum_{m,n} \phi_{m,n}\left(x_{i+m,\,j+n}\right) \quad (9) \]
Similarly, the above Equation (9) can easily be extended to an input image $X \in \mathbb{R}^{H \times W \times C_{in}}$ with $C_{in}$ channels by applying a set of KAN kernels $\Phi$, which produces the output $O \in \mathbb{R}^{H \times W \times C_{out}}$ as follows:
\[ o_{i,j,c} = \sum_{m,n,c'} \phi_{m,n,c'}\left(x_{i+m,\,j+n,\,c'}\right) \quad (10) \]
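Equation (9) can be sketched in code by unfolding the image into $k \times k$ patches and passing each patch through a KAN layer with $k^2$ inputs and one output, exactly as described for Figure 2. The sketch below reuses the SimpleKANLayer defined above, handles a single input channel only, and is meant to illustrate the operation rather than reproduce the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
# SimpleKANLayer is the class defined in the previous sketch.

class SimpleKANConv2d(nn.Module):
    """KAN-style 2D convolution (Eq. 9): instead of a dot product with a weight
    kernel, every kernel position (m, n) applies its own learnable univariate
    function phi_{m,n} to the pixel x_{i+m, j+n}, and the results are summed."""

    def __init__(self, kernel_size=3, stride=1, num_basis=8):
        super().__init__()
        self.k, self.stride = kernel_size, stride
        # one univariate function per kernel element: a KAN layer with k*k inputs
        # and a single output neuron (cf. Figure 2)
        self.phi = SimpleKANLayer(kernel_size * kernel_size, 1, num_basis)

    def forward(self, x):                                   # x: (batch, 1, H, W)
        b, _, h, w = x.shape
        patches = F.unfold(x, self.k, stride=self.stride)   # (batch, k*k, L)
        patches = patches.transpose(1, 2).reshape(-1, self.k * self.k)
        out = self.phi(patches)                             # one scalar per patch
        h_out = (h - self.k) // self.stride + 1
        w_out = (w - self.k) // self.stride + 1
        return out.view(b, 1, h_out, w_out)

conv = SimpleKANConv2d(kernel_size=3)
print(conv(torch.randn(2, 1, 9, 9)).shape)   # torch.Size([2, 1, 7, 7])
```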
HybridSN Embedding with KAN Layers: We experimentally selected a KAN architecture similar to the hybrid spectral network [34], as seen in Figure 3. The hybrid spectral network was proposed in 2020 and is considered to be a successful architecture for hyperspectral feature extraction and classification. Consider an input hyperspectral image $X_{in} \in \mathbb{R}^{H \times W \times B}$, where $H$, $W$, and $B$ indicate the height, width, and number of spectral bands, respectively. We first utilized a principal component analysis (PCA) algorithm to reduce the number of input channels/bands in all HSI datasets to $D$, expressed as follows:
\[ X = f_{\mathrm{PCA}}(X_{in}) \quad (11) \]
To enhance the HSI classification accuracy obtained by the KAN models, we developed and proposed a hybrid KAN-network-based architecture consisting of three consecutive 3D KANs with 8, 16, and 32 output channels (feature maps), expressed as follows:
\[ X = \mathrm{KAN}_{3D}\left(\mathrm{KAN}_{3D}\left(\mathrm{KAN}_{3D}(X)\right)\right) \quad (12) \]
Then, one 2D KAN layer with 64 output channels (output maps) is employed immediately after the third 3D KAN. The resulting feature maps are then flattened and sent to a 1D KAN layer with a hidden layer of 32 and an output map/channel count equal to the number of classes in the HSI data, expressed as:
\[ \mathrm{class} = \mathrm{KAN}_{1D}\left(\mathrm{KAN}_{2D}(X)\right) \quad (13) \]
The layer-wise architecture of the proposed KAN-based model is presented in Table 1.
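The data flow of Table 1 can be summarized by the runnable sketch below, in which plain Conv3d/Conv2d/Linear modules stand in for the 3D, 2D, and 1D KAN layers purely to show the tensor shapes (three 3D blocks with 8, 16, and 32 feature maps, one 2D block with 64 maps, max pooling, and a 32-unit hidden layer before the class scores). The real model replaces these stand-ins with their KAN counterparts; see the repository listed in the Data Availability Statement for the actual layers.

```python
import torch
import torch.nn as nn

class HybridPipelineSketch(nn.Module):
    """Shape-level sketch of the hybrid architecture in Table 1 (9 x 9 window).
    Ordinary Conv3d/Conv2d/Linear layers are placeholders for the KAN modules."""

    def __init__(self, num_classes=18):
        super().__init__()
        self.kan3d = nn.Sequential(            # three consecutive 3D blocks: 8 -> 16 -> 32 maps
            nn.Conv3d(1, 8, kernel_size=1),
            nn.Conv3d(8, 16, kernel_size=1),
            nn.Conv3d(16, 32, kernel_size=1),
        )
        self.kan2d = nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1)  # 64 maps -> (64, 5, 5)
        self.pool = nn.MaxPool2d(kernel_size=3, stride=3)                   # -> (64, 1, 1)
        self.kan1d = nn.Sequential(nn.Linear(64, 32), nn.SiLU(),            # 32 hidden units,
                                   nn.Linear(32, num_classes))              # then class scores

    def forward(self, x):              # x: toy input matching the per-layer shapes of Table 1
        x = self.kan3d(x).squeeze(-1)  # (batch, 32, 9, 9) after the reshape step
        x = self.pool(self.kan2d(x))   # (batch, 64, 1, 1)
        return self.kan1d(x.flatten(1))

scores = HybridPipelineSketch(num_classes=18)(torch.randn(4, 1, 9, 9, 1))
print(scores.shape)                    # torch.Size([4, 18])
```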

3. Datasets

The HSI data benchmarks that we used were collected in the West Coast New Area of Qingdao City, Shandong Province, China. The city lies close to China's Yellow Sea coast and features a wealth of both natural and artificial surroundings, as well as rapid urbanization. Because the morphology and distribution of each region's land cover are so complex, it is not easy to classify them precisely. An unmanned aerial vehicle (UAV) equipped with hyperspectral sensors was used to collect these datasets. More specifically, the UAV platform was the DJI M600 Pro. A hyperspectral sensor called the Gaiasky mini2-VN imaging spectrometer was used. Image mosaicking, radiometric calibration, and atmospheric and geometric corrections were all carried out using the instrument manufacturer's SpecView software [32]. It is important to note that the HSI data benchmarks used in this study were more challenging than existing major HSI datasets, such as Indian Pines and Pavia University, due to their high inter-class and intra-class similarity.

3.1. QUH-Tangdaowan

The QUH-Tangdaowan dataset was surveyed on 18 May 2021, in Tangdao Bay National Wetland Park, Qingdao, China. The UAV operated at a height of 300 m with a spatial resolution of approximately 0.15 m. This dataset comprises 176 bands with a wavelength range of 400–1000 nm and an image pixel size of 1740 × 860. Table 2 and Figure 4 illustrate the number of training, validation, and test data in this dataset.

3.2. QUH-Qingyun

The QUH-Qingyun dataset was surveyed on 18 May 2021, in the vicinity of the Qingyun Road primary school and a residential area in Qingdao, China. The UAV captured images with an image pixel size of 880 × 1360 and 270 bands ranging from 400 to 1000 nm, operating at a height of 300 m with a spatial resolution of approximately 0.15 m. Table 3 and Figure 5 illustrate the number of training, validation, and test data in this dataset.

3.3. QUH-Pingan

On 19 May 2021, at Huangdao Pingan Passenger Ship Terminal in Qingdao, China, the QUH-Pingan dataset was collected. The UAV operated at a height of 200 m above the ground with a spatial resolution of approximately 0.10 m. This dataset comprises 176 bands with a wavelength range of 400–1000 nm and an image pixel size of 1230 × 1000. Table 4 and Figure 6 present the number of training, validation, and test data in this HSI dataset.

3.4. Experimental Setting

This section describes the comparative approaches and experimental settings used to evaluate the proposed KAN-based model. The overall accuracy (OA), average accuracy (AA), Kappa accuracy (κ), and per-class accuracies were calculated across all HSI datasets. The overall and average accuracy focus on the percentage of accurately mapped samples. On the other hand, the Kappa (κ) accuracy results from statistical testing and offers information about how well classification models perform in comparison to random assignment. Essentially, the Kappa (κ) accuracy depends on the number of classes in the dataset and the probability that sample points will be assigned a random label. As such, it functions as a more reliable accuracy metric than OA and AA, which could be deceptive in instances of unbalanced datasets. Comparative analysis against state-of-the-art methods was conducted to assess the effectiveness of the KAN models. In more detail, the HSI classification results obtained by the KAN models were evaluated against several other models, including 1D-CNN, 2D-CNN, 3D-CNN, VGG-16 [35], ResNet-50 [36], EfficientNet [37], RNN [38], and ViT [39].
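For reference, the three reported metrics can be computed from a confusion matrix as in the sketch below: OA is the fraction of correctly mapped samples, AA is the mean of the per-class recalls, and κ compares the observed agreement with the agreement expected from random assignment. This is standard bookkeeping, not code taken from the authors' implementation.

```python
import numpy as np

def classification_scores(y_true, y_pred, num_classes):
    """Overall accuracy (OA), average accuracy (AA), and Cohen's kappa (in %)
    from integer-encoded ground-truth and predicted labels."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                                        # confusion matrix
    total = cm.sum()
    oa = np.trace(cm) / total                                # correctly mapped fraction
    per_class = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)  # recall of each class
    aa = per_class.mean()
    chance = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2  # expected agreement
    kappa = (oa - chance) / (1.0 - chance)
    return 100 * oa, 100 * aa, 100 * kappa

oa, aa, k = classification_scores([0, 1, 2, 2, 1], [0, 1, 2, 1, 1], num_classes=3)
```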

4. Results

4.1. Statistical Results

Table 5 and Figure 7 illustrate the HSI classification results and maps produced by the developed CNN-, transformer-, and KAN-based architectures on the Tangdaowan HSI dataset. The results revealed that the KAN models, specifically the developed HybridKAN architecture, obtained a competitive HSI classification accuracy compared with other well-known CNNs and ViTs. The developed HybridKAN achieved the highest average accuracy (97.12%), while the ResNet-50 model achieved the best overall accuracy (98.09%) and Kappa value (97.82%). The HSI data classification results underscored the effectiveness of the KAN models compared with the other classification models. The 3D KAN model, similar to its 3D CNN counterpart, demonstrated the lowest HSI classification accuracy, with an average accuracy of 76.14% compared with that of the 1D KAN (96.15%), 2D KAN (96.47%), and the HybridKAN (97.12%). As seen in Figure 7, HybridKAN produced the most homogeneous classification map with much less noise compared with the other classification architectures, showcasing its high capability in accurate HSI data classification.
On the other hand, in the Pingan HSI dataset, as seen in Table 6 and Figure 8, the highest HSI data classification result was obtained by the VGG-16 CNN model, with an overall accuracy, Kappa value, and average accuracy of 99.06%, 98.61%, and 98.09%, respectively. In this HSI dataset, the developed HybridKAN architecture demonstrated a competitive HSI classification accuracy compared with the other models, with an average accuracy, Kappa value, and overall accuracy of 95.95%, 97.74%, and 98.48%, respectively. Similar to the Tangdaowan dataset, the 3D KAN model, with an average accuracy of 81.53%, showed the lowest classification accuracy compared with the 2D KAN (94.14%), 1D KAN (95.42%), and the HybridKAN (95.95%). While the statistical results showed a slightly better classification accuracy for VGG-16 over HybridKAN, as seen in Figure 8, the HybridKAN architecture produced much less noise and a more homogeneous classification map.
Moreover, as seen in Table 7 and Figure 9, the best HSI data classification accuracy in terms of overall accuracy (97.06%) and Kappa value (96.11%) on the Qingyun HSI dataset was obtained by the developed HybridKAN architecture. The highest average accuracy among all of the developed classification models was achieved by the 2D-CNN model (95.60%). Among the KAN models, the proposed HybridKAN architecture obtained the highest average accuracy (94.91%), compared with the 3D KAN (84.69%), 1D KAN (92.11%), and 2D KAN (92.93%) networks. Overall, the obtained results showed the significant capability of KAN models for complex land cover land use mapping using HSI data. We used a simple and straightforward architecture similar to traditional CNN-based models (e.g., HybridSN [34]), yet the developed model based on the KAN architectures illustrated competitive or better HSI data classification capability compared with the other developed CNN- and ViT-based classification models.

4.2. Convergence Graph Between HybridSN and Its KAN Version

Because the training process in deep learning may prove time-consuming and it is not always evident when the network has acquired sufficient information, convergence is an important tool. A deep learning model is considered to have converged when the validation and training errors cease to decrease. An ideal solution is not always guaranteed by convergence; this relies on several variables, including the network's architecture, the hyperparameters, and the quality of the HSI data. As seen in Figure 10, Figure 11 and Figure 12, the HybridKAN architecture, which utilizes KAN layers, is superior to the HybridSN, which uses convolutional layers, in terms of lower loss, higher training accuracy, and higher validation accuracy. The HybridKAN model requires a smaller number of epochs for its convergence, which is vital in the remote sensing field given the existence of high-dimensional and complex data. This demonstrates the better capability of the developed HybridKAN model over the HybridSN classification algorithm.

4.3. Feature Visualization of KAN Using t-SNE

The many spectral ranges that make up HSI data allow for the comprehensive capture of details over a large range of electromagnetic wavelengths. As such, it can be difficult to visualize these high-dimensional characteristics. Nevertheless, t-Distributed Stochastic Neighbour Embedding (t-SNE) [40] makes it easier to observe, in a two-dimensional space, the complex spectral–spatial features that the developed HybridKAN extracts. To analyze the representational abilities of our model, this visualization is essential because it provides insights that may not be immediately clear from a direct examination of the raw data. The feature distributions for HybridKAN in the 2D feature space produced by t-SNE are shown in Figure 13. As can be seen in Figure 13, the developed KAN-based architecture demonstrated excellent feature separation capability for recognizing complex land covers in all three HSI data benchmarks, according to the results obtained by the t-SNE algorithm. Furthermore, owing to its learnable non-linear edge functions, in contrast to traditional MLPs with fixed non-linear activation functions, HybridKAN's classification maps, as shown in Figure 7, Figure 8 and Figure 9, showed the least amount of noise and were the most homogeneous compared with the other implemented algorithms. Furthermore, compared with the conventional ViT architecture, it is clear that HybridKAN's classification maps are far less noisy.
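A typical way to produce such a plot is to capture the flattened features just before the 1D KAN classifier with a forward hook and embed them with scikit-learn's t-SNE, as in the sketch below; the hook location and the t-SNE settings are assumptions made for illustration, not the authors' exact procedure.

```python
import numpy as np
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def tsne_feature_plot(model, feature_layer, patches, labels):
    """Embed the output of `feature_layer` (assumed here to be the flattened
    feature vector before the classifier head) into 2D with t-SNE and plot it."""
    feats = []
    hook = feature_layer.register_forward_hook(
        lambda module, inputs, output: feats.append(output.detach().flatten(1).cpu()))
    with torch.no_grad():
        model(patches)                       # forward pass only to trigger the hook
    hook.remove()
    emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(
        torch.cat(feats).numpy())
    plt.scatter(emb[:, 0], emb[:, 1], c=np.asarray(labels), s=2, cmap="tab20")
    plt.title("t-SNE of learned features")
    plt.show()
```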

4.4. Hyperparameter Sensitivity Analysis

The complexity and number of parameters of a classification algorithm are important factors to consider in the remote sensing field. Compared with the 2D and 3D KAN models, the HybridKAN model has more parameters, but this increase is justified. The higher OA, AA, and κ in Table 5, Table 6 and Table 7 demonstrate the considerable boost to the classification performance of the hybrid model, which justifies the trade-off. Furthermore, as shown in Figure 7 and Figure 8, the visual classification maps produced by the HybridKAN architecture yield less noise and are more homogeneous. This perspective emphasizes the notion that the trade-off of greater model complexity (as indicated by the greater number of parameters in Table 8) is matched with a demonstrable and supported improvement in the model's capacity to classify hyperspectral imagery correctly.
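Parameter counts such as those reported in Table 8 can be obtained directly from any PyTorch model with a generic helper like the one below (shown here on the stand-in pipeline sketch from Section 2, not on the released model):

```python
def count_parameters(model):
    """Return (total, trainable, non-trainable) parameter counts of a model."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return total, trainable, total - trainable

total, trainable, frozen = count_parameters(HybridPipelineSketch(num_classes=18))
```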

5. Conclusions

This research proposed and discussed a KAN-model-based architecture for complex land use land cover mapping using HSI data, which employs 1D, 2D, and 3D KAN models. The classification results on three highly complex HSI datasets demonstrate that the developed classification model, HybridKAN, was statistically and visually competitive with or better than several other CNN- and ViT-based algorithms, including 1D-CNN, 2D-CNN, 3D-CNN, VGG-16, ResNet-50, EfficientNet, RNN, and ViT. The obtained results underscored the significant potential of KAN models in complex remote sensing tasks. The HSI data classification ability of the proposed hybrid KAN architecture compared with other CNN- and ViT-based classification models was shown over three HSI benchmark datasets: QUH-Pingan, QUH-Tangdaowan, and QUH-Qingyun. The results underscored the competitive or better capability of the developed hybrid model across these benchmark datasets compared with state-of-the-art classification architectures.

Author Contributions

Conceptualization, A.J. and S.K.R.; methodology, A.J. and S.K.R.; software, A.J.; validation, D.H. and P.G.; formal analysis, A.J.; investigation, D.H. and P.G.; resources, A.J.; data curation, A.J.; writing—original draft preparation, A.J. and S.K.R.; writing—review and editing, D.H., B.L. and P.G.; visualization, A.J. and S.K.R.; supervision, B.L. and P.G.; project administration, A.J.; funding acquisition, A.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data and code are available at: https://github.com/aj1365/HSIConvKAN.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hong, D.; Zhang, B.; Li, X.; Li, Y.; Li, C.; Yao, J.; Yokoya, N.; Li, H.; Ghamisi, P.; Jia, X.; et al. Spectralgpt: Spectral remote sensing foundation model. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5227–5244. [Google Scholar] [CrossRef] [PubMed]
  2. Ghamisi, P.; Yokoya, N.; Li, J.; Liao, W.; Liu, S.; Plaza, J.; Rasti, B.; Plaza, A. Advances in Hyperspectral Image and Signal Processing: A Comprehensive Overview of the State of the Art. IEEE Geosci. Remote. Sens. Mag. 2017, 5, 37–78. [Google Scholar] [CrossRef]
  3. Ullah, F.; Ullah, I.; Khan, R.U.; Khan, S.; Khan, K.; Pau, G. Conventional to Deep Ensemble Methods for Hyperspectral Image Classification: A Comprehensive Survey. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2024, 17, 3878–3916. [Google Scholar] [CrossRef]
  4. Li, C.; Zhang, B.; Hong, D.; Jia, X.; Plaza, A.; Chanussot, J. Learning Disentangled Priors for Hyperspectral Anomaly Detection: A Coupling Model-driven and Data-driven Paradigm. IEEE Trans. Neural Netw. Learn. Syst. 2024, 1–14. [Google Scholar] [CrossRef]
  5. Khan, A.; Vibhute, A.D.; Mali, S.; Patil, C. A systematic review on hyperspectral imaging technology with a machine and deep learning methodology for agricultural applications. Ecol. Inform. 2022, 69, 101678. [Google Scholar] [CrossRef]
  6. Ke, C. Military object detection using multiple information extracted from hyperspectral imagery. In Proceedings of the 2017 International Conference on Progress in Informatics and Computing (PIC), Nanjing, China, 15–17 December 2017; pp. 124–128. [Google Scholar] [CrossRef]
  7. Hong, D.; Zhang, B.; Li, H.; Li, Y.; Yao, J.; Li, C.; Werner, M.; Chanussot, J.; Zipf, A.; Zhu, X.X. Cross-city matters: A multimodal remote sensing benchmark dataset for cross-city semantic segmentation using high-resolution domain adaptation networks. Remote Sens. Environ. 2023, 299, 113856. [Google Scholar] [CrossRef]
  8. Roy, S.K.; Sukul, A.; Jamali, A.; Haut, J.M.; Ghamisi, P. Cross Hyperspectral and LiDAR Attention Transformer: An Extended Self-Attention for Land Use and Land Cover Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5512815. [Google Scholar] [CrossRef]
  9. Landgrebe, D. Hyperspectral image data analysis. IEEE Signal Process. Mag. 2002, 19, 17–28. [Google Scholar] [CrossRef]
  10. Li, J.; Marpu, P.R.; Plaza, A.; Bioucas-Dias, J.M.; Benediktsson, J.A. Generalized Composite Kernel Framework for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2013, 51, 4816–4829. [Google Scholar] [CrossRef]
  11. Fauvel, M.; Chanussot, J.; Benediktsson, J.A.; Sveinsson, J.R. Spectral and spatial classification of hyperspectral data using SVMs and morphological profiles. IEEE Int. Geosci. Remote Sens. Symp. 2007, 46, 4834–4837. [Google Scholar] [CrossRef]
  12. He, X.; Chen, Y.; Huang, L.; Hong, D.; Du, Q. Foundation Model-based Multimodal Remote Sensing Data Classification. IEEE Trans. Geosci. Remote Sens. 2023, 62, 5502117. [Google Scholar] [CrossRef]
  13. He, X.; Chen, Y.; Huang, L. Bayesian Deep Learning for Hyperspectral Image Classification With Low Uncertainty. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5506916. [Google Scholar] [CrossRef]
  14. Li, C.; Zhang, B.; Hong, D.; Yao, J.; Chanussot, J. LRR-Net: An interpretable deep unfolding network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5513412. [Google Scholar] [CrossRef]
  15. Xue, Z.; Tan, X.; Yu, X.; Liu, B.; Yu, A.; Zhang, P. Deep Hierarchical Vision Transformer for Hyperspectral and LiDAR Data Classification. IEEE Trans. Image Process. 2022, 31, 3095–3110. [Google Scholar] [CrossRef] [PubMed]
  16. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep Feature Extraction and Classification of Hyperspectral Images Based on Convolutional Neural Networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef]
  17. Yu, Q.; Wei, W.; Li, D.; Pan, Z.; Li, C.; Hong, D. HyperSINet: A Synergetic Interaction Network Combined with Convolution and Transformer for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5508118. [Google Scholar] [CrossRef]
  18. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep Learning for Hyperspectral Image Classification: An Overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709. [Google Scholar] [CrossRef]
  19. Lee, H.; Kwon, H. Going Deeper with Contextual CNN for Hyperspectral Image Classification. IEEE Trans. Image Process. 2017, 26, 4843–4855. [Google Scholar] [CrossRef]
  20. Ran, Q.; Zhou, Y.; Hong, D.; Bi, M.; Ni, L.; Li, X.; Ahmad, M. Deep transformer and few-shot learning for hyperspectral image classification. CAAI Trans. Intell. Technol. 2023, 8, 1323–1336. [Google Scholar] [CrossRef]
  21. Yao, J.; Zhang, B.; Li, C.; Hong, D.; Chanussot, J. Extended vision transformer (ExViT) for land use and land cover classification: A multimodal deep learning framework. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–15. [Google Scholar] [CrossRef]
  22. Yang, X.; Ye, Y.; Li, X.; Lau, R.Y.K.; Zhang, X.; Huang, X. Hyperspectral Image Classification With Deep Learning Models. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5408–5423. [Google Scholar] [CrossRef]
  23. Liang, L.; Zhang, Y.; Zhang, S.; Li, J.; Plaza, A.; Kang, X. Fast Hyperspectral Image Classification Combining Transformers and SimAM-Based CNNs. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5522219. [Google Scholar] [CrossRef]
  24. Hong, D.; Han, Z.; Yao, J.; Gao, L.; Zhang, B.; Plaza, A.; Chanussot, J. SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5518615. [Google Scholar] [CrossRef]
  25. Mei, S.; Song, C.; Ma, M.; Xu, F. Hyperspectral Image Classification Using Group-Aware Hierarchical Transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5539014. [Google Scholar] [CrossRef]
  26. Li, C.; Zhang, B.; Hong, D.; Zhou, J.; Vivone, G.; Li, S.; Chanussot, J. CasFormer: Cascaded transformers for fusion-aware computational hyperspectral imaging. Inf. Fusion 2024, 108, 102408. [Google Scholar] [CrossRef]
  27. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J.; et al. MLP-Mixer: An all-MLP Architecture for Vision. In Advances in Neural Information Processing Systems; Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: New York, NY, USA, 2021; Volume 34, pp. 24261–24272. [Google Scholar]
  28. Touvron, H.; Bojanowski, P.; Caron, M.; Cord, M.; El-Nouby, A.; Grave, E.; Izacard, G.; Joulin, A.; Synnaeve, G.; Verbeek, J.; et al. ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 5314–5321. [Google Scholar] [CrossRef]
  29. Jamali, A.; Roy, S.K.; Hong, D.; Atkinson, P.M.; Ghamisi, P. Spatial-Gated Multilayer Perceptron for Land Use and Land Cover Mapping. IEEE Geosci. Remote Sens. Lett. 2024, 21, 5502105. [Google Scholar] [CrossRef]
  30. Yao, J.; Hong, D.; Li, C.; Chanussot, J. Spectralmamba: Efficient mamba for hyperspectral image classification. arXiv 2024, arXiv:2404.08489. [Google Scholar]
  31. Liu, Z.; Wang, Y.; Vaidya, S.; Ruehle, F.; Halverson, J.; Soljačić, M.; Hou, T.Y.; Tegmark, M. KAN: Kolmogorov-Arnold Networks. arXiv 2024, arXiv:2404.19756. [Google Scholar]
  32. Fu, H.; Sun, G.; Zhang, L.; Zhang, A.; Ren, J.; Jia, X.; Li, F. Three-dimensional singular spectrum analysis for precise land cover classification from UAV-borne hyperspectral benchmark datasets. ISPRS J. Photogramm. Remote Sens. 2023, 203, 115–134. [Google Scholar] [CrossRef]
  33. Kolmogorov, A.N. On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition. In Doklady Akademii Nauk; Russian Academy of Sciences: Moscow, Russia, 1957; Volume 114, pp. 953–956. [Google Scholar]
  34. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D–2-D CNN Feature Hierarchy for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2020, 17, 277–281. [Google Scholar] [CrossRef]
  35. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556. [Google Scholar]
  36. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 30 June 2016. [Google Scholar]
  37. Koonce, B. EfficientNet. In Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization; Apress: Berkeley, CA, USA, 2021; pp. 109–123. [Google Scholar] [CrossRef]
  38. Mou, L.; Ghamisi, P.; Zhu, X.X. Deep Recurrent Neural Networks for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3639–3655. [Google Scholar] [CrossRef]
  39. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  40. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
Figure 1. The overall architecture of the Kolmogorov–Arnold networks.
Figure 2. Pictorial representation of the KAN convolution operation, where $x$ and $\Phi$ represent the input sub-patch and the B-splines, respectively. The output $o_{14} = x \circ \Phi$ can be calculated as $\phi_{11}(x_{11}) + \phi_{12}(x_{12}) + \phi_{13}(x_{13}) + \phi_{21}(x_{21}) + \phi_{22}(x_{22}) + \phi_{23}(x_{23}) + \phi_{31}(x_{31}) + \phi_{32}(x_{32}) + \phi_{33}(x_{33})$.
Figure 3. The overall architecture of the proposed hybrid KAN.
Figure 4. Pictorial view of the QUH-Tangdaowan data benchmark: (a) the annotation of the training samples, (b) the annotation of the validation samples, and (c) the test samples.
Figure 5. Pictorial view of the QUH-Qingyun data benchmark: (a) the annotation of the training samples, (b) the annotation of the validation samples, and (c) the test samples.
Figure 6. Pictorial view of the QUH-Pingan data benchmark: (a) the annotation of the training samples, (b) the annotation of the validation samples, and (c) the test samples.
Figure 7. The predicted land cover maps obtained for the Tangdaowan HSI dataset: (a) RGB image (b) GT (c) 1D-CNN (d) 2D-CNN (e) 3D-CNN (f) DRNN (g) ResNet50 (h) VGG-16 (i) EfficientNet (j) ViT (k) 1D-KAN (l) 2D-KAN (m) 3D-KAN (n) HybridKAN, respectively.
Figure 8. The predicted land cover maps were created for the Pingan data set: (a) RGB image (b) GT (c) 1D-CNN (d) 2D-CNN (e) 3D-CNN (f) DRNN (g) ResNet50 (h) VGG-16 (i) EfficientNet (j) ViT (k) 1D-KAN (l) 2D-KAN (m) 3D-KAN (n) HybridKAN, respectively.
Figure 9. The predicted land cover maps were created for the Qingyun dataset: (a) RGB image (b) GT (c) 1D-CNN (d) 2D-CNN (e) 3D-CNN (f) DRNN (g) ResNet50 (h) VGG-16 (i) EfficientNet (j) ViT (k) 1D-KAN (l) 2D-KAN (m) 3D-KAN (n) HybridKAN, respectively.
Figure 10. The convergence graph of (a) loss and (b) train accuracy between HybridSN and its KAN version over the Tangdaowan HSI benchmark dataset for 40 epochs.
Figure 11. The convergence graph of (a) loss and (b) train accuracy between HybridSN and its KAN version over the Qingyun HSI benchmark dataset for 40 epochs.
Figure 12. The convergence graph of (a) loss and (b) train accuracy between HybridSN and its KAN version over the Pingan HSI benchmark dataset for 40 epochs.
Figure 13. The t-SNE visualization of the HybridKAN features on the data benchmarks of (a) Tangdaowan, (b) Qingyun, and (c) Pingan.
Table 1. The layer-wise summary of the proposed hybrid KAN architecture with a window size of 9 × 9. The last layer is based on the Tangdaowan dataset.

Layer (Type) | Kernel Size | Stride | Number of Kernels/Filters | Output Shape
KAN3D-1 | 1 | 1 | 8 | (8, 9, 9, 1)
KAN3D-2 | 1 | 1 | 16 | (16, 9, 9, 1)
KAN3D-3 | 1 | 1 | 32 | (32, 9, 9, 1)
Reshape | - | - | - | (32, 9, 9)
KAN2D-1 | 3 | 2 | 64 | (64, 5, 5)
Max pooling | 3 | 3 | - | (64, 1, 1)
Flatten | - | - | - | (64, 1)
KAN1D-1 | - | - | 32 | (64, 32, 18)
Table 2. Number of training, validation, and test ground truth data in the QUH-Tangdaowan dataset.

Class No. | Class | Train | Validation | Test | Total
1 | Rubber track | 7755 | 5170 | 12,924 | 25,849
2 | Flaggingv | 16,666 | 11,111 | 27,776 | 55,553
3 | Sandy | 10,211 | 6807 | 17,019 | 34,037
4 | Asphalt | 18,207 | 12,138 | 30,345 | 60,690
5 | Boardwalk | 559 | 372 | 931 | 1862
6 | Rocky shallows | 11,137 | 7425 | 18,563 | 37,125
7 | Grassland | 4238 | 2825 | 7064 | 14,127
8 | Bulrush | 19,226 | 12,817 | 32,044 | 64,087
9 | Gravel road | 9208 | 6139 | 15,348 | 30,695
10 | Ligustrum vicaryi | 535 | 357 | 891 | 1783
11 | Coniferous pine | 6371 | 4247 | 10,618 | 21,236
12 | Spiraea | 225 | 150 | 374 | 749
13 | Bare soil | 506 | 337 | 843 | 1686
14 | Buxus sinica | 266 | 177 | 443 | 886
15 | Photinia serrulata | 4206 | 2804 | 7010 | 14,020
16 | Populus | 42,271 | 28,181 | 70,452 | 140,904
17 | Ulmus pumila L. | 2940 | 1961 | 4901 | 9802
18 | Seawater | 12,682 | 8455 | 21,138 | 42,275
- | Total | 167,209 | 111,473 | 278,684 | 557,366
Table 3. Number of training, validation, and test ground truth data in the QUH-Qingyun dataset.

Class No. | Class | Train | Validation | Test | Total
1 | Trees | 83,445 | 55,630 | 139,075 | 278,150
2 | Concrete building | 53,853 | 35,902 | 89,757 | 179,512
3 | Car | 4135 | 2757 | 6891 | 13,783
4 | Ironhide building | 2930 | 1953 | 4884 | 9767
5 | Plastic playground | 65,320 | 43,547 | 108,868 | 217,735
6 | Asphalt road | 76,784 | 51,189 | 127,973 | 255,946
- | Total | 286,467 | 190,978 | 477,448 | 954,893
Table 4. Number of training, validation, and test ground truth data in the QUH-Pingan dataset.

Class No. | Class | Train | Validation | Test | Total
1 | Ship | 14,680 | 9787 | 24,468 | 48,935
2 | Seawater | 173,434 | 115,622 | 289,057 | 578,113
3 | Trees | 2504 | 1669 | 4172 | 8345
4 | Concrete structure building | 26,692 | 17,794 | 44,487 | 88,973
5 | Floating pier | 6228 | 4152 | 10,379 | 20,759
6 | Brick houses | 4226 | 2817 | 7043 | 14,086
7 | Steel houses | 4197 | 2798 | 6996 | 13,991
8 | Wharf construction land | 24,934 | 16,623 | 41,556 | 83,113
9 | Car | 2432 | 1622 | 4054 | 8108
10 | Road | 82,954 | 55,303 | 138,257 | 276,514
- | Total | 342,281 | 228,187 | 570,469 | 1,140,937
Table 5. Classification results in terms of OA, AA, and Kappa (in %) obtained on the Tangdaowan dataset.

Class No. | 1D-CNN | 2D-CNN | 3D-CNN | VGG-16 [35] | ResNet-50 [36] | EfficientNet [37] | RNN [38] | ViT [39] | 1D-KAN [31] | 2D-KAN | 3D-KAN | HybridKAN
1 | 99.80 | 99.82 | 99.45 | 99.96 | 99.93 | 99.93 | 99.93 | 100.0 | 99.95 | 99.91 | 98.81 | 99.85
2 | 98.42 | 99.49 | 87.00 | 99.75 | 97.49 | 99.48 | 97.83 | 99.56 | 99.52 | 99.52 | 95.91 | 99.48
3 | 94.98 | 96.62 | 86.46 | 97.63 | 98.54 | 97.02 | 92.37 | 92.96 | 97.31 | 96.74 | 92.33 | 97.24
4 | 99.17 | 99.89 | 92.02 | 99.96 | 99.89 | 99.92 | 98.26 | 99.20 | 99.78 | 99.60 | 98.52 | 99.71
5 | 95.38 | 97.42 | 87.20 | 99.78 | 99.89 | 99.24 | 97.63 | 97.85 | 97.88 | 97.63 | 38.86 | 99.35
6 | 88.54 | 94.66 | 86.84 | 94.38 | 93.82 | 92.71 | 90.92 | 94.81 | 97.21 | 96.87 | 90.61 | 97.34
7 | 81.14 | 90.74 | 40.00 | 81.92 | 94.70 | 96.06 | 78.70 | 91.25 | 93.84 | 90.72 | 72.72 | 95.78
8 | 99.81 | 99.92 | 98.34 | 99.83 | 99.99 | 99.93 | 99.84 | 99.80 | 99.97 | 99.94 | 98.85 | 99.89
9 | 96.89 | 99.35 | 90.84 | 92.96 | 99.79 | 99.95 | 97.55 | 99.70 | 99.45 | 99.44 | 94.33 | 99.51
10 | 92.81 | 98.54 | 82.57 | 98.42 | 96.18 | 85.18 | 97.75 | 93.93 | 95.22 | 95.72 | 70.38 | 97.75
11 | 64.80 | 84.77 | 46.78 | 94.76 | 84.95 | 96.83 | 64.72 | 86.68 | 87.24 | 91.83 | 45.16 | 90.34
12 | 77.54 | 86.89 | 30.87 | 93.04 | 92.78 | 87.43 | 89.83 | 78.87 | 94.18 | 91.25 | 56.99 | 95.85
13 | 98.22 | 100.0 | 91.91 | 100.0 | 99.88 | 99.52 | 99.76 | 98.81 | 99.70 | 99.53 | 96.07 | 99.06
14 | 83.74 | 98.41 | 37.09 | 89.16 | 95.25 | 100.0 | 86.00 | 98.64 | 97.69 | 95.67 | 45.74 | 97.25
15 | 79.71 | 96.40 | 79.31 | 97.08 | 96.81 | 95.83 | 83.05 | 96.16 | 94.07 | 93.68 | 77.42 | 95.04
16 | 93.56 | 96.35 | 86.28 | 96.85 | 98.99 | 96.38 | 92.78 | 95.66 | 96.52 | 97.16 | 88.58 | 97.20
17 | 65.53 | 92.51 | 74.53 | 97.44 | 97.12 | 96.89 | 67.41 | 91.83 | 90.79 | 94.20 | 71.58 | 93.66
18 | 99.84 | 99.93 | 98.54 | 99.90 | 99.91 | 99.99 | 99.83 | 99.69 | 99.86 | 99.94 | 98.76 | 99.91
OA | 93.81 | 97.32 | 87.63 | 97.43 | 98.09 | 97.89 | 93.59 | 96.90 | 97.68 | 97.90 | 91.11 | 98.08
AA | 89.44 | 96.21 | 75.61 | 96.27 | 96.99 | 96.79 | 90.79 | 95.30 | 96.15 | 96.47 | 76.14 | 97.12
κ (×100) | 92.93 | 96.95 | 85.82 | 97.08 | 97.82 | 97.61 | 92.69 | 96.48 | 97.36 | 97.61 | 89.80 | 97.81
Table 6. Classification results in terms of OA, AA, and Kappa (in %) obtained on the Pingan dataset.

Class No. | 1D-CNN | 2D-CNN | 3D-CNN | VGG-16 [35] | ResNet-50 [36] | EfficientNet [37] | RNN [38] | ViT [39] | 1D-KAN [31] | 2D-KAN | 3D-KAN | HybridKAN
1 | 78.73 | 93.17 | 73.12 | 97.04 | 95.23 | 89.84 | 73.72 | 85.26 | 91.44 | 91.14 | 75.78 | 92.91
2 | 98.84 | 99.59 | 98.58 | 99.36 | 99.59 | 99.56 | 98.91 | 99.61 | 99.47 | 99.36 | 99.18 | 99.50
3 | 95.18 | 99.08 | 80.16 | 99.25 | 99.83 | 97.69 | 95.97 | 95.39 | 97.96 | 93.64 | 89.33 | 96.98
4 | 83.38 | 98.11 | 60.58 | 99.69 | 98.27 | 98.59 | 80.03 | 97.88 | 96.84 | 96.45 | 83.21 | 97.54
5 | 68.01 | 95.00 | 58.77 | 96.03 | 94.44 | 93.94 | 67.41 | 95.36 | 91.39 | 88.50 | 72.72 | 93.42
6 | 88.20 | 98.35 | 53.40 | 98.75 | 98.42 | 97.58 | 89.49 | 97.48 | 97.54 | 95.81 | 86.52 | 97.30
7 | 87.36 | 98.77 | 81.34 | 98.88 | 99.02 | 97.29 | 85.37 | 98.69 | 98.63 | 96.68 | 79.45 | 98.15
8 | 86.74 | 97.22 | 76.71 | 98.39 | 98.97 | 98.27 | 84.12 | 94.06 | 96.53 | 95.91 | 83.43 | 96.87
9 | 54.66 | 90.94 | 30.67 | 94.35 | 91.39 | 89.93 | 43.88 | 91.36 | 86.52 | 82.81 | 50.59 | 89.11
10 | 95.12 | 99.36 | 88.00 | 99.18 | 98.67 | 99.05 | 95.48 | 98.89 | 98.74 | 98.30 | 94.45 | 98.91
OA | 93.82 | 98.79 | 88.67 | 99.06 | 98.86 | 98.61 | 93.18 | 98.08 | 98.24 | 97.84 | 93.36 | 98.48
AA | 83.62 | 96.96 | 67.37 | 98.09 | 97.38 | 96.18 | 81.44 | 95.40 | 95.42 | 94.14 | 81.53 | 95.95
κ (×100) | 90.77 | 98.20 | 83.01 | 98.61 | 98.31 | 97.93 | 89.80 | 97.13 | 97.38 | 96.78 | 90.09 | 97.74
Table 7. Classification results in terms of OA, AA, and Kappa (in %) obtained from the Qingyun dataset.

Class No. | 1D-CNN | 2D-CNN | 3D-CNN | VGG-16 [35] | ResNet-50 [36] | EfficientNet [37] | RNN [38] | ViT [39] | 1D-KAN [31] | 2D-KAN | 3D-KAN | HybridKAN
1 | 94.62 | 97.53 | 90.96 | 96.82 | 97.28 | 97.64 | 95.36 | 97.42 | 96.45 | 96.50 | 93.72 | 97.17
2 | 92.28 | 97.89 | 84.23 | 96.01 | 87.62 | 96.04 | 92.88 | 95.02 | 96.73 | 97.89 | 92.94 | 97.66
3 | 31.82 | 85.48 | 10.11 | 71.31 | 55.82 | 69.04 | 38.55 | 53.76 | 73.01 | 73.07 | 47.39 | 82.62
4 | 97.03 | 99.24 | 95.66 | 99.52 | 97.62 | 99.83 | 98.54 | 98.15 | 99.10 | 98.62 | 96.89 | 99.07
5 | 92.71 | 98.20 | 91.21 | 98.54 | 93.09 | 96.95 | 92.35 | 95.82 | 97.38 | 97.91 | 95.85 | 98.05
6 | 90.02 | 95.23 | 83.80 | 95.05 | 93.13 | 95.69 | 90.63 | 93.13 | 95.32 | 95.68 | 90.80 | 96.35
OA | 91.63 | 96.98 | 87.19 | 96.24 | 92.80 | 96.27 | 92.15 | 94.83 | 96.13 | 96.55 | 92.74 | 97.06
AA | 83.08 | 95.60 | 75.44 | 92.87 | 87.43 | 92.53 | 84.72 | 88.88 | 92.11 | 92.93 | 84.69 | 94.91
κ (×100) | 88.89 | 96.00 | 82.94 | 95.03 | 90.43 | 95.05 | 89.59 | 93.15 | 94.87 | 95.43 | 90.37 | 96.11
Table 8. Number of parameters in the developed classification algorithms. (* To reduce the number of trainable parameters, VGG-16, EfficientNet, and ResNet-50 have been modified).

Model | 1D-CNN | 2D-CNN | 3D-CNN | VGG-16 * [35] | ResNet-50 * [36] | EfficientNet * [37] | RNN [38] | ViT [39] | 1D-KAN [31] | 2D-KAN | 3D-KAN | HybridKAN
Total number of parameters | 29,170 | 60,902 | 4,282 | 1,174,162 | 211,826 | 177,406 | 7,686 | 152,586 | 565,458 | 14,743 | 50,826 | 135,090
Number of trainable parameters | 29,170 | 60,902 | 4,282 | 1,174,162 | 211,826 | 177,406 | 7,686 | 152,586 | 565,458 | 14,743 | 50,826 | 135,090
Number of non-trainable parameters | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Forward/backward pass size (MB) | 0.01 | 0.04 | 0.01 | 0.64 | 0.07 | 0.29 | 0.11 | 18.63 | 0.07 | 0.12 | 0.03 | 0.12
Params size (MB) | 0.11 | 0.23 | 0.02 | 4.48 | 0.81 | 0.68 | 0.03 | 0.58 | 2.16 | 0.06 | 0.19 | 0.52
Estimated total size (MB) | 0.12 | 0.27 | 0.03 | 5.12 | 0.88 | 0.97 | 0.14 | 19.22 | 2.23 | 0.18 | 0.23 | 0.63
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
