2.2. Modeling Intrinsic Structure of Hyperspectral Images
The proposed method transforms the hyperspectral image data projection into a low-dimensional space to model its intrinsic structure preserving spectral information and spatial information of hyperspectral images, thereby realizing the low-dimensional representation of the hyperspectral image to facilitate spectral unmixing.
The first law of geography is expressed as “Everything is related to everything else, but near things are more related than distant things” [
35]. With that in mind, we divide the hyperspectral image into multiple local windows, considering only the relation between the central pixel and the surrounding pixels in the local window, ignoring the pixels outside the local window.
Figure 3 shows the local window of hyperspectral image. Assume that the hyperspectral image with
n pixels, equals to
r × s pixels, is arranged on
r × s grids. The spatial coordinate of the top left corner grid is (0, 0). The grid spatial coordinate is increased from left to right, from top to bottom until the coordinate of the bottom right corner grid is (
r,
s). In
Figure 3, a black grid and red grids constitute a local window in which the black grid represents central pixel and red grids represent surrounding pixels. The size of the local window can be expressed by the product of the number of grids at the outermost edge of the local window. For example, the size of the local window in
Figure 3 is 5 × 5.
and
are the hyperspectral image pixels which represent spectral signals. Let us consider the case in which pixel
is the central pixel of a local window. If the pixel
is a surrounding pixel of pixel
in the local window, we define
. If the pixel
does not belong to surrounding pixels of pixel
in the local window, we define
Then we model the intrinsic structure of the local window of the hyperspectral image.
and
are connected by an edge if the pixel
is a surrounding pixel of pixel
in the local window. The edges can be weighted by the heat kernel [
25]. The heat kernel is expressed as follows:
where σ is a scaling parameter of the heat kernel.
,
are the
ith and
jth column vectors of the hyperspectral image
. It is known from Equation (3) that when the pixel
is in the local window, it is connected to the central pixel
and the strength of the connection can be weighed by the heat kernel. When the pixel
is outside the local window, its connection to the central pixel does not exist. This is consistent with the first law of geography. The σ value adapted to different intrinsic structures of local windows is given by the following equation:
where
h is the number of surrounding pixels of the central pixel
in the local window.
A hyperspectral image not only contains abundant spectral information of ground objects, but also includes spatial correlations between each pixel. Utilization of spatial correlations between pixels will be incorporated in modeling the intrinsic structure of hyperspectral image. Inspired by the above principle, the closer the surrounding pixel
is to the spatial distance of the central pixel
, the more the surrounding pixel
contributes to the central pixel
. It is also obvious that the closer the spectral similarity between the surrounding pixel
and the central pixel
, the more the surrounding pixel
contributes to the central pixel
. Assume that (
p,
q) and (
b,
c) are the spatial coordinates of the central pixel
and the surrounding pixel
, respectively. Then
and
also can be expressed as
x(
p,
q) and
x(
b,
c). Accounting for the effects of spectral information and spatial information, how much the surrounding pixel
contributes to the central pixel
can be reflected by Equations (5)–(7):
where <
x(
p,
q),
x(
b,
c)> denotes the inner product of the two spectra, and
denotes the vector magnitude.
value characterizes spatial distance difference between the central pixel
and the surrounding pixel
. The smaller the value of
, the closer the surrounding pixel
is located with respect to the central pixel
and the more contribution pixel
has to the central pixel
.
represents the spectral angle distance between
and
Spectral angle distance reflects the difference in the geometric characteristics of the two spectral vectors. If the
value is smaller, the more similar the geometric feature of the surrounding pixel
is to the central pixel
and the more contribution the pixel
has to the central pixel
.
is a simplified representation of the degree of the contribution. Equation (5) is convenient to calculate and can reflect the effect of the surrounding pixel
on the central pixel
. Accounting for the effect of the surrounding pixel
on the central pixel
, we promote Equations (3)–(8):
If the value of is larger, the surrounding pixel is more closely related to the central pixel . Compared with Equation (3), Equation (8) better reveals spectral geometric differences, spatial distance information, and spectral similarity of pixel pairs in the local window.
The spectral information of the pixel reflects the physical and chemical characteristics of the ground objects in the hyperspectral image. However, the spectral curve of the same ground object will change under different environments, which is called endmember spectral variability.
Figure 4 shows the spectral curves of trees and rocks. As can be seen from
Figure 4, curve 1 of the blue line is the spectral curve of trees under weak sunlight, curve 2 of orange line is the spectral curve of trees under strong sunlight, and curve 3 of gray line is the spectral curve of rocks. Because of the change in the intensity of solar radiation, the spectral curve of the tree mutates. The spatial position of the three ground objects of
Figure 4 is as shown in
Figure 5. It is assumed that the ground object of the central pixel is tree1 whose spectral information is curve 1. The ground objects of the two surrounding pixels are tree2 and rock whose spectral curves are curve 2 and curve 3, respectively.
We use Equations (3) and (8) to project the relation between the central pixel and the surrounding pixels into the low-dimensional space. The resulting projection values are shown in
Table 1. After projecting by Equation (3), projection value of the relation between tree1 and tree2 is relatively equal to projection value of the relation between tree1 and rock. According to the projection results of Equation (3), we think that the effect of the two surrounding pixels on the central pixel is nearly the same, and that the ground objects of the two surrounding pixels are close, which is obviously contrary to the actual situation. After projecting by Equation (8), projection value of the relation between tree1 and tree2 is much larger than projection value of the relation between tree1 and rock. According to the projection results of Equation (8), compared with the surrounding pixel including the curve 3, we determine that the surrounding pixel containing the curve 2 is closer to the central pixel, which is in accordance with the real situation.
Hyperspectral image can be segmented into homogeneous and transition areas [
27,
28,
36]. Pixels in homogeneous areas have large values relatively close to each other [
28]. Pixels in transition areas have relatively small values compared with homogeneous pixels [
28]. Segmenting the homogeneous and transition areas of hyperspectral images is conductive to improve spectral unmixing accuracy [
27,
28]. For each central pixel
with coordinate (
p,
q), the whole relations with its all surrounding pixels in the local window can be calculated in the Equation (9).
can be used to calculate the strength of the central pixel
connection with surrounding pixel
. The whole relations between the central pixel
with coordinate (
p,
q) and its all surrounding pixels in the local window can be denoted as
or
:
Given hyperspectral image with
r × s pixels, the whole relations of all local windows of hyperspectral image can be described as
R matrix in the Equation (10).
Figure 6 shows simulated hyperspectral image and whole relations of all local windows of simulated spectral image according to Equation (10). We see that
Figure 6b inherits the spatial distribution characteristics of the simulated hyperspectral image. In
Figure 6b, yellow represents a larger value and blue represents a smaller value. According to the definition of homogeneous and transition areas, the areas of dark yellow and light yellow are homogeneous areas while the blue areas are transition areas. Therefore,
R matrix can segment the homogeneous and transition areas of hyperspectral image:
Based on the above analysis, we choose Equation (8) as the projection equation, which will project the hyperspectral image into the low dimensional space. We determine each pixel of the hyperspectral image as the central pixel to create a local window with size of 5 × 5. Then we use the projection Equation (8) to project the relation between the central pixel and the surrounding pixels into the low-dimensional space to model the intrinsic structure of hyperspectral images.
2.3. PISINMF Algorithm Model
Spectral unmixing is designed to extract endmember spectrum and corresponding abundance from hyperspectral image data, in accordance to the linear mixture model expressed by Equation (1). NMF has been introduced into spectral unmixing, which aims at obtaining endmember matrix
and abundance matrix
to approximately represent the given non-negative matrix
. In order to measure the degree of approximation, the loss function based on the square of the Euclidean distance between
and
is expressed as:
where the operator
represents the Frobenius norm.
Unfortunately, because of the non-convexity of NMF, the algorithm has a large number of local minima. In order to reduce the influence of the non-convexity of NMF on spectral unmixing accuracy, sparsity constraints are introduced into standard NMF [
14,
20]. In [
20], L
1/2 regularizer has been proven to provide the best sparse solution. Therefore, L
1/2 regularizer is adopted as sparsity constraint for improving the sparsity of endmember abundances. Thus, NMF with sparsity constraints is written as:
where
is the regularization parameter which weights the contribution of
.
is defined in the Equation (13):
Motivated by a temperature schedule in the simulated annealing technique, the regularization parameter
is defined in the Equation (14) to avoid getting stuck in local minima. The regularization parameter
shows its effectiveness in [
31]. Thus, it will be adopted as the regularization parameter that controls the impact of the sparsity measure function
:
where
and
are constants to regularize the impact of sparsity constraints and
is the iteration number in the process of optimization [
31].
Given the hyperspectral data
, the abundance data
, and the endmember matrix
G, the following relation exists in Equation (15):
is an individual pixel of hyperspectral image
.
is abundance data corresponding to endmembers. From the perspective of dimensionality reduction, the endmember matrix can be regarded as a new base matrix in the m-dimension space. The hyperspectral image is represented as a linear combination of the new base matrix.
can be regarded as the representation of
in m-dimension space. Equation (15) can be further promoted to Equation (16):
We learn from Equation (16) that the more similar
and
are, the more similar the abundance of given pixel
is to the abundance of given pixel
. Therefore, a close link between
and
reveals a close link between hyperspectral image and decomposed abundance matrix. Based on the above analysis, once
and
are close, the abundance
of pixel
and the abundance
of pixel
in the m-dimension space are close too. Inspired by the above analysis, a graph regularizer is introduced in PISINMF to keep local structure invariant between hyperspectral image and decomposed abundance matrix. Graph regularizer is written as:
where
represents the trace of matrix,
represents the transpose of matrix,
is weight matrix in which
is element,
D is a diagonal matrix and
.
To make PISINMF have the ability to preserve the intrinsic structure invariant, the regularization term of Equation (17) obtained in the previous section is incorporated into the Equation (12). The cost function of PISINMF is written as:
where
is the regularization parameter of the graph regularizer.
2.5. Implementation Issue
The multiplicative rules of PISINMF algorithm model do not consider the sum over the columns of abundance to be unity. To make the results of PISINMF algorithm model more accurate,
X and
G are replaced by
and
in Equation (25):
where
is a
n (
m)-dimensional column vector of all 1 s.
δ in Equation (25) is the weight value and can control the influence of the sum over the columns of abundance to be unity on the objective function. In the implementation, a relatively small value of
δ (i.e.,
) will cause the sum of each column of
A to be much smaller than one. On the contrary, a larger value of
δ (i.e.,
) will cause the sum of each column of
A to be closer to one, but it reduces the convergence rate. So
δ is selected as 50 to meet the needs of precision and efficiency in PISINMF model.
In general, the initial value of the endmember matrix G can be calculated by the endmember extraction algorithm or randomly chosen data. In PISINMF algorithm model, we use the VCA algorithm or other endmember extraction algorithms to set the initial value of endmember matrix G.
The initial value of the abundance matrix
A can be calculated using the least squares method expressed as following Equation (26):
To make the initial value of the abundance matrix
A non-negative, we force the matrix
A to be non-negative according to Equation (27):
The PISINMF algorithm model sets two stop criteria in the iterative optimization process. One of the stop criteria is the maximum number of iterations, another stop criteria is that a threshold
τ should be specified for the Equation (28):