1336
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 46, NO. 5, MAY 2008
Model-Based Satellite Image Fusion
Henrik Aanæs, Johannes R. Sveinsson, Senior Member, IEEE, Allan Aasbjerg Nielsen,
Thomas Bøvith, and Jón Atli Benediktsson, Fellow, IEEE
Abstract—A method is proposed for pixel-level satellite image
fusion derived directly from a model of the imaging sensor. By
design, the proposed method is spectrally consistent. It is argued
that the proposed method needs regularization, as is the case for
any method for this problem. A framework for pixel neighborhood
regularization is presented. This framework enables the formulation of the regularization in a way that corresponds well with
our prior assumptions of the image data. The proposed method
is validated and compared with other approaches on several data
sets. Lastly, the intensity-hue-saturation method is revisited in
order to gain additional insight into what implications spectral
consistency has for an image fusion method.
Index Terms—Generative model, image fusion, satellite images,
sensor fusion, spectral consistency.
I. INTRODUCTION

Satellites provide very valuable data about the Earth, e.g., for environmental monitoring, weather forecasting, map making, and military intelligence. Satellites are expensive both to build and to operate. This implies that we are somehow obliged to do the best with the data we get from existing satellites, e.g., by combining the output from different sensors to increase the knowledge we can infer. In this paper, we deal with the merging of low spatial and high spectral resolution satellite images with high spatial and low spectral resolution ones, with the aim of creating images of high spatial and high spectral resolution. This is also known as image fusion, which is a subset of sensor or data fusion.

Image fusion can be done at several levels: pixel, feature, object, and decision levels, depending on the intended use of the fused image. In this paper, we are only concerned with pixel-level fusion, and when the terms “image fusion” or “fusion” are used, pixel-level fusion is intended. In the current context, fusion is the next step after preprocessing and the step before further processing, such as segmentation and classification.

As argued hereafter, satellite image fusion is an ill-posed inverse problem, implying that we do not have enough information to solve the problem. Thus, in a sense, we have to “make up” the missing pieces of information, which, in practice, implies the use of prior knowledge, either implicitly or explicitly.

To address this problem, the approach taken in this paper is to consider the generative models of the data, i.e., models of how the images are formed. This, in turn, allows us to distinguish between exactly what we do know and what we do not know regarding the desired fused image. This has the advantage that the proposed method uses the available information, and we can deliberately construct prior models dealing with what we do not know.

Manuscript received May 31, 2007; revised October 30, 2007. This work was supported in part by the Research Fund of the University of Iceland.
H. Aanæs is with the Department of Informatics and Mathematical Modelling, Technical University of Denmark, 2800 Lyngby, Denmark.
J. R. Sveinsson and J. A. Benediktsson are with the Department of Electrical and Computer Engineering, University of Iceland, 101 Reykjavík, Iceland.
A. A. Nielsen is with the National Space Institute, Technical University of Denmark, 2800 Lyngby, Denmark.
T. Bøvith is with the National Space Institute, Technical University of Denmark, 2800 Lyngby, Denmark, and also with the Danish Meteorological Institute, 2100 Copenhagen, Denmark.
Digital Object Identifier 10.1109/TGRS.2008.916475

A. Relation to Other Works
During the past two decades, several fusion techniques have been proposed. Most of these techniques are based on a compromise between the desired spatial enhancement and the spectral consistency. The intensity-hue-saturation (IHS) method [1]–[4] has been widely used for red, green, and blue (RGB) color-composite fusion. The IHS method is probably the most popular image fusion approach because of its low computational cost. The Brovey [5] sharpening method is a simple technique, and it can be performed on individual bands. Principal-component-analysis [6] and wavelet-based fusion methods [7]–[10] have also been proposed. All these methods fall in the category of component-substitution methods. All of them can cause color distortion if the spectral range of the intensity replacement image is different from the spectral range covered by the bands used in a color composite. That is to say, the fused images obtained by these methods have high spatial quality, but they suffer from spectral distortions. To achieve an image with high fidelity to the original spectral information, the smoothing filter-based intensity modulation fusion technique was proposed by Liu [8]. However, the images obtained by using that method are blurred compared with those of other fusion techniques. It was observed that this blurring was a direct result of inaccurate coregistration between the up-scaled multispectral and captured panchromatic images. A good introduction to satellite image fusion can be found in [11].
In this paper, a novel multisensor image fusion technique is presented. The method is model based, and the fused images obtained are spectrally consistent by design. The problem of satellite image fusion is an ill-posed problem, and therefore, regularization is necessary for its solution. For this purpose, a framework for pixel neighborhood regularization is presented in this paper. Using this framework, we can perform the regularization consciously and also formulate it in a way that corresponds well with our prior assumptions on the image data.
0196-2892/$25.00 © 2008 IEEE
B. Notation and Paper Outline
The scope of this paper is to estimate a high-resolution multichannel image from both low-resolution multichannel and high-resolution panchromatic images. The presented theory and methodology are valid for an arbitrary number of channels and an arbitrary ratio between the spatial resolutions of the low- and high-resolution images. For ease of presentation and without loss of generality, it is assumed (if nothing else is stated) that, in this paper, we work with three channels (red, green, and blue) and that the ratio between high and low spatial resolution pixels is one to four by four, i.e., 1 : 16. The notation used in this paper is as follows: we want to estimate $(R^{high}_{ij}, G^{high}_{ij}, B^{high}_{ij})$ from a low-resolution RGB image $(R^{low}_i, G^{low}_i, B^{low}_i)$ and a high-resolution panchromatic image $P^{high}_{ij}$. The indices are such that the first one, $i$, enumerates the low-resolution pixels, and the second one, $j \in \{1, \ldots, 16\}$, enumerates the associated high-resolution pixels.
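For concreteness, the index convention can be sketched as follows; the row-major ordering of both indices and the helper name are illustrative assumptions, not part of the paper.

```python
# Sketch of the index convention, assuming a 4x4 ratio and row-major order
# (both assumptions for illustration): low-resolution pixel i on an H x W
# grid and j in {1, ..., 16} within its 4x4 block map to a high-resolution
# pixel coordinate.
def high_res_coord(i, j, W, ratio=4):
    r_low, c_low = divmod(i, W)          # low-resolution row and column
    r_off, c_off = divmod(j - 1, ratio)  # offset within the 4x4 block
    return r_low * ratio + r_off, c_low * ratio + c_off
```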
The outline of this paper is as follows. In Section II, it is argued that satellite image fusion is inherently ill posed. In Section III, this is followed by the introduction of our sensor or observation model, with the constraints that we draw from it. An initial solution applying these constraints is given in Section IV. Section V presents additional ways of regularizing the solution by neighborhood comparison. In Section VI, the IHS method is set in the context of this paper, giving additional insight. An experimental comparison of our proposed method and a few selected other approaches from the literature is reported in Section VII. This is followed by a conclusion of this paper and an Appendix dealing with computational issues relating to the proposed method.
II. ILL-POSED PROBLEM

As mentioned earlier, we argue that any satellite image fusion method needs regularization. This is due to the image fusion problem being ill posed or underconstrained, which can be seen by simple variable counting. That is, for each low-resolution-image pixel $i$, we have three values from the low-resolution pixel $(R^{low}_i, G^{low}_i, B^{low}_i)$ and 16 panchromatic values, i.e., $19\,(= 16 + 3)$ in all. On the other hand, we need to estimate 16 RGB values, i.e., $48\,(= 16 \cdot 3)$, which gives us a shortage of $29\,(= 48 - 19)$ values. Because no other information is available, we have to produce these 29 dimensions or values by regularization, preferably by modeling our prior assumptions about the data. This regularization can be either explicit or implicit. For example, in the IHS method, regularization is implicitly done by assuming that the hue and saturation values of all high-resolution pixels are equal to those of the low-resolution pixel. That is not necessarily the right solution, and probably, it is not.
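The variable counting above can be written out directly (a sketch for the assumed 1 : 16 ratio):

```python
# Variable counting for one low-resolution pixel at a 4x4 ratio.
ratio = 4
known = 3 + ratio**2        # 3 low-resolution RGB values + 16 panchromatic values
unknown = 3 * ratio**2      # 16 high-resolution RGB pixels, 3 values each
shortage = unknown - known  # values that must be supplied by regularization
print(known, unknown, shortage)  # -> 19 48 29
```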
III. OBSERVATION MODEL

The methods presented here are based on an “assumed” observation model for the satellite image sensor, which can also be seen as a generative model of the data.
This observation model, which describes the response of a given pixel in a given data channel, is given by the frequency composition and intensity of the incoming light $L(x, \omega)$, where $x$ and $\omega$ denote position and frequency, respectively, and the spectral response function for that channel, $F(\omega)$,^1 cf. Fig. 1.
This response can be expressed as

$$\int_A \int_\Omega L(x, \omega) F(\omega)\, d\omega\, dx \qquad (1)$$

where the frequency $\omega$ runs over a relevant spectral range $\Omega$. The integral in (1) also runs over the area $A$ corresponding to the sensor area related to the pixel.
A. Inner Product Space
The response in (1) is a convolution between the spectrum of the incoming light $L(x, \omega)$ and the response function of the channel, $F(\omega)$. This induces an inner product, e.g.,^2

$$\langle F^{red}, F^{pan} \rangle = \int_\Omega F^{red}(\omega) F^{pan}(\omega)\, d\omega \qquad (2)$$

corresponding to the $L_2$ norm^3 [13], which introduces a measure of angle and distance. In using this, we are only interested in the observable 4-D subspace spanned by the R, G, B, and P channels. Therefore, as an example, a measure of how much information about $G^{high}$ is contained in $P^{high}$ is given by^4

$$\alpha(G, P) = \frac{\langle F^{green}, F^{pan} \rangle}{\sqrt{\langle F^{green}, F^{green} \rangle \langle F^{pan}, F^{pan} \rangle}} \qquad (3)$$

which is equal to the cosine of the angle between the two spectra. Collecting these normalized inner products between all channels gives

$$\Sigma = \begin{bmatrix} \alpha(R, R) & \alpha(R, G) & \alpha(R, B) & \alpha(R, P) \\ \alpha(G, R) & \alpha(G, G) & \alpha(G, B) & \alpha(G, P) \\ \alpha(B, R) & \alpha(B, G) & \alpha(B, B) & \alpha(B, P) \\ \alpha(P, R) & \alpha(P, G) & \alpha(P, B) & \alpha(P, P) \end{bmatrix} \qquad (4)$$

which, in some sense, describes the information shared between the different channels. In Section IV, this matrix will be interpreted as a correlation matrix, for ease of computation, without us claiming that the spectra are normally distributed.
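As a sketch, (2)–(4) can be discretized as follows; the Gaussian-shaped response curves below are illustrative stand-ins, not the real sensor curves, which are supplied by the manufacturer.

```python
import numpy as np

# Hypothetical spectral response curves sampled on a wavelength grid.
omega = np.linspace(400.0, 900.0, 501)   # wavelength samples (nm)
d_omega = omega[1] - omega[0]

def gaussian_band(center, width):
    return np.exp(-0.5 * ((omega - center) / width) ** 2)

F = {
    "R": gaussian_band(660.0, 30.0),
    "G": gaussian_band(550.0, 30.0),
    "B": gaussian_band(480.0, 30.0),
    "P": gaussian_band(650.0, 150.0),    # broad panchromatic band
}

def inner(a, b):
    # Discretized inner product (2): integral of F_a(w) * F_b(w) dw.
    return float(np.sum(F[a] * F[b]) * d_omega)

def alpha(a, b):
    # Normalized inner product (3): cosine of the angle between the spectra.
    return inner(a, b) / np.sqrt(inner(a, a) * inner(b, b))

channels = ["R", "G", "B", "P"]
Sigma = np.array([[alpha(a, b) for b in channels] for a in channels])  # (4)
```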
B. Spectral Consistency
Assuming that the low- and high-resolution pixels are perfectly aligned, i.e., the areas associated with the different pixels
^1 $F(\omega)$ is usually supplied by the sensor manufacturer.
^2 The response functions are independent of position. Therefore, there is no need to integrate over area.
^3 This is an extension of the regular two-norm to an infinite-dimensional space. Later, we will just refer to this as the two-norm.
^4 We assume that there is an individual linear stretch of each channel, such that it best fits the dynamic range. Thus, we normalize this criterion to account for this.
Fig. 2. Here, it is illustrated that, because the same light is supposedly hitting
the low and high spatial resolution sensors, the color of the low-resolution
pixel is equal to the average of the corresponding high-resolution pixels, as
expressed in (5).
Fig. 1. Spectral response of the IKONOS spectral bands in [12].
satisfy (see Fig. 2)

$$A_i = \bigcup_j A_{ij} \quad \text{and} \quad \emptyset = \bigcap_j A_{ij}.$$

Another property derived from (1) is that of spectral consistency. This property is very important for satellite image fusion because spectral distortion may result in unreliable interpretation for applications. By considering any channel, e.g., the blue, spectral consistency can be derived as follows:

$$B^{low}_i = \int_{A_i}\!\int_\Omega L(x, \omega) F(\omega)\, d\omega\, dx = \sum_{j=1}^{16} \int_{A_{ij}}\!\int_\Omega L(x, \omega) F(\omega)\, d\omega\, dx = \sum_{j=1}^{16} B^{high}_{ij}.$$

Therefore, the spectral consistency constraint states that^5

$$\begin{bmatrix} R^{low}_i \\ G^{low}_i \\ B^{low}_i \end{bmatrix} = \frac{1}{16} \sum_{j=1}^{16} \begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix}. \qquad (5)$$
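A sketch of checking (5) numerically, assuming a 4x4 ratio and images already normalized to a common scale (the function name and array layout are our own):

```python
import numpy as np

# Check of the spectral consistency constraint (5): each low-resolution RGB
# pixel must equal the mean of its 16 high-resolution pixels.
def spectrally_consistent(low_rgb, high_rgb, ratio=4, tol=1e-6):
    """low_rgb: (H, W, 3); high_rgb: (ratio*H, ratio*W, 3)."""
    H, W, _ = low_rgb.shape
    # Average each ratio x ratio block of the high-resolution image.
    blocks = high_rgb.reshape(H, ratio, W, ratio, 3).mean(axis=(1, 3))
    return bool(np.allclose(blocks, low_rgb, atol=tol))
```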
IV. INITIAL SOLUTION

Deriving an image fusion method from the aforementioned observation model is hindered by $(R^{high}, G^{high}, B^{high})$ not being fully observable from the data, as argued in Section II. Therefore, regularization is required. The same is also illustrated by the spectral response functions of the sensor, cf. Fig. 1. Here, it is seen that there are several compositions of incoming light giving the same response on the panchromatic channel but highly different ones on, e.g., the red channel. Thus, $P^{high}$ and (5) do not give enough information to fully determine the individual high-resolution multispectral components, e.g., $R^{high}$.

Here, it is assumed that the “unknown” part^6 of the signal is small. Therefore, we regularize the solution by penalizing it with the two-norm, because this is the norm induced by the inner product. This implicitly corresponds to casting the problem in a statistical setting, assuming that an “observation” in $(R^{high}, G^{high}, B^{high}, P^{high})$ space is normally distributed with covariance $\Sigma$ (4). Thus, the conditional means method [14] is used to solve the problem. Due to the spectral consistency, we know that the mean of $(R^{high}_{ij}, G^{high}_{ij}, B^{high}_{ij})$ is equal to $(R^{low}_i, G^{low}_i, B^{low}_i)$. This gives the following solution to the fusion problem:

$$\begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} = F_{ij} = \begin{bmatrix} R^{low}_i \\ G^{low}_i \\ B^{low}_i \end{bmatrix} + \begin{bmatrix} \alpha(R, P) \\ \alpha(G, P) \\ \alpha(B, P) \end{bmatrix} \left( P^{high}_{ij} - P^{\mu}_i \right) \qquad (6)$$

where $P^{\mu}_i$ denotes the mean of the $P^{high}_{ij}$ w.r.t. the low-resolution pixel $i$. For ease of notation, this solution is denoted by $F_{ij}$. The results of this proposed method can be seen in Figs. 5–7 (where this solution is referred to as “no smoothing”).

A couple of comments should be made w.r.t. the solution in (6). First, we do not claim that the observations are normally distributed, just that a reasonable regularization corresponds to it. Second, if there is no “overlap” between the spectral responses of the panchromatic and one of the low-resolution channels, then $\alpha(\cdot, P)$ will be zero. This indicates that there is no information between the two channels and, in effect, omits the last term in (6) for the channel in question. Lastly, it is seen that the solution is spectrally consistent, cf. (5), by construction, because

$$P^{\mu}_i = \frac{1}{16} \sum_{j=1}^{16} P^{high}_{ij} \;\Rightarrow\; \sum_{j=1}^{16} \left( P^{high}_{ij} - P^{\mu}_i \right) = 0.$$

^5 The 1/16 is due to the appropriate normalization of the channels.
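The initial solution (6) can be sketched as follows; the 4x4 ratio, the function name, and the illustrative α values are our assumptions.

```python
import numpy as np

# Sketch of the initial (no-smoothing) solution (6), assuming a 4x4 ratio.
# `alphas` stands in for the normalized inner products alpha(R,P),
# alpha(G,P), alpha(B,P) from (3); here they are illustrative values.
def fuse_no_smoothing(low_rgb, pan, alphas, ratio=4):
    """low_rgb: (H, W, 3); pan: (ratio*H, ratio*W); returns (ratio*H, ratio*W, 3)."""
    H, W, _ = low_rgb.shape
    # P_mu: mean of the panchromatic values over each low-resolution pixel i.
    p_mu = pan.reshape(H, ratio, W, ratio).mean(axis=(1, 3))
    # Upsample the low-resolution color image and P_mu to the high-res grid.
    up = lambda a: np.repeat(np.repeat(a, ratio, axis=0), ratio, axis=1)
    residual = pan - up(p_mu)                       # P^high_ij - P^mu_i
    fused = up(low_rgb) + residual[..., None] * np.asarray(alphas)
    return fused
```

Because the residual has zero mean over each block, the block means of the fused image equal the low-resolution image, so (5) holds regardless of the α values.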
^6 This “unknown” part is orthogonal to the panchromatic channel.

V. SMOOTHING PRIORS

Although the aforementioned initial solution is spectrally consistent, it still leaves something to be desired, particularly because there is a pronounced blocking effect originating from
the low-resolution pixels, as seen in Figs. 5–7. As argued in
Section VI, this is likely to be the best that can be done on a
pixelwise basis. In order to improve this, we propose the augmentation of the method with a spatial prior. This spatial prior
should express our expectation that the fused image should be
homogeneous or piecewise homogeneous, which corresponds
to our general expectation of images.
Specifically, by applying a Markov random-field setting [15], [16] to the problem, the (conditional) smoothing of the image can be expressed as penalizing deviations between neighboring pixels. That is, define the errors

$$\epsilon_{ijk} = \begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} - \begin{bmatrix} R^{high}_{k} \\ G^{high}_{k} \\ B^{high}_{k} \end{bmatrix}, \qquad k \in N_{ij} \qquad (7)$$

where $N_{ij}$ is the four-neighborhood of pixel $ij$. We then regularize by adding the term

$$\sum_{i,j} \sum_{k \in N_{ij}} \rho(\epsilon_{ijk}) w_{ijk} \qquad (8)$$

to (6), where the $w_{ijk}$ denote appropriate weights, and $\rho$ is the error function

$$\rho(\epsilon_{ijk}) = \epsilon^T_{ijk} \tilde{\Sigma}^{-1} \epsilon_{ijk}. \qquad (9)$$

Here, $\tilde{\Sigma}$ is the RGB part of $\Sigma$, cf. (4). This regularizer is seen as a $w_{ijk}$-dependent smoothing of the image. The reason that $\tilde{\Sigma}$ is included in (8) is that $\rho(\epsilon_{ijk})$ should measure differences in incoming light and not be biased by the correlation of the measured channels.
The spatial prior is merged with the solution in (6) by combining (6) and (8), i.e., define the deviation from $F_{ij}$ by

$$D_{ij} = \left( \begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} - F_{ij} \right)^T \tilde{\Sigma}^{-1} \left( \begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} - F_{ij} \right) \qquad (10)$$

then, the combined objective function becomes

$$\min_{R^{high}, G^{high}, B^{high}} \sum_{ij} \left( D_{ij} + \gamma \sum_{k \in N_{ij}} \rho(\epsilon_{ijk}) w_{ijk} \right) \qquad (11)$$

where $\gamma$ is a tuning parameter encapsulating the weight of the smoothness term relative to the data term. Note that this is a least squares (LS) problem in $R^{high}, G^{high}, B^{high}$, and (11) is solved under the condition that the solution is spectrally consistent. This latter characteristic is dealt with in Section V-A. In the experiments relating to this paper, $\gamma$ was selected in the range of one to five. The value of $\gamma$, however, depends highly on the intrinsic scaling of the two combined terms. A different range for $\gamma$ may be appropriate in other problems.
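As a sketch, the objective (11) can be evaluated for a candidate image as follows; for brevity, uniform weights and an identity $\tilde{\Sigma}$ are assumed here, which the paper does not assume in general.

```python
import numpy as np

# Sketch of evaluating the objective (11) for a candidate fused image,
# assuming unit weights w_ijk and Sigma_tilde = identity for brevity.
def objective(fused, F, gamma=1.0):
    """fused, F: (H, W, 3) arrays; F is the initial solution from (6)."""
    d = fused - F
    data_term = np.sum(d * d)                    # sum of D_ij with identity Sigma
    # Four-neighborhood smoothness: squared differences to right/down neighbors.
    dx = fused[:, 1:, :] - fused[:, :-1, :]
    dy = fused[1:, :, :] - fused[:-1, :, :]
    smooth_term = np.sum(dx * dx) + np.sum(dy * dy)
    return float(data_term + gamma * smooth_term)
```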
The problem is now reduced to setting the appropriate weights $w_{ijk}$ to best capture the prior assumption of the fused image. Usually, these weights are set simultaneously with solving (11), because the smoothed solution is the best clue to where the edges in the image should be, cf., e.g., [15] and [16]. In general, the optimization problem in (11) is thus “big” and nonlinear. This is not the case here, because we have the high-resolution panchromatic image, which is a very good source of information on where the edges should be. Here, the weights $w_{ijk}$ are thus based on the panchromatic image, such that the fused image shall have homogeneous regions where the panchromatic image has them. To illustrate this, we have tried two general approaches, cf. Sections V-B and V-C. The results with uniform smoothing (all $w_{ijk} = 1$) are also presented for comparison. The results are given in Section VII. By general approaches, we mean that the two investigated approaches are good typical examples of what might be used; they are not the result of a long investigation into what is the best weighting scheme.
A. Ensuring Spectral Consistency
Extending (6) with (11) implies that the LS solution is not spectrally consistent, cf. (5); spectral consistency therefore has to be added as a specific condition on the solution. This is done by noting that (5) defines a 15-D hyperplane in 16 dimensions on which the solution should be located. We can thus reparameterize the solution space such that the only possible values are on this hyperplane. This is accomplished by introducing

$$\tilde{R}^{high}_{ij},\; \tilde{G}^{high}_{ij},\; \tilde{B}^{high}_{ij}, \qquad j \in \{1, \ldots, 15\}$$

which map to the pixels of the fused image as follows:

$$\begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} = \begin{bmatrix} \tilde{R}^{high}_{ij} \\ \tilde{G}^{high}_{ij} \\ \tilde{B}^{high}_{ij} \end{bmatrix}, \qquad j \in \{1, \ldots, 15\}$$

$$\begin{bmatrix} R^{high}_{i,16} \\ G^{high}_{i,16} \\ B^{high}_{i,16} \end{bmatrix} = 16 \begin{bmatrix} R^{low}_i \\ G^{low}_i \\ B^{low}_i \end{bmatrix} - \sum_{j=1}^{15} \begin{bmatrix} \tilde{R}^{high}_{ij} \\ \tilde{G}^{high}_{ij} \\ \tilde{B}^{high}_{ij} \end{bmatrix}. \qquad (12)$$

An important property of this mapping is that it is affine. By plugging (12) into (11), a new objective function in $\tilde{R}^{high}_{ij}, \tilde{G}^{high}_{ij}, \tilde{B}^{high}_{ij}$, which is spectrally consistent by design, and an LS system are obtained. This, in turn, has nice implications for computing a solution (see the Appendix). It should be noted that solving (11) without this parameterization (or another expression of the spectral consistency constraint) will, in general, yield non-spectrally-consistent solutions. It is noted that this parameterization is intrinsic in the predictor for satellite image fusion presented in [17], where it is also noted that it has the desirable characteristic of spectral consistency.
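A sketch of the reparameterization (12) for a single low-resolution pixel; only the first 15 high-resolution RGB values are free, and the 16th is determined so that (5) holds by construction (helper name and array layout are ours):

```python
import numpy as np

# Affine reparameterization (12) for one low-resolution pixel: the 16th
# high-resolution value is fixed so the block mean equals the low-res value.
def expand_block(free_15, low_rgb_pixel):
    """free_15: (15, 3) free variables; low_rgb_pixel: (3,). Returns (16, 3)."""
    last = 16 * np.asarray(low_rgb_pixel) - free_15.sum(axis=0)
    return np.vstack([free_15, last])
```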
B. Edge-Induced Weights
We are, in a sense, trying to transfer edges from the panchromatic image to the fused RGB image. Thus, the first of the general approaches we use for setting the weights in (11) is as follows.

1) Extract edges in the panchromatic image, see Fig. 3, top center.
2) If there is an edge between pixel $ij$ and its $k$th neighbor, set the weight to zero; otherwise, set the weight to one.

The extracted edges are obtained via the Canny edge detector [18]. There are, however, a multitude of methods for edge extraction, many of which could be used successfully, e.g., the total variation method [19], as used in [20]. The results using the Canny edges are shown in Fig. 3 and Figs. 5–7.

Fig. 3. Sample from the satellite image data set. (Top left) Panchromatic image. (Top center) Extracted edges via the Canny edge detector. (Top right) Weights set via (13), with white being zero. Here, a λ of 0.05 is used. (Bottom left) Low-resolution multispectral image. (Bottom center) Resulting fused image with weights set according to the Canny edges (see top center). (Bottom right) Resulting fused image with weights set via (13) (see top right).
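A sketch of the edge-induced weighting; a thresholded gradient magnitude stands in for the Canny detector [18] used in the paper, and the link layout is our own:

```python
import numpy as np

# Edge-induced weights: one everywhere, except zero across an edge in the
# panchromatic image. A thresholded gradient magnitude is a simple stand-in
# for the Canny edge detector used in the paper.
def edge_weights(pan, threshold):
    gy, gx = np.gradient(pan.astype(float))        # per-axis gradients
    edges = np.hypot(gx, gy) > threshold           # boolean edge map
    # Weight for the link between vertical / horizontal neighbors:
    # zero if either endpoint lies on an edge.
    w_vert = ~(edges[:-1, :] | edges[1:, :])       # pixel (r, c) vs (r+1, c)
    w_horz = ~(edges[:, :-1] | edges[:, 1:])       # pixel (r, c) vs (r, c+1)
    return w_vert.astype(float), w_horz.astype(float)
```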
C. Gradient-Induced Weights
Setting the weights via edge extraction gives results that are somewhat blocky at sharp edges, as can be seen in Fig. 3 and Figs. 5–7. This is partly due to the weights being either zero or one. Therefore, the algorithm either fully smooths between two pixels or does not smooth at all. Thus, it seems natural to let the weights have a continuous transition from zero to one and, by that, introduce the concept of “degree of edginess.” That is the basis for our second general weighting scheme.
The simplest form of the second weighting scheme would be letting the weights be proportional to the inverse gradient magnitude of the panchromatic image, i.e.,

$$w_{ijk} \propto \frac{1}{\left\| \nabla P^{high}_{ij} \right\|}$$

which, however, did not turn out too successful on our data.
Instead, we adapted a method from nonlinear image diffusion [21], which was proposed in [22]. Here, the amount of smoothing is a nonlinear function of the image gradient magnitude, here that of the panchromatic image. The function in question is

$$w_{ijk} = 1 - \exp\left( \frac{-3.31488}{\left( \left\| \nabla P^{high}_{ij} \right\| / \lambda \right)^4} \right) \qquad (13)$$

where $\lambda$ is a steering parameter, indicating how large the gradient should be in order to count as an edge, see Fig. 4. The results of this scheme can be seen in Fig. 3 and Figs. 5–7. A note on (13): this function, including the constant −3.31488, has proven successful in the nonlinear diffusion literature; for more on this matter, see [21].

Fig. 4. The function in (13) as a function of the gradient magnitude $\|\nabla P^{high}_{ij}\|$, with λ = 0.5, which is denoted by the dotted line.
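The weight function (13) for a single link can be sketched directly:

```python
import math

# The gradient-induced weight (13), adapted from the nonlinear diffusion
# literature: ~1 for small gradients, -> 0 across strong edges.
def gradient_weight(grad_mag, lam):
    if grad_mag == 0.0:
        return 1.0                  # no gradient: smooth fully
    return 1.0 - math.exp(-3.31488 / (grad_mag / lam) ** 4)
```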
Fig. 5. Results of the image fusion strategies. A sample of the resulting images of the IKONOS data set. (Top left) Input panchromatic image. (Top right) No smoothing. (Row 2 left) Uniform weights. (Row 2 right) Edge-induced weighting. (Row 3 left) Gradient-induced weighting. (Row 3 right) IHS method. (Bottom left) Mean-corrected IHS method. (Bottom right) Brovey method. The images have been stretched for better visualization.

Fig. 6. Results of the image fusion strategies. A sample of the resulting images of the QuickBird data set. (Top left) Input panchromatic image. (Top right) No smoothing. (Row 2 left) Uniform weights. (Row 2 right) Edge-induced weighting. (Row 3 left) Gradient-induced weighting. (Row 3 right) IHS method. (Bottom left) Mean-corrected IHS method. (Bottom right) Brovey method. The images have been stretched for better visualization.
VI. NOTE ON THE IHS METHOD

Probably the most popular method for image fusion is the IHS method, where the fusion is performed in IHS space by letting the hue and saturation of a fused pixel be equal to those of the corresponding low-resolution pixel and the intensity $I_{ij}$ be equal to the panchromatic channel, i.e.,
$$I_{ij} = \frac{R^{high}_{ij} + G^{high}_{ij} + B^{high}_{ij}}{3} = P^{high}_{ij}. \qquad (14)$$

Combining (5) and (14), it is seen that

$$\frac{1}{16} \sum_{j=1}^{16} P^{high}_{ij} = \frac{1}{16} \sum_{j=1}^{16} \frac{R^{high}_{ij} + G^{high}_{ij} + B^{high}_{ij}}{3} = \frac{R^{low}_i + G^{low}_i + B^{low}_i}{3} = I_i. \qquad (15)$$
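As a numerical sketch of this block-mean requirement, a panchromatic image can be shifted per low-resolution block so that each block mean equals the low-resolution intensity; the 4x4 ratio, the helper name, and the additive form of the correction are our own assumptions here.

```python
import numpy as np

# Shift each 4x4 panchromatic block so that its mean equals the
# low-resolution intensity I_i = (R_i + G_i + B_i) / 3; cf. (15).
def mean_corrected_pan(pan, low_rgb, ratio=4):
    """pan: (ratio*H, ratio*W); low_rgb: (H, W, 3)."""
    H, W, _ = low_rgb.shape
    intensity = low_rgb.mean(axis=2)                       # I_i per low-res pixel
    p_mu = pan.reshape(H, ratio, W, ratio).mean(axis=(1, 3))
    up = lambda a: np.repeat(np.repeat(a, ratio, axis=0), ratio, axis=1)
    return pan - up(p_mu) + up(intensity)
```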
This implies that, if the IHS method is to be spectrally consistent, the mean of the panchromatic channel over a low-resolution pixel should equal the corresponding low-resolution intensity. This will generally not be the case, as shown in Fig. 1, where the spectral response function of the panchromatic channel is far from the sum of the red, green, and blue response functions. This fact is also validated experimentally, as reported in Table II.

That leads us to propose a modification of the IHS method,^7 which consists of appropriately correcting the mean of the panchromatic channel. The correction is done by using $\tilde{P}^{high}_{ij}$ instead of $P^{high}_{ij}$, where $\tilde{P}^{high}_{ij}$ is defined as follows:

$$\tilde{P}^{high}_{ij} = P^{high}_{ij} - \frac{1}{16} \sum_{j=1}^{16} P^{high}_{ij} + I_i = I_{ij}. \qquad (16)$$

It is seen that using $\tilde{P}^{high}$ instead of $P^{high}$ in the IHS method gives a spectrally consistent result.

The results of using this mean-corrected IHS method are seen in Figs. 5–7. It is seen that this method, although spectrally consistent, yields a blocky result that is similar to that of our initial solution presented in Section IV. This seems to indicate that, in order to get a spectrally consistent result with a simple per-pixel method, you need to endure a blocking effect. An explanation is that the discrepancy between the RGB response functions and the panchromatic response function, as seen in Fig. 1, gives a varying mean offset for each block.

This mean-corrected IHS method, in some sense, resembles the Brovey method or transform, cf. [5] and [23], which is thus included for experimental comparison. The Brovey method is given by

$$\begin{bmatrix} R^{high}_{ij} \\ G^{high}_{ij} \\ B^{high}_{ij} \end{bmatrix} = \begin{bmatrix} R^{low}_i \\ G^{low}_i \\ B^{low}_i \end{bmatrix} \frac{P^{high}_{ij}}{R^{low}_i + G^{low}_i + B^{low}_i}. \qquad (17)$$

It should also be noted that other methods have been proposed for addressing the issue of spectral consistency with the IHS method, cf., e.g., [24].

^7 Partly for gaining insight into what spectral consistency implies for satellite image fusion methods.

VII. EXPERIMENTAL COMPARISONS AND DISCUSSION

To experimentally investigate and compare the proposed methods, we have applied them to three different data sets from three different types of satellites, each depicting a different type of landscape. The proposed methods were compared with the IHS and Brovey methods. The investigated methods are summarized in Table I. The three data sets are as follows.

Fig. 7. Results of the image fusion strategies. A sample of the resulting images of the Meteosat data set. (Top left) Input panchromatic image. (Top right) No smoothing. (Row 2 left) Uniform weights. (Row 2 right) Edge-induced weighting. (Row 3 left) Gradient-induced weighting. (Row 3 right) IHS method. (Bottom left) Mean-corrected IHS method. (Bottom right) Brovey method. The images have been stretched for better visualization.

1) IKONOS. An image of an urban area taken by the IKONOS earth imaging satellite [12]. The low-resolution image consists of four bands, R, G, B, and near-infrared (NIR) (in this paper, we only use R, G, and B), and the ratio between the panchromatic and low-resolution images is 16, or four by four.
2) QuickBird. An image of a forest landscape bisected by roads and containing a few large buildings. A city emerges toward the right side. This image was taken by the QuickBird satellite [25]. The low-resolution image consists of four bands, R, G, B, and NIR (in this paper, we use only R, G, and B), and the ratio between
the panchromatic and low-resolution images is 16, or four by four.
3) Meteosat. A weather satellite image depicting Europe, from the Meteosat satellite [26]. We use the R, G, and B bands at low resolution, and the ratio between the panchromatic and low-resolution images is nine, or three by three.

Using these different data sets gives us an idea of how a given method will work on data sets with different contents. The reason that we did not use the fourth band in the IKONOS and QuickBird cases was visualization purposes. We have, however, implemented and applied our fusion methods successfully to all four bands.

TABLE I. The following methods were applied in the experiments.

TABLE II. Spectral consistency of the IHS and Brovey methods, measured as the cross correlation between the low-resolution RGB and appropriately down-sampled fused images. All other applied methods, cf. Table I, are spectrally consistent by design, which has been confirmed experimentally.
In evaluating the different methods, we were faced with the problem that we did not have any ground truth data, i.e., a high-resolution color image, to compare with. Due to this shortcoming, we have chosen two different strategies. First, we applied the methods to the original data sets, computed the spectral consistency (see Table II), and visually evaluated the results (see Figs. 5–7). Second, we down-sampled the original data to generate a new data set with ground truth, albeit at a different intrinsic scale. The difference between this ground truth and the fused results was measured with different metrics (see Table III).
In the experiments, the γ in (11) was set to five and one for the original and down-sampled data,^8 respectively. In the Canny edge detector, which was used for generating the edge-induced weights, the smoothing σ was set to one and 0.05 for the original and down-sampled data, respectively. As for the gradient-induced weights, the parameters (σ, λ) were selected as (0.5, 0.05) and (0.5, 0.007) for the original and down-sampled data, respectively.^9

^8 This γ was used only to generate the values in Table III.
^9 The images are smoothed slightly with a Gaussian kernel before the gradient is calculated.

TABLE III. Result of comparing the fused results with the ground truth on the down-sampled data. Bold denotes the best result for the data set; if this is not one of the methods derived in (11), the best of these is denoted by italics. See Table I for a description of the applied methods.

We have chosen five different metrics for evaluating the down-sampled experiments, as reported in Table III. Many of the popular metrics are further developments of each other, which, in part, explains the strong correlation among the best performing
methods. None of these metrics, however, encompasses the ground truth of what a good fusion is; thus, visual inspection also needs to be taken into account.^10 The metrics used, which are among the most popular in the literature, are as follows (the boldface acronym indicates the entry in Table III).
1) MSE is the mean square error on all bands. It takes on
values between zero and one with zero being the best
value.
2) CC is the mean cross correlation between bands. It takes
on values between zero and one with one being the best
value.
3) WB is the mean Wang and Bovik measure between
bands, cf., [27]. It takes on values between −1 and one
with one being the best value.
4) SSIM is a quality metric which measures the degradation of structural information between two images. This metric combines three independent components of the images: luminance, contrast, and structure. The value one is achieved only if the compared images are identical; the lowest value is zero if the images show no similarity at all.
5) Q4 is an extension of the Wang and Bovik measure to
multiple dimensions, giving a combined metric, cf., [28].
It takes on values between zero and one with one being
the best value.
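As an illustration, the first three metrics can be sketched as follows for a pair of single-band images. The function names are ours, and the Wang and Bovik index is shown in its global form; [27] computes it over small sliding windows and averages the results, so this is only a simplified sketch.

```python
import numpy as np

def mse(x, y):
    # Mean square error; zero is the best value for images scaled to [0, 1].
    return float(np.mean((x - y) ** 2))

def cc(x, y):
    # Cross correlation (Pearson) between two bands; one is the best value.
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm ** 2).sum() * (ym ** 2).sum()))

def wang_bovik(x, y):
    # Universal image quality index of Wang and Bovik [27], global form:
    # Q = 4 * sigma_xy * mx * my / ((sigma_x^2 + sigma_y^2) * (mx^2 + my^2)).
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2)))
```

All three reach their best value when the two images are identical, which makes the sketch easy to sanity-check.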
In Table III, most of the metrics (MSE, CC, WB, and Q4) show significant numerical differences between the individual methods applied on the different data sets. However, that is not the case for the SSIM metric, which gives similar results for all the methods on all the data sets.
As a general observation, the proposed methods work well. The gradient-induced weight method deserves particular note, as it gives good visual results and also performs well in the down-sampled experiment. In comparison, the mean-corrected IHS method performs excellently in the down-sampled experiment but gives a result that is too blocky in the visual comparison. The Brovey and IHS methods give a very pleasing visual result but perform poorly in the down-sampled experiment and are clearly not spectrally consistent.
Comparing the edge-induced and gradient-induced weighting schemes visually, the edge-induced scheme gives an oversegmented image, whereas the gradient-induced scheme gives the best visual result, although the latter image is a bit blurry. As mentioned in Section V, we have presented only two general approaches, and the search for the best weighting scheme is still ongoing. In this light, a mixture of the edge-induced and gradient-induced schemes seems well worth investigating.
The fact that the mean-corrected IHS method gives blocky images and the gradient-induced weight method gives a somewhat smoothed image is well in line with our experience: with the proposed method, if spectral consistency is sought, either a blocking effect is endured, as discussed in Section VI, or a slightly blurred image is obtained because the blockiness is removed by smoothing. This can be explained by the panchromatic channel containing information that is not present in the RGB channels, cf. Fig. 1.

10 Note, for example, that the mean-corrected IHS method performs best when the metrics are used for the comparison in our experiments, cf. Table III, but its visual quality leaves a lot to be desired. This motivates further investigation into image metrics, which is beyond the scope of this paper.
As a note for future work, neither our initial solution (also termed the "no weight" solution) nor the mean-corrected IHS method uses a spatial prior, but both are spectrally consistent.
Comparing the two, it is seen that the mean-corrected IHS
method clearly outperforms the former in terms of the quality
measures. This indicates that Fij in (10) could possibly be
replaced with the result of the mean-corrected IHS method. On
the other hand, the IHS method is less related to the underlying
imaging physics. Also, the IHS method, in some sense, works in
the one-norm, see (14), instead of the induced two-norm. Using
different norms also seems worth investigating, although these
are also less tied to the imaging physics.
VIII. CONCLUSION
A framework has been presented for pixel-level fusion of
satellite images, which is derived by considering the underlying
imaging physics. This, among other characteristics, ensures
spectral consistency by design. We have also argued that the
satellite image fusion problem is inherently ill posed, and as
such, our framework explicitly states how regularization should
be performed to address that. Several schemes for regularization have been presented in the proposed framework. These
schemes have been experimentally tested on several data sets.
For all these data sets, the methods gave good performance.
The schemes also held their ground when compared with other
popular methods for satellite image fusion. Also, in this paper,
the IHS method was revisited in order to gain additional insight
into the implications of spectral consistency for image fusion
methods. The results obtained with the proposed framework are
promising, and further research on the framework is ongoing.
APPENDIX
NUMERICAL SOLUTION
It is noted that (11) is a least-squares (LS) system, a property which is retained after the reparameterization in (12) because the latter is just an affine mapping. In principle, the solution can be obtained by solving this LS system via the usual methods, e.g., the normal equations [30]. However, the dimension of the solution will often be very large: it equals the number of high-resolution pixels times the number of bands. As an example, the IKONOS data set requires 1,873,920 variables with just three bands, so direct LS methods become numerically prohibitive, and something else needs to be considered, e.g., exploiting the sparsity of the problem. It is noted that the LS property of (11) ensures a nice and convex error surface with only one well-defined optimum.11

11 Since the problem is regularized properly, the optimum is unique. Otherwise, the set of optima would be a linear subspace.
AANÆS et al.: MODEL-BASED SATELLITE IMAGE FUSION
Our solution solves (11) iteratively for all the high-resolution pixels corresponding to a low-resolution pixel, keeping all other pixels constant in the neighborhood term. The blocks were updated in a checkerboard fashion [31], [32]. The process was run until convergence, i.e., it was stopped when only very small changes were observed in the solution.
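The checkerboard scheduling described above can be sketched as follows. Here, `solve_block` is a hypothetical stand-in for the small per-block LS subproblem of (11); the point of the sketch is the update order (all "white" blocks, then all "black" blocks, repeated until the solution barely moves), not the per-block solve itself.

```python
import numpy as np

def checkerboard_sweeps(blocks_shape, solve_block, tol=1e-6, max_sweeps=100):
    """Block-coordinate descent over a grid of pixel blocks, updated in a
    checkerboard pattern: all blocks with i + j even, then all with i + j odd.
    `solve_block(i, j)` (a hypothetical interface) updates one block in place
    and returns the magnitude of the change it made. Returns the number of
    sweeps performed."""
    rows, cols = blocks_shape
    for sweep in range(max_sweeps):
        change = 0.0
        for parity in (0, 1):  # "white" blocks first, then "black" blocks
            for i in range(rows):
                for j in range(cols):
                    if (i + j) % 2 == parity:
                        change = max(change, solve_block(i, j))
        if change < tol:  # converged: the solution barely moves
            return sweep + 1
    return max_sweeps
```

Because neighboring blocks have opposite colors, every block in one half-sweep is updated against fixed neighbors, which is what makes this scheduling well behaved for neighborhood-regularized problems [31], [32].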
ACKNOWLEDGMENT
The authors would like to thank A. Vésteinsson for spurring this work via his master's thesis [29] and the reviewers whose comments greatly improved this paper.
REFERENCES
[1] T. Carper, T. Lillesand, and R. Kiefer, "The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data," Photogramm. Eng. Remote Sens., vol. 56, no. 4, pp. 459–467, 1990.
[2] P. Chavez and J. Bowell, “Comparison of the spectral information content of Landsat Thematic Mapper and SPOT for three different sites in
the Phoenix, Arizona region,” Photogramm. Eng. Remote Sens., vol. 54,
no. 12, pp. 1699–1708, 1988.
[3] K. Edwards and P. Davis, “The use of intensity-hue-saturation transformation for producing color shaded-relief images,” Photogramm. Eng.
Remote Sens., vol. 60, no. 11, pp. 1369–1373, 1994.
[4] T. Tu, P. Huang, C. Hung, and C. Chang, “A fast intensity-hue-saturation
fusion technique with spectral adjustment for IKONOS imagery,” IEEE
Geosci. Remote Sens. Lett., vol. 1, no. 4, pp. 309–312, Oct. 2004.
[5] A. Gillespie, A. Kahle, and R. Walker, “Color enhancement of highly
correlated images. II. Channel ratio and ‘chromaticity’ transformation techniques,” Remote Sens. Environ., vol. 22, no. 3, pp. 343–365,
Aug. 1987.
[6] P. Chavez, S. Sides, and J. Anderson, “Comparison of three different
methods to merge multiresolution and multispectral data: Landsat TM and
SPOT panchromatic,” Photogramm. Eng. Remote Sens., vol. 57, no. 3,
pp. 295–303, 1991.
[7] B. Aiazzi, L. Alparone, S. Baronti, and A. Garzelli, “Context-driven fusion of high spatial and spectral resolution images based on oversampled
multiresolution analysis,” IEEE Trans. Geosci. Remote Sens., vol. 40,
no. 10, pp. 2300–2312, Oct. 2002.
[8] J. Liu, “Smoothing filter-based intensity modulation: A spectral preserve
image fusion technique for improving spatial details,” Int. J. Remote Sens.,
vol. 21, no. 18, pp. 3461–3472, Dec. 2000.
[9] J. Nunez, X. Otazu, O. Fors, A. Prades, V. Pala, and R. Arbiol,
“Multiresolution-based image fusion with additive wavelet decomposition,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 3, pp. 1204–1211,
May 1999.
[10] T. Ranchin and L. Wald, “Fusion of high spatial and spectral resolution
images: The ARSIS concept and its implementation,” Photogramm. Eng.
Remote Sens., vol. 66, no. 1, pp. 49–61, 2000.
[11] M. J. Canty, Image Analysis, Classification and Change Detection in
Remote Sensing: With Algorithms for ENVI/IDL. Boca Raton, FL: CRC
Press, 2006.
[12] GeoEye, Geoeye Imaging Products Ikonos. [Online]. Available:
http://www.geoeye.com/products/imagery/ikonos/
[13] M. Pedersen, Functional Analysis in Applied Mathematics and Engineering. Boca Raton, FL: CRC Press, 1999.
[14] T. Anderson, An Introduction to Multivariate Statistical Analysis, 2nd ed.
New York: Wiley, 1984.
[15] S. Li, Markov Random Field Modeling in Image Analysis, 2nd ed.
New York: Springer-Verlag, 2007.
[16] G. Winkler, Image Analysis, Random Fields and Markov Chain Monte
Carlo Methods: A Mathematical Introduction (Applications of Mathematics). New York: Springer-Verlag, 2003.
[17] R. Nishii, S. Kusanobu, and S. Tanaka, “Enhancement of low spatial
resolution image based on high resolution-bands,” IEEE Trans. Geosci.
Remote Sens., vol. 34, no. 5, pp. 1151–1158, Sep. 1996.
[18] J. Canny, “A computational approach to edge detection,” IEEE Trans.
Pattern Anal. Mach. Intell., vol. PAMI-8, no. 6, pp. 679–698, Nov. 1986.
[19] L. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based
noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268,
Nov. 1992.
[20] J. Sveinsson, H. Aanæs, and J. Benediktsson, “Smoothing of fused spectral consistent satellite images with TV-based edge detection,” in Proc.
IGARSS, 2007, pp. 318–321.
[21] J. Weickert, Anisotropic Diffusion in Image Processing, ser. ECMI Series.
Stuttgart, Germany: Teubner-Verlag, 1998.
[22] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic
diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 7,
pp. 629–639, Jul. 1990.
[23] J. Zhou, D. Civco, and J. A. Silander, “A wavelet transform method to
merge Landsat TM and SPOT panchromatic data,” Int. J. Remote Sens.,
vol. 19, no. 4, pp. 743–757, Mar. 1998.
[24] M. Choi, “A new intensity-hue-saturation fusion approach to image fusion
with a tradeoff parameter,” IEEE Trans. Geosci. Remote Sens., vol. 44,
no. 6, pp. 1672–1682, Jun. 2006.
[25] Quickbird Imagery Products. Product Guide, DigitalGlobe, Inc.,
Longmont, CO, 2006. Tech. Rep. 4.2.7. [Online]. Available:
http://www.digitalglobe.com/downloads/QuickBird Imagery ProductsProduct Guide.pdf
[26] J. Schmetz, P. Pili, S. Tjemkes, D. Just, J. Kerkmann, S. Rota, and
A. Ratier, “An introduction to Meteosat second generation (MSG),” Bull.
Amer. Meteorol. Soc., vol. 83, no. 7, pp. 977–992, Jul. 2002.
[27] Z. Wang and A. C. Bovik, “A universal image quality index,” IEEE Signal
Process. Lett., vol. 9, no. 3, pp. 81–84, Mar. 2002.
[28] L. Alparone, S. Baronti, A. Garzelli, and F. Nencini, "A global quality measurement of pan-sharpened multispectral imagery," IEEE Geosci. Remote Sens. Lett., vol. 1, no. 4, pp. 313–317, Oct. 2004.
[29] A. Vesteinsson, J. Sveinsson, J. Benediktsson, and H. Aanæs, “Spectral
consistent satellite image fusion: Using a high resolution panchromatic
and low resolution multi-spectral images,” in Proc. IGARSS, 2005, vol. 4,
pp. 2834–2837.
[30] Å. Björck, Numerical Methods for Least Squares Problems.
Philadelphia, PA: SIAM, 1996.
[31] J. Besag, “On the statistical analysis of dirty pictures,” J. R. Stat. Soc. B,
vol. 48, no. 3, pp. 259–302, 1986.
[32] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and
Bayesian restoration of images,” IEEE Trans. Pattern Anal. Mach. Intell.,
vol. PAMI-6, no. 6, pp. 721–741, Nov. 1984.
Henrik Aanæs received the M.Sc. degree in engineering and the Ph.D. degree from the Technical
University of Denmark, Lyngby, in 2000 and 2003,
respectively.
He is currently an Assistant Professor in computer
vision and image analysis with the Department of
Informatics and Mathematical Modelling, Technical
University of Denmark. His main research interests
are estimation of 3-D shape from 2-D images,
object recognition and tracking, statistical inference
from images, data fusion, and robust statistics.
Johannes R. Sveinsson (S’86–M’90–SM’02) received the B.S. degree from the University of
Iceland, Reykjavík, and the M.S. and Ph.D. degrees
from Queen’s University, Kingston, ON, Canada, all
in electrical engineering.
He is currently an Associate Professor with the
Department of Electrical and Computer Engineering, University of Iceland, where he was with the
Laboratory of Information Technology and Signal
Processing from 1981 to 1982 and, from November
1991 to 1998, the Engineering Research Institute and
the Department of Electrical and Computer Engineering as a Senior Member
of the research staff and a Lecturer, respectively. He was a Visiting Research
Student with the Imperial College of Science and Technology, London, U.K.,
from 1985 to 1986. At Queen’s University, he held teaching and research
assistantships. His current research interests are in systems and signal theory.
Dr. Sveinsson received the Queen’s Graduate Awards from Queen’s
University.
Allan Aasbjerg Nielsen received the M.Sc. and
Ph.D. degrees from the Technical University of
Denmark, Lyngby, in 1978 and 1994, respectively.
He is currently an Associate Professor with the
National Space Institute, Technical University of
Denmark, where he worked on energy conservation
in housing with the Thermal Insulation Laboratory
from 1978 to 1985, with the section for image
analysis in the Department of IMM from 1985 to
2001, as a Part-Time Acting Professor from 1995
to 2001, and with the section for geoinformatics
in the Department of IMM from 2001 to 2006. He was with the Danish
Defense Research Establishment from 1977 to 1978. Since 1985, he has been
working on several national and international projects on the development,
implementation, and application of statistical methods and remote sensing in
mineral exploration, mapping, geology, agriculture, environmental monitoring, oceanography, geodesy, and security funded by industries, the European
Union, the Danish International Development Agency, and the Danish National
Research Councils. He is a reviewer for several international journals and conferences.
Thomas Bøvith received the M.S. degree in engineering from the Technical University of Denmark,
Lyngby, in 2002, where he is currently working
toward the Ph.D. degree in remote sensing in the
National Space Institute, Technical University of
Denmark, which is in collaboration with the Danish
Meteorological Institute, Copenhagen.
His research interests include remote sensing of
precipitation, data fusion, image classification, and
scientific visualization.
Jón Atli Benediktsson (S’84–M’90–SM’99–F’04)
received the Cand.Sci. degree in electrical engineering from the University of Iceland, Reykjavík,
in 1984 and the M.S.E.E. and Ph.D. degrees from
Purdue University, West Lafayette, IN, in 1987 and
1990, respectively.
He is currently the Director of Academic Development and Innovation and a Professor with the Department of Electrical and Computer Engineering, University of Iceland. He has held visiting positions with
the Department of Information and Communication
Technology, University of Trento, Trento, Italy (2002–present); School of Computing and Information Systems, Kingston University, Kingston upon Thames,
U.K. (1999–2004); the Joint Research Center of the European Commission,
Ispra, Italy (1998); Technical University of Denmark, Lyngby (1998); and the
School of Electrical and Computer Engineering, Purdue University (1995). He
was a Fellow with the Australian Defense Force Academy, Canberra, A.C.T., in
August 1997. From 1999 to 2004, he was the Chairman of the energy company
Metan Ltd. His research interests are in remote sensing, pattern recognition,
neural networks, image processing, and signal processing, and he has published
extensively in those fields.
Dr. Benediktsson is the Editor of the IEEE TRANSACTIONS ON
GEOSCIENCE AND REMOTE SENSING (TGARS) and Associate Editor of the
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS. He was an Associate
Editor of TGARS from 1999 to 2002. He coedited (with Prof. D. A. Landgrebe) a special issue on data fusion of TGARS (May 1999) and (with P. Gamba and G. G. Wilkinson) a special issue on urban remote sensing from satellite (October 2003). In 2002, he was appointed as Vice President of Technical Activities in the Administrative Committee of the IEEE Geoscience and Remote Sensing Society (GRSS). From 1996 to 1999, he was the Chairman of the GRSS Technical Committee on Data Fusion and was elected to the Administrative Committee of the GRSS for the term 2000 to 2002. He was the founding Chairman
of the IEEE Iceland Section and served as its Chairman from 2000 to 2003. He
was the Chairman of the Quality Assurance Committee, University of Iceland
(2006–present), the Chairman of the University’s Science and Research Committee (1999–2005), a member of Iceland’s Science and Technology Council
(2003–2006), and a member of the Nordic Research Policy Council (2004). He
was a member of the North Atlantic Treaty Organization Advisory Panel of the
Physical and Engineering Science and Technology Subprogram (2002–2003).
He received the Stevan J. Kristof Award from Purdue University in 1991 as an
outstanding graduate student in remote sensing. In 1997, he was the recipient of
the Icelandic Research Council’s Outstanding Young Researcher Award, and in
2000, he was granted the IEEE Third Millennium Medal. In 2007, he received
the Outstanding Service Award from the IEEE Geoscience and Remote Sensing
Society. He is a member of Societas Scientiarum Islandica and Tau Beta Pi.