IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 12, DECEMBER 2011

An Image Fusion Approach Based on Markov Random Fields

Min Xu, Member, IEEE, Hao Chen, Member, IEEE, and Pramod K. Varshney, Fellow, IEEE

Abstract: Markov random field (MRF) models are powerful tools to model image characteristics accurately and have been successfully applied to a large number of image processing applications. This paper investigates the problem of fusion of remote sensing images, e.g., multispectral image fusion, based on MRF models and incorporates the contextual constraints via MRF models into the fusion model. Fusion algorithms under the maximum a posteriori criterion are developed to search for solutions. Our algorithm is applicable to both multiscale decomposition (MD)-based image fusion and non-MD-based image fusion. Experimental results are provided to demonstrate the improvement in fusion performance achieved by our algorithms.

Index Terms: Markov random field, multiresolution decomposition, multispectral image fusion.

I. INTRODUCTION

IMAGE FUSION is important in many image analysis tasks in which image data are acquired from multiple sources. The goal of image fusion is to combine relevant information from two or more source images into one single image such that the single image contains as much information from all the source images as possible. The source images involved in such applications can be taken at different times and/or using different sensors. As a result, some source images may contain certain occlusions, and source images from different sensors show different physical features. Thus, the fused image is expected to have a more accurate description of the scene and is, therefore, more useful for human visual or machine perception [1]. In remote sensing applications, there have been a few studies on fusing high-resolution panchromatic images and low-resolution multispectral images to improve the spatial resolution [2], [3]. In this paper, we focus on the fusion of remote sensing images having the same resolution, e.g., multispectral image fusion.

Manuscript received May 29, 2009; revised June 6, 2010 and February 22, 2011; accepted April 23, 2011. Date of publication July 28, 2011; date of current version November 23, 2011.
M. Xu was with the Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244 USA. She is now with Blue Highway, LLC, Syracuse, NY 13244-4100 USA (e-mail: mxu@bluehighway.com).
H. Chen is with the Department of Electrical and Computer Engineering, Boise State University, Boise, ID 83725-2075 USA (e-mail: haochen@boisestate.edu).
P. K. Varshney is with the Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244 USA (e-mail: varshney@syr.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2011.2158607

A multispectral band covers only a narrow spectral range [3], and different bands represent different aspects of the scene. Multispectral image fusion involves the fusion of several bands in order to improve spectral resolution.
The existing image fusion approaches can be classified into three categories: pixel-level, feature-level, and decision-level [4]. An overview of these image fusion approaches can be found in [4]. This paper is focused on the pixel-level fusion approach.
Before image fusion, an image registration algorithm usually
needs to be applied in order to align the source images [5].
In this paper, we assume that registered images are available
prior to fusion. A variety of image fusion algorithms have been
proposed for different applications [6]. The basic pixel-level
fusion rule includes two steps.
1) First, we need to determine whether a source image
contributes to the fused image for each pixel.
2) Second, the intensity of the pixel in the fused image is
obtained from all the contributing source images.
Among the pixel-level fusion rules, two traditional approaches are to average the pixel intensities from all the source images or to take the maximal pixel intensity among all the source images. The averaging approach is effective in removing Gaussian noise and increases the signal-to-noise ratio (SNR) but makes the image smoother and results in the loss of contrast information. The maximizing approach can produce the fused image at full contrast but is sensitive to sensor noise [7]. To overcome the limitations of the averaging and maximizing approaches, Sharma et al. [7] proposed a Bayesian image fusion approach and related it to local principal component analysis.
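As a point of reference, the two traditional rules amount to a few lines of array arithmetic. The following minimal NumPy sketch (function and variable names are ours, purely illustrative) implements both:

```python
import numpy as np

def fuse_average(sources):
    """Pixelwise mean: suppresses Gaussian noise but lowers contrast."""
    return np.mean(sources, axis=0)

def fuse_maximum(sources):
    """Pixelwise maximum: keeps full contrast but passes noise through."""
    return np.max(sources, axis=0)

# sources: an (N, rows, cols) stack of co-registered source images
sources = np.random.rand(3, 64, 64)
avg, mx = fuse_average(sources), fuse_maximum(sources)
```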
In recent years, multiscale decomposition (MD)-based techniques have been successfully applied to image fusion for different applications such as concealed weapon detection [8] and hyperspectral image fusion [9]. Different MD methods, including the pyramid transform and the discrete wavelet transform, have been applied to image fusion. The performance of these MD-based image fusion approaches is evaluated in [6] for a digital camera application. The MD-based image fusion approaches consist of three steps.
1) The source images are first decomposed into several scale
levels using a pyramid transform or a wavelet transform.
2) Fusion is then applied at each level of the source images.
3) Finally, we invert the transform to synthesize the fused
image.
The MD-based image fusion approach provides both spatial and frequency domain localization and achieves much better performance, but the use of the transform increases the computational complexity. Thus, depending on the application, one can choose whether or not to employ transforms on the images. For the MD-based fusion approaches, the basic fusion rule is applied to the MD representations of the images at each resolution level. For the non-MD-based fusion approach, the basic fusion rule is directly applied to the source images.
Generally, the main drawback of the pixel-level fusion rule is that the decision on whether a source image contributes to the fused image is made pixel by pixel and, therefore, may cause spatial distortion in the fused image, which affects further processing, such as classification and detection. The pixels in an image are spatially correlated: for a source image, if one of its pixels contributes to the fused image, its neighbors are also likely to contribute to the fused image. This implies that the decision making during the first step of the fusion process should exploit this spatial correlation. Thus, it is important to incorporate spatial correlation into the fusion model, and the use of such a fusion model is expected to improve fusion performance.
A straightforward approach to make use of spatial correlation is to use a window- or region-based method [7], [10]-[13]. The idea is to estimate the intensity of a fused pixel from those of the source images in a small window. Yang and Blum [11] assumed that the decision making of pixels within a small window is constant and developed an expectation-maximization algorithm that employs a Gaussian mixture image model to adaptively find the fusion result. Burt and Kolczynski [10] proposed a weighted average algorithm to estimate the fused image in a pyramid transform domain. The weights are measured based on a local energy or variance (called "salience") within a small window. Lozci et al. [12] modified the weighted average algorithm by incorporating a generalized Gaussian statistical model. Lallier and Farooq [13] designed a weighted average scheme for fusing IR and visual images in a surveillance scenario. In their algorithm, larger weights are assigned to either the warmer or cooler pixels for the IR image and to the pixels having larger local variance for the visual image. The aforementioned algorithms [10]-[13] are used in the MD-based fusion approach.
The theory of Markov random fields (MRFs) provides a basis for modeling contextual constraints in visual processing and interpretation [14]. MRF models have proved to be successful in a variety of image processing applications, including multiresolution fusion [15], change detection [16], edge detection [17], image denoising [14], [18], image restoration [19], [20], and image classification [21]. In the image fusion application, an MRF model has been used to model the images for the fusion of edge information [22]. Yang and Blum proposed a statistical model to describe the fusion process [1], [11], [23]. However, the application of MRF models to pixel-level fusion of images with the same resolution has not been considered. In this paper, we propose two fusion algorithms that incorporate the contextual constraints via MRF models into the fusion model. The first algorithm models the decision making at the first step of the fusion rule as an MRF, and the second algorithm models both the decision making and the true image as MRFs. Also, the first algorithm is applicable to both the MD-based fusion approach and the non-MD-based fusion approach, while the second algorithm is only applicable to the non-MD-based fusion approach.
This paper is organized as follows. In Section II, we formulate the image fusion problem based on a statistical model. Then, the MRF-based image fusion approach is presented in Section III. In Section IV, we compare our proposed fusion approach with other fusion approaches via several experiments. Finally, some concluding remarks are provided in Section V.

II. PROBLEM FORMULATION


Image fusion is essentially an estimation problem. The objective is to estimate the underlying scene, assuming that each source image contains a good view of only part of the scene [1]. Blum [1] has proposed a statistical model for the image fusion problem. Assume that there are N source images to fuse. Each source image can be modeled as

  y_i(r) = H_i(r)\, x(r) + w_i(r), \quad i = 1, \ldots, N    (1)

where r indicates the spatial coordinates of a pixel, y_i(r) is the intensity of the ith source image at r, x(r) is the intensity of the true scene at r to be estimated, w_i(r) is the noise, and H_i(r) is the sensor selectivity coefficient, taking on values from \Omega = \{q_1, q_2, \ldots\} representing the percentage of the true scene contributing to the ith source image [7]. In our work, we use \Omega = \{0, 1\}, which determines whether or not the true scene contributes to the ith source image [1]. In the following, for simplicity of notation, (r) is omitted.
Note that (1) represents the relationship between the source images and the true scene. According to this model, if the true scene contributes to the source image, the source image is modeled as the true scene plus Gaussian noise. If the true scene does not contribute to the source image, the source image is modeled as Gaussian noise. In practice, particularly in multiple-sensor applications and multifocus applications, this model has some limitations. The source images obtained from different sensors sense different aspects of the true scene, and the model may be a coarse approximation in this case.
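To make model (1) concrete, the following minimal NumPy sketch simulates source images from a known scene and binary selectivity masks; the masks here are i.i.d. for brevity, whereas Experiment 1 in Section IV draws them from a Gibbs sampler, and all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sources(x, masks, sigma):
    """Simulate source images via y_i = H_i x + w_i, model (1).

    x     : (rows, cols) true scene
    masks : (N, rows, cols) binary sensor-selectivity fields H_i
    sigma : standard deviation of the additive Gaussian noise
    """
    noise = rng.normal(0.0, sigma, size=masks.shape)
    return masks * x[None, :, :] + noise

x = rng.random((64, 64))
masks = (rng.random((3, 64, 64)) > 0.3).astype(float)  # 1 = scene visible
sources = make_sources(x, masks, sigma=0.05)
```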
The image fusion problem essentially involves the estimation of H_i and x. The two traditional algorithms, namely, the averaging and maximizing algorithms, can also be expressed using this model. For the averaging algorithm, H_i = 1 for all i. For the maximizing algorithm, H_i = 1 for i = \arg\max_i \{y_i\} and H_i = 0 otherwise.
When H is given, the pixel intensity of the fused image can be easily calculated by a least squares (LS) technique as [24]

  \hat{x} = (H^T H)^{-1} H^T Y    (2)

where H denotes the vector [H_1, H_2, \ldots, H_N]^T and Y denotes the vector [y_1, y_2, \ldots, y_N]^T.
In practice, we only have the source images available without any prior information, and the coefficient H is usually unknown. According to the LS technique, from the set of all possible values that the coefficient H can take, the one which produces the highest energy should be selected, i.e.,

  \hat{H} = \arg\min_H \left[ (Y - H\hat{x})^T (Y - H\hat{x}) \right]
          = \arg\min_H \left[ Y^T Y - (Y^T H)(H^T H)^{-1}(H^T Y) \right]
          = \arg\max_H \left[ (Y^T H)(H^T H)^{-1}(H^T Y) \right].    (3)

Note that, since H_i \in \{0, 1\}, H has 2^N possible values. Once \hat{H} is available, the intensity of the fused image at pixel r, i.e., \hat{x}, is obtained by an LS approach [24], which is

  \hat{x} = (\hat{H}^T \hat{H})^{-1} \hat{H}^T Y.    (4)
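Because H_i \in \{0, 1\}, the search in (3) and the update in (4) reduce to simple per-pixel arithmetic: the energy becomes (sum of selected intensities)^2 divided by the number of selected sources, and (4) becomes their mean. A minimal NumPy sketch of this pixelwise LS fusion follows (names are ours; it enumerates all nonzero configurations of H):

```python
import numpy as np
from itertools import product

def ls_fuse(Y):
    """Pixelwise LS fusion, eqs. (3)-(4), for binary coefficients.

    Y : (N, rows, cols) stack of source images.
    Returns the fused image and the selected binary coefficients.
    """
    N = Y.shape[0]
    best = np.full(Y.shape[1:], -np.inf)
    H_hat = np.zeros_like(Y)
    x_hat = np.zeros(Y.shape[1:])
    for h in product((0.0, 1.0), repeat=N):
        h = np.array(h)
        if h.sum() == 0:                         # H^T H singular: skip
            continue
        hty = np.tensordot(h, Y, axes=1)         # H^T Y at every pixel
        energy = hty ** 2 / h.sum()              # eq. (3) objective
        better = energy > best
        best[better] = energy[better]
        x_hat[better] = (hty / h.sum())[better]  # eq. (4): mean of selected
        H_hat[:, better] = h[:, None]
    return x_hat, H_hat
```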

In the aforementioned model, both the coefficient H and the intensity of the fused image x are estimated pixel by pixel, and therefore, the estimation is very sensitive to sensor noise. Furthermore, since the estimation of the fused image is based on the estimation of the coefficients, the estimation of the coefficient H plays an important role in the fusion process. The estimation accuracy of the coefficients directly influences the estimation of the fused image. Since the coefficient H of a pixel is likely to be similar to the coefficients corresponding to other pixels in its neighborhood due to spatial correlation, we can get better estimates of H by utilizing spatial correlation. A straightforward and simple approach is to assume that the coefficients of pixels within a small window are constant and then select the coefficient which produces the highest energy of the pixels within that window. This strategy has been used in [11]. However, the goal of the LS approach is to minimize the data error \|Y - \hat{Y}\|^2, which does not necessarily lead to a small estimation error for either H or x. A popular strategy for improving the estimation error of LS is to incorporate prior information on H or x [25]. Motivated by this fact and the fact that the MRF model in the form of prior Gibbs distributions is currently the most effective way to describe the local behavior of both the intensity field and the discontinuity field [20], we propose to employ an MRF model to estimate the coefficients. This is expected to improve the estimation accuracy of the coefficients H, thereby leading to improved fusion results. In the next section, we develop our image fusion approaches based on MRF modeling.
III. PROPOSED ALGORITHMS
The image fusion problem is to estimate the true scene x. However, before the estimation of x, we need to obtain an accurate estimate of H, which represents the decision whether the true scene is present in the source image, i.e., whether the source image contributes to the fused image. In the previous section, we considered the estimation of x and H at the pixel level. In this section, we propose two MRF-based image fusion approaches, which design the estimator by incorporating the spatial correlation through the prior probability density function (pdf) of H and x. The intensity of a fused pixel then depends not only on the intensities of the pixel in the source images but also on those of the neighboring pixels. In the first algorithm, only the coefficients are modeled using an MRF; it is denoted as MRF_H. In the second algorithm, both the coefficients and the fused image are modeled using MRFs; it is denoted as MRF_HX. Some notation used in the remainder of this paper is listed as follows (a sketch of a corresponding array layout appears after the list):
X: the whole true scene (fused image);
H_i: the coefficients of the ith source image;
Y_i: the intensities of the ith source image;
H: the coefficients of the source images, where H(r, i) = H_i(r);
Y: the intensities of the source images, where Y(r, i) = Y_i(r).
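As a concrete reading of this notation, one possible array layout is the following (a hypothetical convention of ours, not prescribed by the paper):

```python
import numpy as np

N, rows, cols = 3, 150, 150        # e.g., three 150 x 150 source images
Y = np.empty((N, rows, cols))      # source intensities, Y[i] = Y_i
H = np.empty((N, rows, cols))      # binary coefficients, H[i] = H_i
X = np.empty((rows, cols))         # true scene / fused image
```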
The maximum a posteriori (MAP) criterion is used to find the optimal solution for the estimation problem. The estimation procedure based on the MAP criterion chooses the most likely values of the coefficients and the fused image among all possible values given the observed source images. The resulting probability of error is minimal among all detectors [24]. This criterion is expressed as

  \{\hat{H}, \hat{X}\} = \arg\max_{H, X} \left[ P(X, H \mid Y) \right].    (5)

However, due to high computational complexity, it is difficult to directly obtain the final solution. Note that P(X, H \mid Y) = P(X \mid H, Y) P(H \mid Y) = P(H \mid X, Y) P(X \mid Y). Thus, a suboptimal method is adopted in this paper. We decompose our problem (5) into two subproblems and iteratively solve them:

  \hat{H}^{n+1} = \arg\max_H P(H \mid Y, \hat{X}^n)
  \hat{X}^{n+1} = \arg\max_X P(X \mid Y, \hat{H}^n)    (6)

where \hat{H}^n denotes the nth update of the estimate of H and \hat{X}^n denotes the nth update of the estimate of X. It is easy to show that we can iteratively update the estimates of H and X such that

  P(\hat{X}^{n+1}, \hat{H}^{n+1} \mid Y) \ge P(\hat{X}^n, \hat{H}^n \mid Y)    (7)
and, therefore, achieve the optimum at the end.
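A minimal skeleton of this alternating scheme is sketched below; the two maximizers are left as callables because their concrete forms (Gibbs sampling for H; (4) or the closed form of Section IV-A for X) are developed in the following subsections. All names are illustrative:

```python
import numpy as np

def alternating_map(Y, update_H, update_X, n_iter=20, tol=1e-4):
    """Skeleton of the two-subproblem iteration in eq. (6).

    update_H(Y, X) should return arg max_H P(H | Y, X);
    update_X(Y, H) should return arg max_X P(X | Y, H).
    """
    X = Y.mean(axis=0)            # crude initial fused image
    H = np.ones_like(Y)           # start with every source contributing
    for _ in range(n_iter):
        H = update_H(Y, X)
        X_new = update_X(Y, H)
        if np.max(np.abs(X_new - X)) < tol:   # consecutive updates close
            return X_new, H
        X = X_new
    return X, H
```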


A. Fusion Approach: MRF Modeling for Coefficients H (MRF_H)
Motivated by the fact that the coefficients of the source images exhibit spatial correlation, we model the coefficient H by an MRF model. Let S be the set of sites in an image and \Lambda = \{0, 1, \ldots, L-1\} be the phase space. We assume that the coefficients H(S) \in \Lambda^S follow MRF properties with the Gibbs potential U_c(H). The marginal pdf of H is written as [14]

  P_H(H) = \frac{1}{Z_H} \exp\left( -\frac{1}{T} \sum_{c \subset S} U_c(H) \right)    (8)

where Z_H is a normalization constant given by

  Z_H = \sum_{H \in \Lambda^S} \exp\left( -\frac{1}{T} \sum_{c \subset S} U_c(H) \right).    (9)

The estimate of H is given by

  \hat{H}^{n+1} = \arg\max_H P(H \mid Y, \hat{X}^n).    (10)

We apply Bayes' rule, which provides the following result:

  P(H \mid Y, \hat{X}^n) = \frac{P(H, Y \mid \hat{X}^n)}{P(Y \mid \hat{X}^n)} = \frac{P(Y \mid H, \hat{X}^n)\, P(H)}{P(Y \mid \hat{X}^n)}    (11)

and because P(Y \mid \hat{X}^n) is constant for all values of H, (10) can be rewritten as

  \hat{H}^{n+1} = \arg\max_H \left[ P(Y \mid H, \hat{X}^n)\, P(H) \right].    (12)

In the model given in (1), the noise of each source pixel is assumed to be independent and identically distributed (i.i.d.) Gaussian noise with zero mean and variance \sigma^2, and therefore, the conditional pdf of the source images Y given H and \hat{X}^n is given by

  P(Y \mid H, \hat{X}^n) = \frac{\exp\left( -\sum_i (Y_i - H_i \hat{X}^n)^T (Y_i - H_i \hat{X}^n) / (2\sigma^2) \right)}{(2\pi\sigma^2)^{M/2}}    (13)

where M is the total number of pixels.
Then, substituting (8) and (13) into (12) and taking the constant term out, we obtain

  \hat{H}^{n+1} = \arg\max_H \left[ \exp\left( -E(H) \right) \right]    (14)

where

  E(H) = \sum_i \frac{(Y_i - H_i \hat{X}^n)^T (Y_i - H_i \hat{X}^n)}{2\sigma^2} + \sum_{c \subset S} U_c(H).    (15)

From this result, we observe that the maximization in (14) is equivalent to the minimization of E(H). Thus, the optimal estimate of H can be expressed as

  \hat{H}^{n+1} = \arg\min_H E(H).    (16)

Note that, for two source images of size 300 × 300, H has a total of 4^{90000} possible configurations. Thus, in practice, due to the large search space of H, the solution of (16) cannot be obtained directly, and therefore, the simulated annealing (SA) algorithm [26] is applied here to search for the optimal solution of (16). The solution for the second subproblem, i.e., the estimate of X, is obtained by (4). The iterative algorithm is described in terms of the following steps.
1) Start with an initial estimate of H and X. Estimate the initial parameters (the noise variance and some parameters in the pdf of H) and set the initial temperature.
2) At each iteration, obtain a new estimate of H based on the Gibbs pdf given in (8) with the Gibbs potential E(H) using a Gibbs sampling procedure [14].
3) Update the fused image using (4).
4) Reduce the temperature using a predetermined schedule and repeat 2) and 3) until convergence.
Here, the temperature is a parameter which is used to control the randomness of the coefficient generator, and we consider that the algorithm has converged when two consecutive updates are within a tolerance of each other. In steps 2) and 3), we visit each pixel from left to right and from top to bottom when we update the coefficients and the fused image. Eventually, the resulting coefficient field converges to the solution of (16), and the fused image is simultaneously obtained. Compared with the maximizing approach, the averaging approach, and the LS approach, the solution of this algorithm is obtained through an optimization algorithm, and therefore, it increases the computation time. However, the MRF modeling of the coefficients describes the fusion process better, which improves the fusion performance.
In recent years, other optimization algorithms such as the graph-cut-based approach [27] have become very popular, and they can find the solution in a more computationally efficient manner than the SA optimization algorithm. The use of optimization algorithms such as the graph-cut-based optimization approach will be investigated in the future to improve the efficiency of the fusion algorithm.
Assume that we have N source images of size M. Since the coefficient H_i(r) of the ith source image at pixel r is taken from \{0, 1\}, each component of H belongs to a set of size 2^N. Thus, during each iteration, the algorithm to estimate the coefficients has computational complexity O(M \cdot 2^N).
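The following NumPy sketch outlines steps 1)-4). It substitutes a generic Ising-style pair potential for the autologistic model of Section IV-A and treats the noise variance, the smoothness weight, and the cooling schedule as given, so it illustrates the procedure under those assumptions rather than reproducing the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep_H(Y, X, H, sigma2, beta, T):
    """One Gibbs-sampling sweep over the coefficient field H (step 2).

    Site energy follows eq. (15): a Gaussian data term plus a pairwise
    smoothness prior; beta plays the role of the clique parameters and
    T is the annealing temperature.
    """
    N, rows, cols = Y.shape
    for i in range(N):
        for r in range(rows):          # raster scan: left-right, top-bottom
            for c in range(cols):
                patch = H[i, max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
                cur = H[i, r, c]
                energies = []
                for h in (0.0, 1.0):
                    data = (Y[i, r, c] - h * X[r, c]) ** 2 / (2.0 * sigma2)
                    agree = np.sum(patch == h) - (cur == h)  # exclude center
                    energies.append(data - beta * agree)
                e = np.asarray(energies)
                prob = np.exp(-(e - e.min()) / T)
                H[i, r, c] = rng.choice([0.0, 1.0], p=prob / prob.sum())
    return H

def mrf_h_fusion(Y, sigma2, beta=0.8, T0=4.0, cooling=0.9, n_iter=30):
    """MRF_H outline: anneal H by Gibbs sweeps, refit X by LS, eq. (4)."""
    X = Y.mean(axis=0)                 # initial estimate of the fused image
    H = np.ones_like(Y)
    T = T0
    for _ in range(n_iter):
        H = gibbs_sweep_H(Y, X, H, sigma2, beta, T)
        den = np.maximum(H.sum(axis=0), 1.0)   # guard pixels with all-zero H
        X = (H * Y).sum(axis=0) / den          # LS update, eq. (4)
        T *= cooling                           # predetermined cooling schedule
    return X, H
```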

B. Fusion Approach: MRF Modeling for Coefficients H and Fused Image X (MRF_HX)
In the aforementioned algorithm, we assumed that the coefficients H follow an MRF model. Then, the intensity of the fused pixel is estimated by an LS technique. In practice, the fused image also has the property of high spatial correlation. Thus, one may assume that the fused image also follows an MRF model with a Gibbs potential V_c(X). Hence, the marginal pdf of X is written as [14]

  P_X(X) = \frac{1}{Z_X} \exp\left( -\frac{1}{T} \sum_{c \subset S} V_c(X) \right)    (17)

where Z_X is a normalization constant given by

  Z_X = \sum_X \exp\left( -\frac{1}{T} \sum_{c \subset S} V_c(X) \right).    (18)

Under this assumption, the MAP criterion to obtain the optimal X is written as

  \hat{X}^{n+1} = \arg\max_X P(X \mid Y, \hat{H}^n).    (19)

In a similar manner as for the estimation of H, (19) reduces to

  \hat{X}^{n+1} = \arg\min_X \Phi(X)    (20)
where

  \Phi(X) = \frac{1}{2\sigma^2} \sum_i (Y_i - \hat{H}_i^n X)^T (Y_i - \hat{H}_i^n X) + \sum_{c \subset S} V_c(X).    (21)

In the next section, we show that the solution of (20) can be easily obtained when a Gaussian MRF is used to model the fused image.
Different from the first algorithm, MRF_H, where the fused image X is updated by an LS technique, this new algorithm uses a MAP solution to update the fused image X using (20). The whole procedure is described as follows.
1) Start with an initial estimate of H and X. Estimate the initial parameters (the noise variance and some parameters in the pdfs of H and X) and set the initial temperature.
2) At each iteration, obtain a new estimate of H based on its Gibbs pdf given in (8) with the Gibbs potential E(H) using a Gibbs sampling procedure [14].
3) Update the fused image using the solution of (20).
4) Reduce the temperature using a predetermined schedule and repeat 2) and 3) until convergence.
In summary, this is an iterative estimation process to estimate both H and X, which increases the computational time. The MRF modeling of both the coefficients and the fused image represents high-resolution images more accurately and therefore produces better results for the fusion of high-resolution source images. Moreover, the initial estimates of H and X are important. As the initial estimates of H and X get closer to the optimal values, the algorithm converges faster; poor initial estimates may lead to local maxima of the a posteriori probability. Although the LS approach given by (3) and (4) does not take spatial correlation into account, it is a simple and effective fusion approach, and we use it to obtain our initial estimates of H and X. According to our experiments, this initialization yields good fusion performance.
C. Extension to the MD-Based Fusion Framework
Here, the applicability of the two proposed algorithms to the MD-based fusion approach is discussed. For the non-MD-based fusion approach, Y and H in the data model (1) denote the intensities of the source images and their corresponding coefficients, and X denotes the intensity of the fused image. This data model (1) is also applicable to the data after the MD process [1], [7]. Thus, if the MD-based fusion approach is employed, we assume that the MD transform is applied to the source images, Y refers to the values of the MD representations of the source images at some resolution level, H refers to the corresponding coefficients, and X refers to the values of the MD representations of the fused image at the same resolution level. Thus, instead of directly applying the image fusion model (1) to the source images, one can perform the MD on the source images and then apply the image fusion model (1) to the MD representations at each resolution level. By using multiresolution transforms such as the discrete wavelet transform, the source image is decomposed into different frequency bands, which makes the model (1) fit the MD representations more closely. However, the use of multiresolution transforms increases the complexity of the algorithm. It is noted that, since the multiresolution transform may result in the loss of locality in MRF models for the MD image representations [28], i.e., the local MRF property may not hold for X, it is not advisable to use the algorithm MRF_HX with the MD-based fusion approach.
The coefficient H for each pixel represents whether the true scene X (fused image) contributes to the source image. Pixels in a large area may all contribute to the true scene; however, all the pixels in the area may not have the same intensities. Thus, the coefficient H exhibits spatial correlation over a larger area than the intensity of the true scene X. After the MD transformation, the coefficients H may still exhibit spatial correlation while the MRF property may not hold for X. Thus, only the algorithm MRF_H is applied in the MD-based fusion approach. In the next section, some examples are provided for illustration.

Fig. 1. Cliques considered in the eight-neighborhood system.
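A minimal sketch of this MD wrapper, assuming the PyWavelets package for the one-level 2-D Haar transform and an arbitrary per-band fusion rule, is given below (all names are ours):

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def md_fuse(sources, fuse_band):
    """One-level 2-D Haar MD fusion wrapper for Section III-C.

    sources   : list of co-registered 2-D source images
    fuse_band : any rule mapping a stack of same-level subbands to one
                fused subband (e.g., the MRF_H rule, or a plain average).
    """
    coeffs = [pywt.dwt2(s, 'haar') for s in sources]      # (cA, (cH, cV, cD))
    cA = fuse_band(np.stack([c[0] for c in coeffs]))
    details = tuple(
        fuse_band(np.stack([c[1][k] for c in coeffs])) for k in range(3)
    )
    return pywt.idwt2((cA, details), 'haar')              # invert the transform

# e.g.: fused = md_fuse([src1, src2, src3], lambda band: band.mean(axis=0))
```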
IV. EXPERIMENTAL RESULTS

A. Choice of MRF Models
We provide three examples to evaluate the fusion performance of our fusion algorithms. For the two MRF-based fusion algorithms, MRF_H and MRF_HX, used in the following experiments, we consider five clique types in the eight-neighborhood system: C_1, C_2, C_3, C_4, C_5, associated with the singleton, vertical pairs, horizontal pairs, left-diagonal pairs, and right-diagonal pairs, respectively. They are shown in Fig. 1. The Gibbs energy function of the coefficients of a source image is defined by an autologistic function, given by [14]

  \sum_{c \subset S} U_c(H) = a^T L    (22)

where a = [a_2, \ldots, a_5] is the parameter vector of the coefficient H and

  L = \left[ \sum_{(s,t) \in C_2} I[H(s), H(t)], \; \ldots, \; \sum_{(s,t) \in C_5} I[H(s), H(t)] \right]    (23)

is the coefficient potential vector associated with the clique types. Here, the function I is defined as

  I(a, b) = -1 \text{ if } a = b; \quad I(a, b) = 1 \text{ otherwise}.    (24)

Due to its simplicity, this class of MRF models has been extensively used [14], [29]-[31] for modeling a wide variety of images, both as region formation models and as texture models.
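A small NumPy sketch of this energy computation follows; it assumes the sign convention of (24) (agreeing pairs contribute -1) and omits the singleton clique, since a = [a_2, ..., a_5] carries no singleton weight:

```python
import numpy as np

def autologistic_energy(Hi, a):
    """Gibbs energy of one coefficient mask, eqs. (22)-(24).

    Hi : 2-D binary mask
    a  : (a2, a3, a4, a5) weights for the vertical, horizontal, and
         two diagonal pair-clique types.
    """
    def I(u, v):                     # eq. (24): -1 if equal, +1 otherwise
        return np.where(u == v, -1.0, 1.0)

    pairs = [
        I(Hi[1:, :], Hi[:-1, :]).sum(),     # vertical pairs   (C2)
        I(Hi[:, 1:], Hi[:, :-1]).sum(),     # horizontal pairs (C3)
        I(Hi[1:, 1:], Hi[:-1, :-1]).sum(),  # one diagonal     (C4)
        I(Hi[1:, :-1], Hi[:-1, 1:]).sum(),  # other diagonal   (C5)
    ]
    return float(np.dot(a, pairs))          # a^T L, eq. (22)
```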

Furthermore, we use a Gaussian MRF model to represent the Gibbs energy function of the fused image, which is given by

  V_c(X) = \frac{1}{2} \left( X(s) - p^T G \right)^2    (25)

where p = [p_2, \ldots, p_5] is the parameter vector of the image model and its potential vector G is defined as

  G = \left[ \sum_{(s,t) \in C_2} X(t), \; \ldots, \; \sum_{(s,t) \in C_5} X(t) \right].    (26)

For simplicity, we choose p = [0.333, 0.333, 0.1667, 0.1667] in our experiments. The Gaussian MRF model is widely used for modeling image texture [14]. Under this model, the analytical solution of (20) can be easily derived from

  \frac{\partial \Phi(X)}{\partial X} = 0.    (27)

Substituting (13) and (25) into (27) yields

  \hat{X}^{n+1} = \left( 1 + \sum_i \hat{H}_i^{nT} \hat{H}_i^n / \sigma^2 \right)^{-1} \left( p^T G + \sum_i \hat{H}_i^{nT} Y_i / \sigma^2 \right).    (28)
The estimate given by (28) for one pixel involves vector multiplication, which has computational complexity O(N). Thus, the estimation of the whole fused image has computational complexity O(MN). Because the estimation of the coefficients also has computational complexity O(M \cdot 2^N) at each iteration, the overall complexity is O(M(2^N + N)) \approx O(M \cdot 2^N).

Fig. 2. Synthetic source images in Experiment 1, size 150 × 150. (a) Source 1. (b) Source 2. (c) Source 3.

B. Parameter Estimation
Modeling the Markov pdf parametrically involves the data-driven optimal estimation of the parameters associated with the potential functions V_c. The model parameters must be estimated for each data set as part of the image processing algorithm. In our algorithms, the noise variance \sigma^2 in (13) and the parameter a in the coefficient MRF pdf in (22) are unknown. Thus, we need to estimate these parameters in our algorithms.
Because we assume that the noise in the fusion model is Gaussian, it is straightforward to estimate the noise variance by the maximum likelihood (ML) criterion. It is given by

  \hat{\sigma}^2 = \arg\max_{\sigma^2} P(Y \mid H, X, \sigma^2) = \frac{1}{MN} \sum_i (Y_i - H_i X)^T (Y_i - H_i X).    (29)

The direct ML estimation of the parameters associated with the pdf of H is known to be a difficult problem [32]. The ML estimate of a is

  \hat{a} = \arg\max_a P(H, a) = \arg\min_a \left[ \sum_{c \subset S} V_c(H, a) + \ln Z_H \right].    (30)

The potential function V_c(H, a) can be simply computed. However, the normalization term Z_H involves a summation over all possible configurations of H, which is practically impossible to compute due to the large computation time. Note that, for two source images of size 300 × 300, H has a total of 4^{90000} possible configurations.
An alternative method for approximating the ML estimate is maximum pseudolikelihood (MPL) estimation, which was proposed by Besag [33]. The MPL estimation method is a suboptimal method, which is given by

  \hat{a} = \arg\max_a \prod_{s \in S} P(H(s), a) = \arg\min_a \sum_{s \in S} \left[ V_c(H(s), a) + \ln Z_{H(s)} \right].    (31)

Since the sites are not independent of each other, the resulting function is not the true likelihood function. However, Geman and Graffigne [34] later proved that the MPL estimate converges asymptotically to the true ML estimate. Thus, we use this method to estimate the parameters associated with the MRF pdf in our algorithm. To obtain the optimal solution that maximizes the pseudolikelihood, we adopt the Metropolis optimization method [31]. Because the parameter estimates do not change appreciably between successive iterations, we update the parameters after every ten iterations for H, as suggested in [26].
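The estimator (29) is a one-line computation; a minimal NumPy sketch (names are ours):

```python
import numpy as np

def estimate_sigma2(Y, H, X):
    """ML noise-variance estimate of eq. (29): the mean squared residual
    of the data model y_i = H_i x + w_i over all M*N source pixels."""
    resid = Y - H * X[None, :, :]
    return float(np.sum(resid ** 2) / resid.size)
```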
C. Application With Non-MD-Based Fusion
1) Experiment 1 (Synthetic Data): We first implement our two proposed fusion algorithms, MRF_H and MRF_HX, for the non-MD-based fusion approach. In this experiment, we generated three source images based on the model given in (1), as shown in Fig. 2. The image Campus of size 150 × 150 is used as our original image. For each source image, we generated the coefficient mask H_i by a Gibbs sampling generator. The value 0 of the coefficient represents occlusion in the source image, and the value 1 of the coefficient represents the true scene in the source image. Then, Gaussian noise was added to the source images prior to fusion. The maximizing approach, the averaging approach, the LS approach, the window-based approach, and our two proposed approaches were used to fuse the three source images. The window-based approach assumes that the coefficients of the source images within a 5 × 5 window are constant and selects the coefficients which produce the highest energy of all pixels within that window. The intensity of the fused image is then estimated by (4). The whole process was repeated at two SNR levels, SNR = 10 dB and SNR = 30 dB.

Fig. 3. MSE of the fusion result as the number of iterations increases for the MRF-based approaches with four- and eight-neighborhood systems. (a) MRF_H, SNR = 10 dB. (b) MRF_H, SNR = 30 dB. (c) MRF_HX, SNR = 10 dB. (d) MRF_HX, SNR = 30 dB.


In addition to visual inspection, we employed the mean square error (MSE) to evaluate the fusion performance in Experiment 1, in which a true reference scene is available. A smaller MSE usually indicates a better fusion performance. The solid lines in Fig. 3 show the MSE of the fusion results using the MRF approaches with respect to the ground truth as the number of iterations increases. Initially, the coefficient mask and the fused image contain much noise. As the number of iterations increases, the resulting coefficient mask and the fused image approach the ground truth and produce the smallest MSE. We found that our new fusion algorithms converge after around three iterations at an SNR level of 30 dB and after around ten iterations at an SNR level of 10 dB.
MRF modeling is employed to include the dependences of the decision making and/or pixel intensities of nearby pixels. The choice of neighborhood affects the effectiveness of MRF modeling: a larger neighborhood implies more spatial correlation in the image. In order to evaluate the fusion result for different choices of neighborhood, we also employed the four-neighborhood system, which only uses the three clique types C_1, C_2, and C_3 shown in Fig. 1 for MRF modeling, to compare with the fusion result using the eight-neighborhood system. The MSEs of the fusion results using both neighborhood systems for MRF modeling are shown in Fig. 3. Although the algorithms using both neighborhood systems eventually converge, it is observed that MRF modeling using the eight-neighborhood system makes the algorithm converge faster than the four-neighborhood system. Furthermore, the eight-neighborhood system produces a slightly smaller MSE than the four-neighborhood system with the algorithm MRF_H, while it produces a slightly larger MSE with the algorithm MRF_HX. This observation implies that the use of a larger neighborhood system does not necessarily improve the fusion accuracy and that the choice of neighborhood in the MRF models does not impact the final fusion result much; the main benefit of the added memory in the eight-neighborhood system is its faster convergence.
Fig. 2 shows the three source images at an SNR level of 30 dB, and Fig. 4 shows the fusion results produced by the six fusion approaches. It is observed that the fused image produced by the algorithm MRF_HX displays reduced image noise at the cost of smoother image texture and is the closest to the ground truth. Fig. 5 shows the estimated coefficients of the three source images using the six fusion approaches. The window-based approach removed the noise quite effectively but produced a mosaic coefficient mask. The two MRF algorithms outperform the other four approaches and demonstrate the ability to estimate the coefficients accurately. Table I gives a quantitative comparison by means of the MSE. We observe that the use of the spatial correlation property improves the fusion performance. Furthermore, the window-based approach and the algorithm MRF_H produce much smaller MSEs than the LS approach in the low-SNR case, while they do not improve the performance much in the high-SNR case. Our proposed fusion algorithm MRF_HX is observed to produce the smallest MSE in both the low-SNR and the high-SNR cases. This also indicates that the estimation accuracy of the coefficients plays a much more important role in the fusion process in the low-SNR case than in the high-SNR case.

Fig. 4. Fusion results and true scene in Experiment 1. (a) Averaging. (b) Maximizing. (c) LS. (d) Window. (e) MRF_H. (f) MRF_HX. (g) True scene.

TABLE I. MSE OF FUSION APPROACHES IN EXPERIMENT 1.
Table II gives the execution times of the six fusion approaches. These six approaches are implemented in Matlab 7.1 and executed on a computer with 4 GB of RAM and an Intel Core 2 Quad CPU at 2.40 GHz. The two MRF-based approaches require relatively longer computation times than the other fusion approaches. However, the computation time can easily be reduced by dividing the whole image into several subimages and then processing the data in parallel.

TABLE II. EXECUTION TIME OF FUSION APPROACHES IN EXPERIMENT 1 (IN SECONDS).

2) Experiment 2 (Real Data): In this experiment, we select three bands from a hyperspectral data set as the source images. The six fusion approaches are applied in this experiment to fuse the three source images. The three source images and the fusion results are shown in Figs. 6 and 7, respectively. Our model is a simplified model, and there is a certain mismatch between the model and the real image data set; e.g., the relationship between the source images and the true scene might not be explained simply by "contribute" or "not contribute," and the coefficient H_i can take on values other than \{0, 1\}. However, our algorithms still work quite well in practice. The results show that the use of the MRF model for both the coefficients and the fused image improves the fusion performance and enhances the image contrast. The fused images produced by the averaging approach and the window-based approach display low contrast. The LS approach produces more noise in the fused image. The fused image produced by the maximizing approach has high contrast, but it is too bright. It is observed that the algorithm MRF_HX produces sharper edges in the fused image than the other approaches and preserves the most texture.

Fig. 5. Estimated coefficients of the three source images in Experiment 1. (a) Averaging. (b) Maximizing. (c) LS. (d) Window. (e) MRF_H. (f) MRF_HX.

Fig. 6. Three source images in Experiments 2 and 3, size 155 × 155. (a) Source 1. (b) Source 2. (c) Source 3.

Fig. 7. Fusion results in Experiment 2, application with non-MD-based fusion. (a) Averaging. (b) Maximizing. (c) LS. (d) Window. (e) MRF_H. (f) MRF_HX.

D. Application With MD-Based Fusion

In the previous two experiments, we evaluated our two fusion algorithms for the non-MD-based fusion approach. In this section, we evaluate the performance of the algorithm MRF_H for the MD-based fusion approach. As discussed in Section III-C, due to the loss of locality in the MRF models for the MD image representations, only the algorithm MRF_H is applied in the MD-based fusion approach.

Fig. 8. Fusion results in Experiment 3, application with MD-based fusion. (a) Averaging. (b) Maximizing. (c) LS. (d) Window. (e) MRF_H.
1) Experiment 3: The three source images shown in Fig. 6 are also used in this experiment. Each source image is decomposed into four subimages by using the discrete 2-D Haar wavelet transform, and then a fusion approach is employed to fuse the MD representations of the three source images at each level. Finally, the inverse Haar wavelet transform is computed from the fusion results at the four levels. The maximizing approach, the averaging approach, the LS approach, the window-based approach, and the algorithm MRF_H are tested in the fusion process that involves the wavelet representations. The fusion results are shown in Fig. 8. The maximizing approach still produces a fusion result that is too bright. The LS approach produces more noise in the fusion result. The fused image produced by the window-based approach displays a mosaic pattern. We observe that our algorithm MRF_H preserves the most texture in the fused image. In addition, comparing Fig. 8(e) with Fig. 7(e), the MD-based fusion result produced by the algorithm MRF_H [Fig. 8(e)] is observed to be closer to the source images, with fewer artifacts. Note that a white spot, which does not appear in any of the three source images, is observed at the bottom of the non-MD-based fusion result produced by the algorithm MRF_H [Fig. 7(e)]; the MD-based fusion result does not display such an artifact.

Fig. 9. Remotely sensed cloudy images used in Experiment 4. (a) Source 1. (b) Source 2. (c) Source 3.

Fig. 10. Fusion results in Experiment 4 (cloud images). (a) Averaging. (b) Maximizing. (c) LS. (d) Window. (e) MRF_H. (f) MRF_HX (non-MD-based approach).

Fig. 11. Magnified fusion results in Experiment 4 (cloud images). (a) Averaging. (b) LS. (c) Window. (d) MRF_H. (e) MRF_HX (non-MD-based approach).
2) Experiment 4: As another example, three multispectral images are used to illustrate our algorithm and evaluate the fusion performance. Fig. 9 shows three bands from a multispectral data set downloaded from [35]. Artificial clouds have been added to the three images in Photoshop to create the occlusion. The multispectral images are from the Washington, DC, area, with a size of 1020 × 1020. This experiment tests the MD-based fusion performance for the cloud images. The same procedure for MD-based fusion, as described earlier, has been applied. The fusion results for the five fusion approaches are shown in Fig. 10. The maximizing approach produces a fused result that is too bright. Fig. 11 shows a zoomed-in version of a boxed area of the fused results for the other four approaches to examine the performance of the fusion algorithms more clearly. We observe that our algorithm MRF_H produces the highest spatial resolution in the fused image, and its fusion result contains fewer artifacts due to the cloud effect than the other approaches, as marked with an arrow in Fig. 11(d). This experiment demonstrates the improvement in the performance of MD-based fusion approaches using MRF models. The MD-based approach is usually used for images with high resolution and when the computational cost is not a consideration. The non-MD-based approach is particularly useful for images with low resolution and when the computational cost is a consideration. In this experiment, the fusion result using the MRF_HX algorithm with the non-MD-based approach is also shown in Figs. 10(f) and 11(e). Compared with the MRF_H algorithm with the MD-based approach, it is observed that the MD-based approach can eliminate the cloud effect more effectively than the non-MD-based approach, while the MRF_HX algorithm preserves smoother edge information.
V. CONCLUSION
In this paper, we have studied the image fusion problem based on a statistical model. We utilized the fact that the decision making in the fusion process has significant correlation within its neighborhood and assumed that it can be modeled as an MRF. Based on that, a new statistical fusion algorithm, namely, MRF_H, has been proposed. This approach is applicable to both non-MD- and MD-based fusion approaches. In particular, when the raw source images are directly used for fusion without preprocessing, the fused image can also be modeled as an MRF, and then the fusion result can be obtained using the MAP criterion incorporating the a priori Gibbs distribution of the fused image. The second algorithm, MRF_HX, is only applicable to non-MD-based fusion approaches. Visual inspection and quantitative performance evaluation both demonstrate that employing the MRF model in the fusion approaches results in better fusion performance than the traditional fusion approaches. In our proposed image fusion algorithms, we assumed a simple relationship between each source image and the true scene, i.e., a source image either contributes to the fused image or does not. This results in a mismatch between the fusion model and real image data sets. To improve this, one can assume that the coefficient in the data model can take any real value, which may increase the accuracy of the fusion algorithms. In addition, in the developed image fusion algorithms, we assumed that the noise in the source images is i.i.d. Gaussian. Since this is a rather limiting assumption, extending the noise model to include non-Gaussian distortion or possibly correlated Gaussian mixture distortion would bring the model closer to realistic sensor images, and the estimation of the fused image may improve.
REFERENCES
[1] R. S. Blum, "On multisensor image fusion performance limits from an estimation theory perspective," Inf. Fusion, vol. 7, no. 3, pp. 250-263, Sep. 2006.
[2] Z. Wang, D. Ziou, C. Armenakis, D. Li, and Q. Li, "A comparative analysis of image fusion methods," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 6, pp. 1391-1402, Jun. 2005.
[3] C. Thomas, T. Ranchin, L. Wald, and J. Chanussot, "Synthesis of multispectral images to high spatial resolution: A critical review of fusion methods based on remote sensing physics," IEEE Trans. Geosci. Remote Sens., vol. 46, no. 5, pp. 1301-1312, May 2008.
[4] C. Pohl and J. van Genderen, "Multisensor image fusion in remote sensing: Concepts, methods, and applications," Int. J. Remote Sens., vol. 19, no. 5, pp. 823-854, 1998.
[5] P. K. Varshney, B. Kumar, M. Xu, A. Drozd, and I. Kasperovich, "Image registration: A tutorial," in Proc. NATO ASI, Albena, Bulgaria, 2005.
[6] Z. Zhang and R. S. Blum, "A categorization of multiscale-decomposition-based image fusion schemes with a performance study for a digital camera application," Proc. IEEE, vol. 87, no. 8, pp. 1315-1326, Aug. 1999.
[7] R. K. Sharma, T. K. Leen, and M. Pavel, "Probabilistic image sensor fusion," in Proc. Adv. Neural Inf. Process. Syst. 11, 1999, pp. 824-830.
[8] H.-M. Chen, S. Lee, R. Rao, M.-A. Slamani, and P. Varshney, "Imaging for concealed weapon detection: A tutorial overview of development in imaging sensors and processing," IEEE Signal Process. Mag., vol. 22, no. 2, pp. 52-61, Mar. 2005.
[9] Y. Zhang, S. De Backer, and P. Scheunders, "Noise-resistant wavelet-based Bayesian fusion of multispectral and hyperspectral images," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3834-3843, Nov. 2009.
[10] P. Burt and R. Kolczynski, "Enhanced image capture through fusion," in Proc. 4th Int. Conf. Comput. Vis., 1993, pp. 173-182.
[11] J. Yang and R. Blum, "A statistical signal processing approach to image fusion for concealed weapon detection," in Proc. IEEE Int. Conf. Image Process., 2002, pp. 513-516.
[12] A. Lozci, A. Achim, D. Bull, and N. Canagarajah, "Statistical image fusion with generalized Gaussian and Alpha-Stable distributions," in Proc. 15th Int. Conf. Digital Signal Process., 2007, pp. 268-271.
[13] E. Lallier and M. Farooq, "A real time pixel-level based image fusion via adaptive weight averaging," in Proc. 3rd Int. Conf. Inf. Fusion, 2000, pp. WEC3/3-WEC3/13.
[14] S. Z. Li, Markov Random Field Modeling in Computer Vision. New York: Springer-Verlag, 2001.
[15] M. Joshi and A. Jalobeanu, "MAP estimation for multiresolution fusion in remotely sensed images using an IGMRF prior model," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 3, pp. 1245-1255, Mar. 2010.
[16] T. Kasetkasem and P. Varshney, "An image change detection algorithm based on Markov random field models," IEEE Trans. Geosci. Remote Sens., vol. 40, no. 8, pp. 1815-1823, Aug. 2002.
[17] Z. Tu and S. Zhu, "Image segmentation by data-driven Markov chain Monte Carlo," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 657-673, May 2002.
[18] H. Chen, "Mutual information based image registration with applications," Ph.D. dissertation, Syracuse Univ., Syracuse, NY, May 2002.
[19] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," in Readings in Uncertain Reasoning. San Francisco, CA: Morgan Kaufmann, 1990, pp. 452-472.
[20] L. Bedini, A. Tonazzini, and S. Minutoli, "Unsupervised edge-preserving image restoration via a saddle point approximation," Image Vis. Comput., vol. 17, no. 11, pp. 779-793, Sep. 1999.
[21] D. Kundur, D. Hatzinakos, and H. Leung, "Robust classification of blurred imagery," IEEE Trans. Image Process., vol. 9, no. 2, pp. 243-255, Feb. 2000.
[22] W. Wright, "Fast image fusion with a Markov random field," in Proc. 7th Int. Conf. Image Process. Appl., 1999, pp. 557-561.
[23] R. S. Blum, "Robust image fusion using a statistical signal processing approach," Inf. Fusion, vol. 6, no. 2, pp. 119-128, Jun. 2005.
[24] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Upper Saddle River, NJ: Prentice-Hall, 1993.
[25] Y. C. Eldar, A. Beck, and M. Teboulle, "Bounded error estimation: A Chebyshev center approach," in Proc. 2nd IEEE Int. Workshop Comput. Adv. Multi-Sensor Adapt. Process., 2007, pp. 205-208.
[26] S. Lakshmanan and H. Derin, "Simultaneous parameter estimation and segmentation of Gibbs random fields using simulated annealing," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 8, pp. 799-813, Aug. 1989.
[27] V. Kolmogorov and R. Zabin, "What energy functions can be minimized via graph cuts?" IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 2, pp. 147-159, Feb. 2004.
[28] F. Heitz, "Restriction of a Markov random field on a graph and multiresolution statistical image modeling," IEEE Trans. Inf. Theory, vol. 42, no. 1, pp. 180-190, Jan. 1996.
[29] P. Bremaud, Markov Chains, Gibbs Fields, Monte Carlo Simulation, and Queues. New York: Springer-Verlag, 1999.
[30] H. Derin and H. Elliott, "Modeling and segmentation of noisy and textured images using Gibbs random fields," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-9, no. 1, pp. 39-55, Jan. 1987.
[31] T. Kasetkasem, "Image analysis methods based on Markov random field models," Ph.D. dissertation, Syracuse Univ., Syracuse, NY, Dec. 2002.
[32] S. S. Saquib, C. A. Bouman, and K. Sauer, "ML parameter estimation for Markov random fields, with applications to Bayesian tomography," IEEE Trans. Image Process., vol. 7, no. 7, pp. 1029-1044, Jul. 1998.
[33] J. Besag, "On the statistical analysis of dirty pictures," J. R. Stat. Soc., vol. 48, no. 3, pp. 259-302, 1986.
[34] S. Geman and C. Graffigne, "Markov random field image models and their application to computer vision," in Proc. Int. Congr. Mathematicians, 1986, pp. 1496-1517.
[35] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Upper Saddle River, NJ: Prentice-Hall, 2008.

Min Xu (S'07-M'10) received the B.S. degree from the University of Science and Technology of China, Hefei, China, in 2002 and the M.S. and Ph.D. degrees in electrical engineering from Syracuse University, Syracuse, NY, in 2005 and 2009, respectively.
Since December 2009, she has been a Researcher with Blue Highway, LLC, Syracuse. Her research interests are in the areas of statistical signal and image processing.

Hao Chen (S'06-M'08) received the Ph.D. degree in electrical engineering from Syracuse University, Syracuse, NY, in 2007.
From 2007 to 2010, he was a Postdoctoral Research Associate and then a Research Assistant Professor with Syracuse University. Since August 2010, he has been an Assistant Professor with the Department of Electrical and Computer Engineering, Boise State University, Boise, ID. His research interests include statistical signal and image processing and communications.

Pramod K. Varshney (S'72-M'77-SM'82-F'97) received the B.S. degree in electrical engineering and computer science and the M.S. and Ph.D. degrees in electrical engineering from the University of Illinois at Urbana-Champaign, Urbana, in 1972, 1974, and 1976, respectively.
Since 1976, he has been with Syracuse University, Syracuse, NY, where he is currently a Distinguished Professor of Electrical Engineering and Computer Science and the Director of the Center for Advanced Systems and Engineering. His current research interests are in distributed sensor networks and data fusion, detection and estimation theory, wireless communications, image processing, radar signal processing, and remote sensing.
Dr. Varshney has received numerous awards. He serves as a Distinguished Lecturer for the IEEE Aerospace and Electronic Systems Society. He was the 2001 President of the International Society of Information Fusion.
