An Image Fusion Approach Based On Markov Random Fields
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 12, DECEMBER 2011
I. INTRODUCTION
Manuscript received May 29, 2009; revised June 6, 2010 and February 22,
2011; accepted April 23, 2011. Date of publication July 28, 2011; date of
current version November 23, 2011.
M. Xu was with the Department of Electrical Engineering and Computer
Science, Syracuse University, Syracuse, NY 13244 USA. She is now with
Blue Highway, LLC, Syracuse, NY 13244-4100 USA (e-mail: mxu@bluehighway.com).
H. Chen is with the Department of Electrical and Computer Engineering, Boise State University, Boise, ID 83725-2075 USA (e-mail: haochen@boisestate.edu).
P. K. Varshney is with the Department of Electrical Engineering and
Computer Science, Syracuse University, Syracuse, NY 13244 USA (e-mail:
varshney@syr.edu).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TGRS.2011.2158607
performance, while the use of the transform increases the computational complexity. One can therefore choose whether or not to employ
a transform depending on the application. For
the MD-based fusion approaches, the basic fusion rule is applied to MD representations of the images at each resolution
level. For the non-MD-based fusion approach, the basic fusion
rule is directly applied to the source images.
Generally, the main drawback of the pixel-level fusion rule
is that the decision on whether a source image contributes to
the fused image is made pixel by pixel and, therefore, may
cause spatial distortion in the fused image, which affects further
processing, such as classification and detection. As we know,
the pixels in an image are spatially correlated. Thus, for a source
image, if one of its pixels contributes to the fused image, its
neighbors are also likely to contribute to the fused image. This
implies that the decision making in the first step of the
fusion process should exploit this spatial correlation.
Thus, it is important to incorporate spatial correlation into the
fusion model, and the use of such a fusion model is expected to
improve fusion performance.
A straightforward approach to make use of spatial correlation
is to use a window- or region-based method [7], [10]–[13]. The
idea is to estimate the intensity of a fused pixel from that of the
source images in a small window. Yang and Blum [11] assumed
that the decision making of pixels within a small window is
constant and developed an expectation-maximization algorithm
by employing a Gaussian mixture image model to adaptively
find the fusion result. Burt and Kolczynski [10] proposed a
weighted average algorithm to estimate the fused image in a
pyramid transform domain. The weights are measured based
on a local energy or variance (called salience) within a
small window. Loza et al. [12] modified the weighted average
algorithm by incorporating a generalized Gaussian statistical
model. Lallier and Farooq [13] designed a weighted average
scheme for fusing IR and visual images in a surveillance
scenario. In their algorithm, larger weights are assigned to
either the warmer or cooler pixels for the IR image and to the
pixels having larger local variance for the visual image. The
aforementioned algorithms [10]–[13] are used in the MD-based
fusion approach.
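The salience-weighted averaging idea of [10], [13] can be illustrated with a small sketch. This is a simplified, single-scale version (the original of [10] operates in a pyramid transform domain); the function names, the window radius, and the reflect-padding choice are ours, not the cited authors':

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_variance(img, radius=2):
    """Salience measure: variance in a (2r+1)x(2r+1) window around each pixel."""
    pad = np.pad(img.astype(float), radius, mode="reflect")
    windows = sliding_window_view(pad, (2 * radius + 1, 2 * radius + 1))
    return windows.var(axis=(-2, -1))

def salience_weighted_fusion(sources, radius=2, eps=1e-12):
    """Weighted average of source images; each pixel's weight is proportional
    to the local variance (salience) of that source at that pixel."""
    stack = np.stack([s.astype(float) for s in sources])
    sal = np.stack([local_variance(s, radius) for s in sources])
    weights = sal / (sal.sum(axis=0) + eps)
    return (weights * stack).sum(axis=0)
```

A flat (low-salience) source thus contributes little where a textured source is informative, which is the intent of the salience weighting.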
The theory of Markov random fields (MRFs) provides a basis
for modeling contextual constraints in visual processing and
interpretation [14]. MRF models have proved to be successful
in a variety of image processing applications, including multiresolution fusion [15], change detection [16], edge detection
[17], image denoising [14], [18], image restoration [19], [20],
and image classification [21]. In the image fusion application,
an MRF model has been used to model the images for the fusion
of edge information [22]. Yang and Blum proposed a statistical
model to describe the fusion process [1], [11], [23]. However,
the application of MRF models for pixel-level image fusion on
images with the same resolution has not been considered. In this
paper, we propose two fusion algorithms by incorporating the
contextual constraints via MRF models into the fusion model.
The first algorithm models the decision making at the first
step of the fusion rule as an MRF, and the second algorithm
models both the decision making and the true image as MRFs.
Also, the first algorithm is applicable for both the MD-based
Y_i = H_i X_i + W_i,    i = 1, \ldots, N    (1)

where each entry of the coefficient H_i is 1 if the corresponding
source image contributes to the fused image at pixel i, and 0
otherwise.

When H_i is given, the pixel intensity of the fused image can
be easily calculated by a least squares (LS) technique as [24]

\hat{X} = (H^T H)^{-1} H^T Y.    (2)
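For 0/1 coefficient vectors, the LS estimate (2) reduces, pixel by pixel, to the mean of the contributing source intensities. A minimal sketch (the zero-coefficient fallback is our own convention, not specified in the text):

```python
import numpy as np

def ls_fused_pixel(Y, H):
    """LS estimate x_hat = (H^T H)^{-1} H^T Y for a single pixel.
    Y: length-q vector of source intensities; H: length-q 0/1 coefficients."""
    Y = np.asarray(Y, dtype=float)
    H = np.asarray(H, dtype=float)
    HtH = H @ H                 # a scalar for a single pixel
    if HtH == 0:
        return 0.0              # no source contributes (our fallback convention)
    return (H @ Y) / HtH        # mean of the contributing sources when H is 0/1
```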
values that the coefficient H can take, the one which produces
the highest energy should be selected, i.e.,
\hat{H} = \arg\min_H (Y - H \hat{X}_H)^T (Y - H \hat{X}_H)
        = \arg\min_H \left[ Y^T Y - (Y^T H)(H^T H)^{-1}(H^T Y) \right]
        = \arg\max_H (Y^T H)(H^T H)^{-1}(H^T Y).    (3)

The corresponding LS estimate of the fused image is

\hat{X} = (\hat{H}^T \hat{H})^{-1} \hat{H}^T Y.    (4)

\{\hat{H}, \hat{X}\} = \arg\max_{H, X} P(X, H \mid Y).    (5)

\hat{X}^{(n)} = \arg\max_X P(X \mid Y, \hat{H}^{(n)})    (6)

\hat{H}^{(n+1)} = \arg\max_H P(H \mid Y, \hat{X}^{(n)})    (7)
P_H(H) = \frac{1}{Z_H} \exp\left( -\frac{1}{T} \sum_{c \in S} U_c(H) \right)    (8)

where the partition function is

Z_H = \sum_H \exp\left( -\frac{1}{T} \sum_{c \in S} U_c(H) \right).    (9)
(10)
where

E(H) = \sum_i \frac{(Y_i - H_i \hat{X}^{(n)})^T (Y_i - H_i \hat{X}^{(n)})}{2\sigma^2} + \sum_{c \in S} U_c(H).    (15)
According to the aforementioned result, we observe that maximization in (14) is equivalent to minimization of E(H). Thus,
the optimal estimate for H can be expressed as
\hat{H}^{(n+1)} = \arg\min_H E(H).    (16)
Note that, for two source images of size 300 × 300, H has
a total of 4^{90000} possible configurations. Thus, in practice, due
to the large search space on H, the solution of (16) cannot
be obtained directly, and therefore, the simulated annealing
(SA) algorithm [26] is applied here to search for the optimal
solution of (16). The solution for the second subproblem, i.e.,
the estimate for X, is obtained by (4). The iterative algorithm is
described in terms of the following steps.
1) Start with an initial estimate of H and X. Estimate the
initial parameters (noise variance and some parameters in
the pdf of H) and set the initial temperature.
2) At each iteration, obtain a new estimate of H based on
the Gibbs pdf given in (8) with the Gibbs potential E(H)
using a Gibbs sampling procedure [14].
3) Update the fused image using (4).
4) Reduce the temperature using a predetermined schedule
and repeat 2) and 3) until convergence.
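Steps 1)–4) can be sketched for a simplified case: one binary coefficient per pixel (one source switched on or off against a known X), with a single Ising-style pair potential standing in for the clique potentials U_c. All names, the cooling schedule, and the parameter values are illustrative choices of ours, not the paper's exact formulation, and the X-update of step 3) is omitted (X is held fixed) for clarity:

```python
import numpy as np

def gibbs_sa_binary_field(Y, X, sigma2=1.0, beta=1.0,
                          T0=4.0, cool=0.9, sweeps=30, seed=0):
    """Simulated-annealing Gibbs sampler for a binary coefficient field H.
    Data term: (Y - H*X)^2 / (2*sigma2); smoothness term: an Ising-style
    potential over 4-neighbour pairs (-beta for agreement, +beta otherwise)."""
    rng = np.random.default_rng(seed)
    M, N = Y.shape
    H = (Y > Y.mean()).astype(int)       # crude initial estimate (step 1)
    T = T0
    for _ in range(sweeps):              # step 2: Gibbs sweep over all pixels
        for i in range(M):
            for j in range(N):
                local = []
                for h in (0, 1):         # local energy for each candidate label
                    data = (Y[i, j] - h * X[i, j]) ** 2 / (2.0 * sigma2)
                    nbrs = [H[i + di, j + dj]
                            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                            if 0 <= i + di < M and 0 <= j + dj < N]
                    smooth = -beta * sum(1 if h == v else -1 for v in nbrs)
                    local.append(data + smooth)
                # Gibbs conditional P(H_ij = 1 | rest) at temperature T
                p1 = 1.0 / (1.0 + np.exp((local[1] - local[0]) / T))
                H[i, j] = int(rng.random() < p1)
        T *= cool                        # step 4: predetermined cooling schedule
    return H
```

As the temperature decreases, the sampler concentrates on low-energy configurations, which is the mechanism SA uses to approach the minimizer of E(H) in (16).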
P_X(X) = \frac{1}{Z_X} \exp\left( -\frac{1}{T} \sum_{c \in S} V_c(X) \right)    (17)

where

Z_X = \sum_X \exp\left( -\frac{1}{T} \sum_{c \in S} V_c(X) \right).    (18)
(19)

In a similar manner as for the estimation of H, (19) reduces to

\hat{X}^{(n+1)} = \arg\min_X \varepsilon(X)    (20)
where

\varepsilon(X) = \sum_i \frac{(Y_i - \hat{H}_i^{(n)} X_i)^T (Y_i - \hat{H}_i^{(n)} X_i)}{2\sigma^2} + \sum_{c \in S} V_c(X).    (21)
In the next section, we will show that the solution of (20) can
be easily obtained when we use a Gaussian MRF to model the
fused image.
Unlike the first algorithm, MRF_H, where the fused image X
is updated by an LS technique, this new algorithm uses a MAP
solution to update the fused image X via (20).
The whole procedure is then described as follows.
1) Start with an initial estimate of H and X. Estimate the
initial parameters (noise variance and some parameters in
the pdf of H and X) and set the initial temperature.
2) At each iteration, obtain a new estimate of H based on
its Gibbs pdf given in (8) with the Gibbs potential E(H)
using a Gibbs sampling procedure [14].
3) Update the fused image using the solution of (20).
4) Reduce the temperature using a predetermined schedule
and repeat 2) and 3) until convergence.
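For step 3), a Gaussian MRF prior makes (20) quadratic in X, so each pixel's minimizer is available in closed form. The ICM-style sketch below uses a simple pairwise-difference potential lam * (X_s - X_t)^2 as a stand-in for the clique potentials V_c; the function name, prior weight, and initialization are our assumptions, not the paper's exact procedure:

```python
import numpy as np

def gmrf_map_update(Y, H, sigma2=1.0, lam=0.5, iters=20):
    """ICM-style MAP update of the fused image X under a Gaussian MRF prior:
    eps(X) = sum_i (Y_i - H_i X_i)^2/(2*sigma2) + lam*sum_pairs (X_s - X_t)^2.
    The objective is quadratic, so each pixel's minimiser is closed-form."""
    q, M, N = Y.shape                    # q source images of size M x N
    X = Y.mean(axis=0)                   # simple initial estimate
    for _ in range(iters):
        for i in range(M):
            for j in range(N):
                nbrs = [X[i + di, j + dj]
                        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= i + di < M and 0 <= j + dj < N]
                # set d eps / d X_ij = 0 and solve for X_ij
                num = (H[:, i, j] * Y[:, i, j]).sum() / sigma2 \
                      + 2 * lam * sum(nbrs)
                den = (H[:, i, j] ** 2).sum() / sigma2 \
                      + 2 * lam * len(nbrs)
                X[i, j] = num / den
    return X
```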
In summary, it is an iterative estimation process to estimate
both H and X, which increases computational time. The MRF
modeling of both the coefficient and the fused image more
accurately represents high-resolution images and therefore
produces better results when fusing high-resolution source
images. Moreover, the initial estimates
of H and X are important. As the initial estimates of H and
X get closer to the optimal values, the algorithm converges
faster. Poor initial estimates may trap the algorithm in local maxima of the
a posteriori probability. Although the LS approach given by (3)
and (4) is found to be a simple and effective fusion approach, it
does not take spatial correlation into account. However, we use
it to obtain our initial estimates of H and X. According to our
experiments, this approach displays a good fusion performance.
C. Extension to the MD-Based Fusion Framework
Here, the applicability of the two proposed algorithms to
the MD-based fusion approach is discussed. For the non-MD-based fusion approach, Y and H in the data model (1) denote
the intensities of the source images and their corresponding
coefficients and X denotes the intensity of the fused image.
This data model (1) is also applicable for the data after the
MD process [1], [7]. Thus, if the MD-based fusion approach is
employed, we assume that the MD transform is applied to the
source images, Y refers to the values of the MD representations
of the source images at some resolution level, H refers to the
corresponding coefficients, and X refers to the values of the
MD representations of the fused image at the same resolution
level. Thus, instead of directly applying the image fusion model
(1) on the source images, one can perform the MD on the
source images and then apply the image fusion model (1)
on the MD representations at each resolution level. By using
multiresolution transforms such as discrete wavelet transform,
the source image is decomposed into different frequency bands,
Fig. 1. Clique types in the eight-neighborhood system.
which makes the model (1) more closely fit the MD representations. However, the use of multiresolution transforms increases
the complexity of the algorithm. It is noted that, since the
multiresolution transform may result in the loss of locality in
MRF models for the MD image representations [28], i.e., the
local MRF property may not hold on X, it is not suggested
to use the algorithm MRF_HX with the MD-based fusion
approach.
The coefficient H for each pixel represents whether the true
scene X (fused image) contributes to the source image. Pixels
in a large area may all contribute to the true scene; however,
the pixels in that area need not all have the same intensity. Thus,
the coefficient H has more spatial correlation over a larger area
than the intensity of the true scene X. After MD transformation,
coefficients H may still exhibit spatial correlation while MRF
property may not hold for X. Thus, only the algorithm MRF_H
is applied in the MD-based fusion approach. In the next section,
some examples are provided for illustration.
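As a minimal illustration of the MD-based route (not the paper's exact pipeline), the sketch below decomposes two sources with a one-level Haar transform, fuses each subband with a basic rule standing in for the MRF_H rule at that level (averaging the approximation band, keeping the larger-magnitude detail coefficients), and inverts the transform. All function names and the fusion rule are ours:

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar decomposition into (LL, LH, HL, HH) subbands."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    return ((a + b + c + d) / 4, (a + b - c - d) / 4,
            (a - b + c - d) / 4, (a - b - c + d) / 4)

def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2."""
    M, N = LL.shape
    img = np.empty((2 * M, 2 * N))
    img[0::2, 0::2] = LL + LH + HL + HH
    img[0::2, 1::2] = LL + LH - HL - HH
    img[1::2, 0::2] = LL - LH + HL - HH
    img[1::2, 1::2] = LL - LH - HL + HH
    return img

def md_fuse(img1, img2):
    """Fuse per subband: average the approximation band, keep the
    larger-magnitude detail coefficient (a basic per-level fusion rule)."""
    s1, s2 = haar_dwt2(img1), haar_dwt2(img2)
    fused = [(s1[0] + s2[0]) / 2]
    for c1, c2 in zip(s1[1:], s2[1:]):
        fused.append(np.where(np.abs(c1) >= np.abs(c2), c1, c2))
    return haar_idwt2(*fused)
```

In the paper's framework, the per-subband rule would be replaced by applying model (1) with MRF_H at each resolution level.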
IV. EXPERIMENTAL RESULTS
A. Choice of MRF Models
We provide three examples to evaluate the fusion performance of our fusion algorithms. For the two MRF-based fusion algorithms, MRF_H and MRF_HX, used in the following
experiments, we consider five clique types in the eight-neighborhood system: C1, C2, C3, C4, C5, associated with the
singleton, vertical pairs, horizontal pairs, left-diagonal pairs,
and right-diagonal pairs, respectively. They are shown in Fig. 1.
The Gibbs energy function of the coefficient of the source
image is defined by an autologistic function, given by [14]
\sum_{c \in S} U_c(H) = a^T L    (22)

where

L = \left[ \sum_{(s,t) \in C_2} I[H(s), H(t)], \; \ldots, \; \sum_{(s,t) \in C_5} I[H(s), H(t)] \right]^T    (23)

is the coefficient potential vector associated with the clique
types. Here, the function I is defined as

I(a, b) = \begin{cases} -1, & \text{if } a = b \\ 1, & \text{otherwise.} \end{cases}    (24)
Due to its simplicity, this class of MRF model has been extensively used in [14], [29]–[31] for modeling a wide variety of
images, both as region formation models and as texture models.
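For a binary coefficient field, the energy a^T L of (22)–(23) can be computed directly over the four pair-clique types; the sketch below assumes the convention that I takes the value -1 for equal labels, and the function name is ours:

```python
import numpy as np

def clique_energy(H, a):
    """Sum of pair-clique potentials, a^T L as in (22)-(23).
    H: binary label field; a: weights for the (vertical, horizontal,
    left-diagonal, right-diagonal) pair-clique types C2..C5."""
    I = lambda x, y: np.where(x == y, -1, 1)   # -1 for agreeing labels
    L = np.array([
        I(H[:-1, :], H[1:, :]).sum(),          # vertical pairs    (C2)
        I(H[:, :-1], H[:, 1:]).sum(),          # horizontal pairs  (C3)
        I(H[:-1, :-1], H[1:, 1:]).sum(),       # left-diagonal     (C4)
        I(H[:-1, 1:], H[1:, :-1]).sum(),       # right-diagonal    (C5)
    ])
    return a @ L
```

With positive weights a, smooth label fields receive lower energy and are therefore favored by the Gibbs prior (8).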
\sum_{c \in S} V_c(X) = b^T G    (26)

where

G = \left[ \sum_{(s,t) \in C_2} X(s) X(t), \; \ldots, \; \sum_{(s,t) \in C_5} X(s) X(t) \right]^T.    (27)
Fig. 2. Synthetic source images in Experiment 1, size 150 × 150. (a) Source 1.
(b) Source 2. (c) Source 3.
The potential function V_c(H, a) can be simply computed. However, the normalization term Z_H involves a summation over all
possible configurations of H, which is practically impossible
due to the large computation time. Note that, for two source
images of size 300 × 300, H has a total of 4^{90000} possible
configurations.
An alternative method for approximation to ML estimation
is maximum pseudolikelihood (MPL) estimation, which was
proposed by Besag [33]. The MPL estimation method is a
suboptimal method, which is given by
\hat{a} = \arg\max_a \prod_{s \in S} P\left( H(s) \mid H(N_s), a \right).    (28)
The estimate given by (28) for one pixel involves vector
multiplication, which has computational complexity O(N).
Thus, the estimation of the whole fused image has computational complexity O(MN). Because the estimation of the coefficient also has computational complexity O(2MN) at
each iteration, the overall complexity is O(M(2N + N)) ≈
O(2MN).
B. Parameter Estimation
Modeling the Markov pdf parametrically involves the datadriven optimal estimation of the parameters associated with the
potential functions Vc . The model parameters must be estimated
for each data set as part of the image processing algorithm. In
our algorithms, the noise variance \sigma^2 in (13) and the parameter
a in the coefficient MRF pdf in (22) are unknown. Thus, we
need to estimate these parameters in our algorithms.
Because we assume that the noise in the fusion model is
a Gaussian noise, it is straightforward to estimate the noise
variance by the maximum likelihood (ML) criterion. It is
given by

\hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - \hat{H}_i \hat{X}_i \right)^T \left( Y_i - \hat{H}_i \hat{X}_i \right).    (29)

(30)
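Under the i.i.d. Gaussian noise assumption, the ML variance estimate is simply the mean squared model residual. A sketch (the function name and the normalization over all q·N observations are our reading, not the paper's exact normalization):

```python
import numpy as np

def estimate_noise_variance(Y, H, X):
    """ML-style estimate of sigma^2 from the model residuals Y_i - H_i X_i,
    averaging the squared residual over all q*N observations."""
    R = Y - H * X            # residuals, shape (q, M, N); X broadcast over sources
    return float((R ** 2).mean())
```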
\hat{a} = \arg\min_a \sum_{s \in S} \left[ V_c(H(s), a) + \ln Z_{H(s)} \right].    (31)
Fig. 3. MSE of fusion result as the number of iterations increases for MRF-based approaches for four- and eight-neighborhood systems. (a) MRF_H, SNR =
10 dB. (b) MRF_H, SNR = 30 dB. (c) MRF_HX, SNR = 10 dB. (d) MRF_HX, SNR = 30 dB.
TABLE II
EXECUTION TIME OF FUSION APPROACHES IN EXPERIMENT 1 (IN SECONDS)
show that the use of the MRF model for both the coefficient
and the fused image improves the fusion performance and
enhances the image contrast. The fused images produced by
the averaging approach and the window-based approach display
low contrast. The LS approach produces more noise in the
fused image. The fused image produced by the maximizing
approach has high contrast, but it is too bright. It is observed
that the algorithm MRF_HX produces sharper edges in the fused
image than the other approaches and preserves the most
texture.
Fig. 11. Magnified fusion results in Experiment 4 (cloud images). (a) Averaging. (b) LS. (c) Window. (d) MRF_H. (e) MRF_HX (non-MD-based approach).
when the raw source images are directly used for fusion without
preprocessing, the fused image can also be modeled as an MRF,
and then, the fusion result can be obtained using the MAP
criterion incorporating the a priori Gibbs distribution of the
fused image. The second algorithm, MRF_HX, is only applicable for non-MD-based fusion approaches. Visual inspection
and quantitative performance evaluation both demonstrate that
the employment of the MRF model in the fusion approaches
resulted in a better fusion performance than the traditional
fusion approaches. In our proposed image fusion algorithms,
we assumed a simple relationship between each source image
and the true scene, i.e., a source image either contributes to
the fused image at a given pixel or does not. This simplification
results in a mismatch between the fusion model and real image
data sets. To improve this, one can assume that
the coefficient in the data model can take any real value, which
may increase the accuracy of the fusion algorithms. In addition,
in the developed image fusion algorithms, we assumed that the
noise in the source image is i.i.d. Gaussian. Since this is a
rather limiting assumption, extending the noise model to include
non-Gaussian distortion or possibly correlated Gaussian mixture
distortion would bring the model closer to realistic sensor
images and may improve the estimation of the fused image.
R EFERENCES
[1] R. S. Blum, "On multisensor image fusion performance limits from an
estimation theory perspective," Inf. Fusion, vol. 7, no. 3, pp. 250–263,
Sep. 2006.
[2] Z. Wang, D. Ziou, C. Armenakis, D. Li, and Q. Li, "A comparative
analysis of image fusion methods," IEEE Trans. Geosci. Remote Sens.,
vol. 43, no. 6, pp. 1391–1402, Jun. 2005.
[3] C. Thomas, T. Ranchin, L. Wald, and J. Chanussot, "Synthesis of multispectral images to high spatial resolution: A critical review of fusion
methods based on remote sensing physics," IEEE Trans. Geosci. Remote
Sens., vol. 46, no. 5, pp. 1301–1312, May 2008.
[4] C. Pohl and J. van Genderen, "Multisensor image fusion in remote sensing: Concepts, methods, and applications," Int. J. Remote Sens., vol. 19,
no. 5, pp. 823–854, 1998.
[5] P. K. Varshney, B. Kumar, M. Xu, A. Drozd, and I. Kasperovich, "Image
registration: A tutorial," in Proc. NATO ASI, Albena, Bulgaria, 2005.
[6] Z. Zhang and R. S. Blum, "A categorization of multiscale-decomposition-based image fusion schemes with a performance study for a digital camera
application," Proc. IEEE, vol. 87, no. 8, pp. 1315–1326, Aug. 1999.
[7] R. K. Sharma, T. K. Leen, and M. Pavel, "Probabilistic image sensor
fusion," in Proc. Adv. Neural Inf. Process. Syst. 11, 1999, pp. 824–830.
[8] H.-M. Chen, S. Lee, R. Rao, M.-A. Slamani, and P. Varshney, "Imaging
for concealed weapon detection: A tutorial overview of development in
imaging sensors and processing," IEEE Signal Process. Mag., vol. 22,
no. 2, pp. 52–61, Mar. 2005.
[9] Y. Zhang, S. De Backer, and P. Scheunders, "Noise-resistant wavelet-based Bayesian fusion of multispectral and hyperspectral images,"
IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3834–3843,
Nov. 2009.
[10] P. Burt and R. Kolczynski, "Enhanced image capture through fusion," in
Proc. 4th Int. Conf. Comput. Vis., 1993, pp. 173–182.
[11] J. Yang and R. Blum, "A statistical signal processing approach to image
fusion for concealed weapon detection," in Proc. IEEE Int. Conf. Image
Process., 2002, pp. 513–516.
[12] A. Loza, A. Achim, D. Bull, and N. Canagarajah, "Statistical image fusion with generalized Gaussian and Alpha-Stable distributions," in Proc.
15th Int. Conf. Digital Signal Process., 2007, pp. 268–271.
[13] E. Lallier and M. Farooq, "A real time pixel-level based image fusion
via adaptive weight averaging," in Proc. 3rd Int. Conf. Inf. Fusion, 2000,
pp. WEC3/3–WEC3/13.
[14] S. Z. Li, Markov Random Field Modeling in Computer Vision.
New York: Springer-Verlag, 2001.