
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 35, NO. X, XXXXXXX 2013

Removing Atmospheric Turbulence via Space-Invariant Deconvolution

Xiang Zhu, Student Member, IEEE, and Peyman Milanfar, Fellow, IEEE

The authors are with the Department of Electrical Engineering, University of California, Santa Cruz, 1156 High St., Santa Cruz, CA 95064. E-mail: {xzhu, milanfar}@ee.ucsc.edu.
Digital Object Identifier no. 10.1109/TPAMI.2012.82.

Abstract—To correct geometric distortion and reduce space and time-varying blur, a new approach is proposed in this paper capable
of restoring a single high-quality image from a given image sequence distorted by atmospheric turbulence. This approach reduces the
space and time-varying deblurring problem to a shift invariant one. It first registers each frame to suppress geometric deformation
through B-spline-based nonrigid registration. Next, a temporal regression process is carried out to produce an image from the
registered frames, which can be viewed as being convolved with a space invariant near-diffraction-limited blur. Finally, a blind
deconvolution algorithm is implemented to deblur the fused image, generating a final output. Experiments using real data illustrate that
this approach can effectively alleviate blur and distortions, recover details of the scene, and significantly improve visual quality.

Index Terms—Image restoration, atmospheric turbulence, nonrigid image registration, point spread function, sharpness metric

1 INTRODUCTION

A long-distance imaging system can be strongly affected by atmospheric turbulence, which randomly changes the refractive index along the optical transmission path, generating geometric distortion (motion), space and time-varying blur, and sometimes even motion blur if the exposure time is not sufficiently short [1], [2], [3], [4], [5]. Aside from hardware-based adaptive optics approaches [6], several signal processing approaches have been proposed to solve this problem [7], [8], [4], [5], [9], [10]. These approaches attempt to restore a single high-quality image from an observed frame sequence distorted by air turbulence. As with these other works based on videos or image sequences, we work under the assumption that the scene and the image sensor are both static, and that observed motions are due to the air turbulence alone. The imaging process can be modeled as [11], [10]

    G_k[x] = (F * h_{k,x} * h)[x] + N_k[x],    (1)

where * represents a 2D convolution operator. F, G_k, and N_k denote the ideal image, the kth observed frame, and sensor noise, respectively.^1 The vector x = (x, y)^T denotes a 2D spatial location. h_{k,x} represents the space-varying (air-turbulence-caused) point spread function (PSF) for position x in the kth frame, which includes both blur and motion effects [10]. h is the space-invariant diffraction-limited PSF, typically due to sensor optics and settings.

    ^1 In this paper, we use capital letters (e.g., F, G, N) to denote image-wise models, and lowercase letters (e.g., f, g, n) to represent patch-wise models.
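To make the model in (1) concrete, the following minimal Python sketch (an illustration, not the authors' code) synthesizes one observed frame G_k from an ideal image F: a smooth random displacement field supplies the geometric distortion, and, for brevity, the turbulence blur h_{k,x} is approximated by a single per-frame Gaussian rather than a fully space-varying kernel. All parameter values here are arbitrary.

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

rng = np.random.default_rng(0)

def degrade_frame(F, motion_smooth=4.0, motion_amp=30.0,
                  turb_blur_max=2.0, diffraction_sigma=1.0, noise_sigma=1.0):
    # Geometric distortion: a smooth random displacement field per frame
    # (motion_amp scales the smoothed noise to a few pixels of displacement).
    H, W = F.shape
    dx = motion_amp * gaussian_filter(rng.standard_normal((H, W)), motion_smooth)
    dy = motion_amp * gaussian_filter(rng.standard_normal((H, W)), motion_smooth)
    yy, xx = np.mgrid[0:H, 0:W].astype(float)
    warped = map_coordinates(F, [yy + dy, xx + dx], order=1, mode='reflect')
    # Turbulence blur h_k (approximated as one Gaussian per frame), followed
    # by the fixed diffraction-limited blur h.
    blurred = gaussian_filter(warped, rng.uniform(0.5, turb_blur_max))
    blurred = gaussian_filter(blurred, diffraction_sigma)
    # Additive sensor noise N_k.
    return blurred + noise_sigma * rng.standard_normal((H, W))

F = np.zeros((128, 128)); F[32:96, 32:96] = 255.0  # toy "ideal" image
frames = [degrade_frame(F) for _ in range(10)]     # observed sequence {G_k}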
Because the unknown h_{k,x} is spatially and temporally changing in (1), restoring the high-quality image F is not trivial. Some multiframe reconstruction approaches [4], [9] first employ a nonrigid image registration technique to register each observed frame with respect to a fixed reference grid, and use the registration parameters to estimate the corresponding motion field for each frame. Then, a sharp image is formed through a Bayesian reconstruction method. The main problem for such algorithms is that they do not estimate the actual PSF h_{k,x}. Both [4] and [9] employ a fixed Gaussian model to approximate the PSF, which strongly limits their performance.

Recently, a method called Efficient Filter Flow (EFF) was introduced by Hirsch et al. for space-varying blind deconvolution; it has been applied in astronomical imaging to alleviate turbulence distortion, and in magnetic resonance imaging (MRI) to reduce blur caused by object motion [10]. This method first divides each frame into overlapping patches. Because these patches are small, they can be viewed as isoplanatic regions (small regions containing space-invariant blur [6], [5]) and thus can be processed through a multiframe blind deconvolution algorithm [12]. Given an isoplanatic patch sequence of a local region extracted from the input video, the blind deconvolution algorithm estimates the PSF separately for each patch. The final output is then generated by a nonblind image estimation step. Though capable of suppressing turbulence effects, the EFF method cannot remove diffraction-limited blur, since the blind deconvolution step does not utilize much prior knowledge of the ideal sharp image beyond a nonnegativity constraint and a simple Tikhonov regularizer. Besides, due to the limited accuracy of the local PSF estimation, and probably also due to sensor noise, the results can contain strong deblurring artifacts, such as ringing [12].
Another class of approaches, called "lucky imaging," employs image selection and fusion methods to reduce the blurring effects caused by turbulence [13], [14], [8], [5]. The image selection technique attempts to find frames of the best quality (lucky frames) from a short-exposure video stream, and the output image is produced by fusing these lucky frames together [13], [14]. This method is based on the observation that, for short-exposure images, turbulence creates "mutations" in image quality and randomly makes some images sufficiently sharp (see examples in Fig. 1) [14], [15], [16], [17]. This strategy is favored in many astronomical imaging applications, where the image of the object of interest (e.g., a star) is usually taken inside an isoplanatic angle.

[Fig. 1: Isoplanatic patches from the short-exposure image sequence Building taken through hot air, where a variation of blur can be observed. Panel (d) illustrates a lucky frame example which is much sharper than the others.]

In [17], Vorontsov et al. proposed a "lucky region" restoration approach for anisoplanatic scenarios. Small lucky (isoplanatic) regions, which can be viewed as being blurred only by the diffraction-limited PSF, are detected by a local sharpness metric and fused to produce a large high-quality image. In another, similar method developed by Joshi and Cohen [18], a local block-based image alignment is first carried out to reduce geometric distortion caused by turbulence; then a lucky-imaging-based weighting scheme is employed to generate a single image that is sharp everywhere. The weighting scheme is carefully designed to balance noise reduction against sharpness preservation. A dehazing process is finally used to enhance the visual quality. One difficulty with such methods is that even though turbulence-caused blur is strongly alleviated through the lucky imaging process, the output still suffers from the blur caused by the diffraction-limited PSF [5], [11]. Besides, the blur in lucky regions inevitably increases once temporal averaging is used for noise suppression [18].

Due to the isoplanatism of turbulence effects (including blurring and geometric deformation) in small local regions, lucky imaging focuses on patch-wise restoration. However, limited patch size also limits the restoration performance. As Levin et al. pointed out in [19], PSF estimation can be much improved as the size of the observed patch increases. In other words, if the PSF can be estimated from a large region (e.g., the whole image), then we may be able to achieve better deconvolution performance. To achieve this goal, however, the PSF spatial variation needs to be reduced.

Another important factor that may affect restoration is sensor noise. A high shutter speed is generally favored in long-distance imaging systems to avoid motion blur caused by turbulence, and this inevitably increases the noise level. Noise can strongly affect the performance of deblurring, as it distorts image details and produces ringing artifacts [20]. Fortunately, the frames within a sequence are highly correlated with each other, and this correlation can be exploited for denoising in the temporal domain.

In this paper, a new framework is proposed for restoring a single image from an image sequence acquired in general anisoplanatic scenarios. The 3D physical scene is assumed to be static, as is the imaging sensor, while the air between the scene and sensor is affected by atmospheric turbulence. Our approach is designed to reduce the spatial variation of PSFs over the whole image space, so that the blur can be approximately treated as spatially invariant and the latent image content can be estimated globally instead of by local deconvolution. By doing this, we can improve the estimation accuracy and, more importantly, introduce natural image statistics so that diffraction-limited blur can be effectively removed. An image with reduced PSF variation (which we call the near-diffraction-limited (NDL) image) is generated through a fusion process. To avoid noise effects in the subsequent deconvolution, temporal kernel regression is employed in our approach for denoising.

Note that this restoration framework can be viewed as an extended version of the one in our earlier paper in ICCP 2011 [21]. However, the fundamental ideas behind the two are quite different. The framework in [21] is based on a simplified least-square estimation derived from the imaging model, with several assumptions on image statistics and approximations. The key point of the present paper is reducing the variance of spatially and temporally changing PSFs through nonparametric kernel regression. Compared with [21], the idea in this paper is much cleaner and technically sounder. It also requires far fewer a priori assumptions than [21].

The paper is organized as follows: Section 2 describes the restoration framework in detail. Experimental results are given in Section 3 to show the performance as compared with other methods, and finally we conclude and discuss directions of future research in Section 4.

2 RESTORATION ALGORITHM

The proposed restoration framework contains three main steps (see the diagram in Fig. 2):

1. nonrigid image registration,
2. near-diffraction-limited image reconstruction,
3. single image blind deconvolution.
[Fig. 2: Block diagram for the proposed restoration framework.]



Given an observed sequence {G_k}, step 1 of our proposed approach registers each frame onto a fixed reference grid, generating a new sequence {R_k} without geometric deformation. This process ensures that, for a given pixel located at a position, say x, the most correlated pixels in each frame across time are aligned at this same position. This step improves the performance of the subsequent temporal filtering.

Step 2 restores a single image Z from the registered {R_k}. For each local region, the sharpest patch (convolved by a diffraction-limited PSF) can be detected from the patch sequence. Next, patch-wise temporal kernel regression is carried out to reduce the noise level. An image Z is then generated by fusing all the denoised sharp patches together. This image is still blurred by a diffraction-limited PSF, which can be approximately viewed as spatially invariant.

Finally, a single image blind deconvolution algorithm based on natural image statistics is applied to Z to remove the diffraction-limited blur and enhance image quality. Details of each step are given in the following sections.

2.1 Nonrigid Image Registration

In [9], we introduced a B-spline-based registration algorithm to estimate the motion field in each observed frame. This method can be implemented in the proposed framework to remove geometric deformation; in this section we give a review of it. Assume G represents a given observed distorted image and R denotes a reference image without turbulent deformation (which can be obtained by averaging the frame sequence [4]). A nonrigid deformation model can be employed to describe the geometric distortion between these two images. In this model, the complex motion is represented by the movement of m control points, whose initial positions x̂_i^0 = (x̂_i^0, ŷ_i^0)^T, i = 1, ..., m, are equally spaced on the reference image R. The displacement of all the control points on the given image G is denoted as the deformation vector p→ = [Δx̂_1, ..., Δx̂_m, Δŷ_1, ..., Δŷ_m]^T, where the Δx̂ and Δŷ (also called deformation parameters) are the horizontal and vertical displacements from the initial positions [22], [4]. The deformed position of any given pixel located at x = (x, y)^T in image R can then be described as

    W(x, p→) = x + A(x) p→,    (2)

where A(x) denotes the basis function matrix for x:

    A(x) = [ c_1 ... c_m  0   ...  0
              0   ...  0  c_1 ... c_m ],    (3)

and A(x) p→ is the motion vector, a linear combination of the movements of all control points. The weight (or spline basis) c_i is determined by the distance between x and x̂_i^0 using B-spline functions:

    c_i = β((x - x̂_i^0) / Δx) β((y - ŷ_i^0) / Δy),    (4)

    β(ξ) = { 2/3 - (1 - |ξ|/2) ξ^2,   if 0 <= |ξ| <= 1
             (2 - |ξ|)^3 / 6,         if 1 < |ξ| < 2
             0,                       otherwise,    (5)

where Δx and Δy are the horizontal and vertical intervals between neighboring control points. This model guarantees local smoothness of the turbulence-caused motion field.
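For illustration, here is a small Python sketch (ours, not the authors' code) of the cubic B-spline basis (5) and of evaluating the motion vector A(x) p→ of (2)-(4) at a single pixel. The 16-pixel control spacing matches the setting used in Section 3; the displacement values are random placeholders.

import numpy as np

def beta(t):
    # Cubic B-spline basis of (5).
    t = np.abs(t)
    out = np.zeros_like(t, dtype=float)
    m1 = t <= 1.0
    m2 = (t > 1.0) & (t < 2.0)
    out[m1] = 2.0 / 3.0 - (1.0 - t[m1] / 2.0) * t[m1] ** 2
    out[m2] = (2.0 - t[m2]) ** 3 / 6.0
    return out

def motion_vector(x, y, ctrl_x, ctrl_y, dpx, dpy, dx=16.0, dy=16.0):
    # Evaluate A(x) p at pixel (x, y) per (2)-(4): the displacement is a
    # B-spline-weighted combination of the control-point displacements.
    c = beta((x - ctrl_x) / dx) * beta((y - ctrl_y) / dy)  # weights c_i of (4)
    return np.array([np.sum(c * dpx), np.sum(c * dpy)])

# Control points on a 16-pixel grid, with small random displacements
# standing in for an estimated deformation vector p.
gy, gx = np.mgrid[0:128:16, 0:128:16]
ctrl_x, ctrl_y = gx.ravel().astype(float), gy.ravel().astype(float)
rng = np.random.default_rng(1)
dpx, dpy = rng.normal(0, 1, ctrl_x.size), rng.normal(0, 1, ctrl_y.size)
print(motion_vector(40.0, 40.0, ctrl_x, ctrl_y, dpx, dpy))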
In the classic B-spline-based registration approach, the deformation vector is estimated by minimizing the following cost function [22]:

    C(p→) = Σ_x |G(W(x, p→)) - R(x)|^2.    (6)

However, because the above formulation lacks a prior constraint, the resulting estimate is unstable and easily gets stuck in local minima. To improve this method, a stabilization constraint is proposed in [4], which makes the estimated deformation parameters remain small in regions that have less texture (low image gradient). In the present application, however, we know that the deformation caused by atmospheric turbulence is independent of image content; such a stabilization constraint would therefore bring unnecessary bias into the estimation.

Instead, to accurately estimate the deformation (motion) vectors from image grid R to G, we introduce a more natural symmetry constraint [23], [24] into the B-spline registration algorithm. This constraint is based on the important property that the registration should be symmetric, or inverse consistent [23]. Let p→ denote the deformation vector that transforms reference grid R into G (forward deformation) and let p← denote the inverse vector that transforms G into R (backward deformation). Then, approximately,^2 we should have p→ = -p← (see Fig. 3). Combining the two vectors into one, p^T = [p→^T, p←^T], the proposed cost function to minimize becomes

    C(p) = Σ_x |G(W(x, p→)) - R(x)|^2 + Σ_x |R(W(x, p←)) - G(x)|^2 + λ (p→ + p←)^T (p→ + p←),    (7)

where the scalar λ controls the strength of the soft symmetry constraint.

[Fig. 3: Symmetry constraint, where the forward motion vector and the backward motion vector are the inverse of each other.]

    ^2 The forward and backward motion vectors for each point in the physical world should be the exact opposite of each other. However, in the registration model the motion field is indexed by pixel locations instead of physical points. Thus, the motion vectors for the same position in the reference and the target image are not exactly the opposite of each other, since they belong to slightly different points. The same holds for the deformation vectors.
The Gauss-Newton method is used to minimize C(p), and the update of the parameter set p can be derived as

    p^{l+1} = p^l - E^{-1} b,    (8)

where

    E = [ E→ + λI      λI
           λI       E← + λI ],

    E→ = Σ_x d→(x) d→(x)^T,    E← = Σ_x d←(x) d←(x)^T,

    d→(x)^T = (∂G(W(x, p→^l)) / ∂W) A(x),
    d←(x)^T = (∂R(W(x, p←^l)) / ∂W) A(x),

and

    b^T = [ (b→ + λ(p→^l + p←^l))^T,  (b← + λ(p→^l + p←^l))^T ],

    b→ = Σ_x d→(x) (G(W(x, p→^l)) - R(x)),
    b← = Σ_x d←(x) (R(W(x, p←^l)) - G(x)).

The computational cost of calculating the matrices E→ and E← is tremendous if they are computed directly; we therefore designed a fast implementation to alleviate this problem, whose details can be found in [9]. Once the deformation vector p is estimated for each frame, a registered sequence {R_k} is generated through bilinear interpolation.
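The following Python sketch illustrates the symmetric registration idea of (6)-(8) under simplifying assumptions: the dense field A(x) p→ is approximated by cubic-spline upsampling of the control-point displacements, and instead of the hand-derived Gauss-Newton update and fast implementation of [9], a generic least-squares solver (with numeric differentiation) is applied to the stacked residuals of (7). It is meant only to make the cost structure concrete on small images.

import numpy as np
from scipy.optimize import least_squares
from scipy.ndimage import map_coordinates, zoom

def dense_field(p, grid_shape, img_shape):
    # Upsample control-point displacements to a smooth dense motion field
    # (cubic-spline zoom stands in for the exact A(x)p of (2)-(4); image
    # dimensions are assumed to be multiples of the control-grid size).
    gh, gw = grid_shape
    fy, fx = img_shape[0] / gh, img_shape[1] / gw
    dx = zoom(p[:gh * gw].reshape(gh, gw), (fy, fx), order=3)
    dy = zoom(p[gh * gw:].reshape(gh, gw), (fy, fx), order=3)
    return dx, dy

def warp(img, p, grid_shape):
    dx, dy = dense_field(p, grid_shape, img.shape)
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
    return map_coordinates(img, [yy + dy, xx + dx], order=1, mode='nearest')

def residuals(p, G, R, grid_shape, lam):
    # Stacked residuals of the symmetric cost (7): forward term, backward
    # term, and the soft constraint sqrt(lambda) * (p_fwd + p_bwd).
    n = p.size // 2
    p_fwd, p_bwd = p[:n], p[n:]
    r_fwd = (warp(G, p_fwd, grid_shape) - R).ravel()
    r_bwd = (warp(R, p_bwd, grid_shape) - G).ravel()
    return np.concatenate([r_fwd, r_bwd, np.sqrt(lam) * (p_fwd + p_bwd)])

def register(G, R, grid_shape=(8, 8), lam=5000.0):
    # p stacks forward and backward deformation vectors: [p_fwd; p_bwd].
    p0 = np.zeros(4 * grid_shape[0] * grid_shape[1])
    sol = least_squares(residuals, p0, args=(G, R, grid_shape, lam),
                        method='trf', max_nfev=20)
    n = sol.x.size // 2
    return warp(G, sol.x[:n], grid_shape)  # one registered frame of {R_k}

In practice R can be the temporal average of the sequence [4]; the generic solver above would be far too slow for real use, which is exactly why the analytic update (8) and its fast implementation matter.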
2.2 Near-Diffraction-Limited Image Reconstruction

[Fig. 4: Block diagram for the near-diffraction-limited image reconstruction step.]

In this section, we estimate a diffraction-limited image Z from {R_k}, which can be globally deconvolved. A concise description of the reconstruction procedure is provided in Algorithm 1 (see the diagram in Fig. 4).

To better understand this procedure, let us consider a patch-wise imaging model under the isoplanatic scenario (see Fig. 4):

    g_k = f * h_k * h + n_k
        = f * h * h_k + n_k    (9)
        = z * h_k + n_k,

where g_k can denote any local patch of size L x L extracted from frame G_k, and its corresponding latent patch is denoted by f. Differently from (1), the turbulence-caused PSF h_k is now patch-wise constant and temporally changing. We call z = f * h the diffraction-limited patch; it is convolved by the space and time invariant h, and thus can be accurately deconvolved in a larger scene [19] (such as the whole image). n_k represents local noise, which is assumed to be zero-mean and IID.

The motion field estimated from the nonrigid image registration in Section 2.1 is smooth, and the registration process can be viewed approximately as patch-wise constant translational movement (see the example given in Fig. 5).

[Fig. 5: PSF distortion after the registration step. (a) Magnitude of a motion field estimated from a real frame distorted by air turbulence (video Moon Surface in Fig. 12) using the proposed registration method. (b) A Gaussian-shaped diffraction-limited PSF before alignment. (c)-(e) PSFs after alignment using the local motion fields in squares A, B, and C of (a); since the local movement can be treated as translational, the shapes of the PSFs are preserved.]

So the relationship between an observed local patch g_k and the corresponding registered one r_k extracted from R_k can be described as

    r_k = g_k * δ_{Δx},    (10)

where δ_{Δx} represents a 2D Kronecker delta function shifted by the local registration vector Δx. Hence, (9) and (10) can be combined as follows:

    r_k = z * h_k * δ_{Δx} + n_k * δ_{Δx}
        = z * h̃_k + ñ_k    (11)
        = q_k + ñ_k.

The PSF h̃_k = h_k * δ_{Δx} is simply a shifted version of h_k, which means the registration process preserves the shape (and spectrum) of local PSFs. Similarly, the shifted noise patch ñ_k has the same statistical properties as n_k. As mentioned before, the registration process increases the correlation among pixels at the same position in different patches across time. Such correlation can be utilized for estimating the noise-free sequence {q_k}, as we describe in Section 2.2.2.

Algorithm 1. Procedure for Restoring a Near-Diffraction-Limited Image from Registered Frames
1. Given a registered sequence {R_k}, divide each frame into L x L overlapping patches centered at each pixel, and calculate the intensity variance of each patch as a local sharpness measure.
2. For a patch sequence {r_k} centered at location x, detect the sharpest one, r_{k*}, by maximizing the local sharpness measure (outliers need to be detected and excluded).
3. Set r_{k*} as a reference patch, and restore its center pixel value using temporal kernel regression. Assign this value to the pixel Z[x].
4. Go to the next pixel and return to step 2.
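Steps 1 and 2 of Algorithm 1 can be sketched in a few lines of Python (our illustration; the outlier rejection of Section 2.2.1 and the regression of Section 2.2.2 are treated in the sketches that follow). The uniform_filter trick computes the local variance of every L x L patch at once; the (L^2 - 1) normalization of (12) is dropped since it does not change the argmax.

import numpy as np
from scipy.ndimage import uniform_filter

def sharpness_maps(frames, L=9):
    # Step 1: local intensity variance (12) of the L x L patch centered at
    # every pixel, for every registered frame R_k.  var = E[x^2] - (E[x])^2.
    maps = []
    for R in frames:
        mean = uniform_filter(R, size=L)
        mean_sq = uniform_filter(R * R, size=L)
        maps.append(mean_sq - mean * mean)
    return np.stack(maps)                 # shape (K, H, W)

def sharpest_frame_index(frames, L=9):
    # Step 2: for each pixel x, the frame index k* whose local patch r_k is
    # sharpest; Z[x] is then restored from that patch (steps 3-4).
    return np.argmax(sharpness_maps(frames, L), axis=0)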
2.2.1 Diffraction-Limited Patch Detection

As mentioned in Section 1, once sufficient observations are collected, relatively sharp image patches that occasionally appear due to the turbulence variation can be found. Suppose the k*th patch is a diffraction-limited one, r_{k*} ≈ z + ñ_{k*}; then simply denoising r_{k*} provides a good estimate of the patch z.

In practice, diffraction-limited patches can be detected through various local sharpness measures. For example, in astronomical imaging the Strehl ratio is widely used, which is the ratio of the peak intensity of the aberrated PSF to that of the diffraction-limited PSF [6]. This can be easily measured if there exists a reference star inside the patch. In general scenes, however, the Strehl ratio is difficult to measure, and alternative metrics based on local image intensity are preferred [13], [25], [26].
In [26], it was shown that image intensity variance has a unique relationship with the Strehl ratio and can be used for patch selection. The intensity variance of an L x L patch r_k is defined as

    s_k = (1 / (L^2 - 1)) Σ_x (r_k[x] - r̄_k)^2,    (12)

where r̄_k represents the mean value of patch r_k. This metric is employed in this step.

Sharp patch selection by maximizing the above metric can be sensitive to outliers, such as those caused by incorrectly registered objects or by moving objects occasionally appearing in the scene (e.g., a bird flying through the field of view). Examples are illustrated in Fig. 6a, where some artifacts can be observed in flat regions (see patch A). These artifacts are caused by registration error and can be strongly magnified (see Fig. 6c) by the subsequent deblurring process (Section 2.3). In Fig. 6e, we can observe extremely high values among the sharpness measures for patch A. Patch B gives another example (see Fig. 6f) where no outlier exists.^3 Histograms of these two patches are given in Figs. 6g and 6h: the sharpness values of the outlier-free patch have a roughly symmetric distribution, while the outlier-contaminated distribution has a long tail corresponding to high values.

[Fig. 6: Outlier identification example. (a) A fused image without outlier removal, where patch A contains outlier pixels and patch B does not contain any outlier; (b) zoomed patch A from (a); (c) patch A after deblurring, where the outlier effect is magnified; (d) deblurred patch A with the outlier removal process; (e) intensity variance of patch A over 100 frames; (f) intensity variance of patch B over 100 frames; (g) histogram of (e); (h) histogram of (f).]

    ^3 Patch B contains strong structure (high SNR), and thus its sharpness measures have values much higher than those of patch A (see Figs. 6e and 6f). This also explains why the registration accuracy is lower in patch A than in patch B.
To alleviate the outlier problem, Hampel's outlier identification method is employed [27]. We use a Gaussian model N(μ, σ) to approximate the distribution of outlier-free samples (which are sharpness measurements in our case). A given sample s_k is identified as an outlier if

    |s_k - μ| > σ %_{1-α/2},    (13)

where %_x is the x quantile of the N(0, 1) distribution and α denotes the confidence level. Hampel suggested using the median and the median absolute deviation (MAD) [28] to robustly estimate the data mean μ and the standard deviation σ, respectively. In our experiments, we set %_{1-α/2} = 6 to strongly suppress the type I error: any sample s_k lying more than 6σ̂ above the estimated mean μ̂ is detected as an outlier and is ignored when maximizing the local sharpness metric to select the diffraction-limited patch. In Fig. 6d we can see that, with outlier removal, the artifacts caused by incorrect registration are successfully eliminated from the fused image.
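A compact Python version of this rule (our sketch): the mean and standard deviation in (13) are replaced by the median and the scaled MAD, and samples lying far above the robust mean are discarded before the sharpness maximization.

import numpy as np

def hampel_outliers(s, n_sigma=6.0):
    # Flag sharpness samples s_k per (13); 1.4826 * MAD estimates the
    # standard deviation of Gaussian data.  n_sigma = 6 as in the paper.
    s = np.asarray(s, dtype=float)
    mu = np.median(s)
    sigma = 1.4826 * np.median(np.abs(s - mu))
    return (s - mu) > n_sigma * sigma     # one-sided: only the high tail

def sharpest_valid(s):
    # Maximize the sharpness metric over the non-outlier frames only.
    s = np.asarray(s, dtype=float)
    keep = np.flatnonzero(~hampel_outliers(s))
    return keep[np.argmax(s[keep])]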
2.2.2 Patch-Wise Temporal Kernel Regression

To avoid possible artifacts in the subsequent deconvolution step, the noise in the selected diffraction-limited patches needs to be suppressed. We formulate this denoising problem under a general temporal regression framework, where the imaging model is

    r_k = q_k + ñ_k.    (14)

The value of a pixel at x in the lth frame, q_l[x], can be estimated through zeroth-order kernel regression [29], [30]:

    q̂_l[x] = arg min_{q_l[x]} Σ_k (r_k[x] - q_l[x])^2 U(x; k, l),    (15)

where the weight U(x; k, l) is a positive function measuring the "similarity" between q_l[x] and q_k[x]. One simple but effective way of defining U(·) is based on the patch-wise photometric distance and a Gaussian kernel function:

    U(x; k, l) = exp( -||q_k - q_l||^2 / (L^2 η^2) ),    (16)

where L^2 is the total number of pixels in the patch and the scalar η is called the smoothing parameter [29].
The noise-free photometric distance (1/L^2)||q_k - q_l||^2 can be unbiasedly estimated using the following function [31]:

    ||q_k - q_l||^2 / L^2 ≈ ||r_k - r_l||^2 / L^2 - 2σ_n^2,    (17)

where the noise variance σ_n^2 can be estimated using, for example, the MAD method [28].

The solution to (15) is

    q̂_l[x] = Σ_k U(x; k, l) r_k[x] / Σ_k U(x; k, l),    (18)

which is nothing but a filter generated by normalizing the kernel U(·). According to (16), the kernel value within a patch is independent of the pixel position and is space-invariant, which means the spatial constancy of the PSF in the estimated patch q̂_l is preserved after the regression.

Of course, in this step we only estimate the center pixel of the selected k*th patch, which is assigned to the output image Z according to the reconstruction procedure given in Algorithm 1. One important issue is that in the proposed regression all the weights are positive, which means that when restoring the diffraction-limited value, the output is always more blurry than the observed r_{k*}[x]. That is why we call the restored image near-diffraction-limited.
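Putting (15)-(18) together for one patch sequence, a minimal Python sketch (ours; η and σ_n would be tuned or estimated in practice, e.g., σ_n by MAD [28]):

import numpy as np

def regress_center_pixel(patches, k_star, eta=10.0, sigma_n=1.0):
    # Zeroth-order temporal kernel regression for the center pixel of the
    # selected patch r_{k*}: Gaussian weights (16) on the patch-wise
    # photometric distance, debiased for noise via (17), normalized per (18).
    patches = np.asarray(patches, dtype=float)    # shape (K, L, L)
    K, L, _ = patches.shape
    ref = patches[k_star]
    d2 = ((patches - ref) ** 2).reshape(K, -1).mean(axis=1) - 2.0 * sigma_n ** 2
    d2 = np.maximum(d2, 0.0)                      # clip negative estimates
    w = np.exp(-d2 / eta ** 2)                    # weights U(x; k, k*)
    center = patches[:, L // 2, L // 2]
    return float(np.sum(w * center) / np.sum(w))  # restored value for Z[x]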
2.3 Single Image Deconvolution

Finally, a single image deblurring algorithm is required as a postprocess to deconvolve the near-diffraction-limited image Z. The degradation model is

    Z = F * h + ε,    (19)

where ε represents the error introduced by the process that generates the estimate of Z. Such a blind deconvolution algorithm can be described generally as

    <F̂, ĥ> = arg min_{F,h} ||Z - h * F||^2 + λ_1 R_f(F) + λ_2 R_h(h),    (20)

where R_f and R_h are regularization terms based on prior knowledge about the latent sharp image F and the PSF h.

Recent research on natural image statistics has shown that image gradients obey heavy-tailed distributions that have most of their mass on small values but give significantly more probability to large values than Gaussian distributions [32]. Based on these studies, several sparse regularization methods have been introduced and have achieved great success in solving the blind deconvolution problem [32], [33], [20]. One example is the method proposed by Shan et al. [20], which is directly employed in this step to calculate the final output. This method uses basically the same estimation form as (20), with the following sparse regularization term:

    R_f(F) = || Φ(F_x) + Φ(F_y) ||_1,    (21)

where F_x and F_y denote the derivatives of F in the horizontal and vertical directions, respectively, and

    Φ(ξ) = { κ_1 |ξ|,          if |ξ| <= ξ_lt
             κ_2 ξ^2 + κ_3,    if |ξ| > ξ_lt.    (22)

Here ξ_lt, κ_1, κ_2, and κ_3 are all fixed parameters [20]. Sparsity is also utilized in regularizing h [20]:

    R_h(h) = ||h||_1.    (23)

The cost function (20) is optimized by alternating the estimation of F and h [33], [20]. In what follows, we used the default parameter settings described on the authors' project page^4 for [20], except the noise level parameter "noiseStr," which is chosen in the range [0.01, 0.05] according to the actual noise level observed in the given data. We refer interested readers to [20] for details.

    ^4 http://www.cse.cuhk.edu.hk/~leojia/projects/motion_deblurring/index.html
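The full minimization of (20) alternates updates of F and h [33], [20] and is beyond a short example, but the objective itself is easy to write down. The sketch below (ours) evaluates (20) with the regularizers (21)-(23); the parameter values are placeholders, not the settings of [20], and the two gradient penalties are summed separately since F_x and F_y have different shapes here.

import numpy as np
from scipy.signal import fftconvolve

def phi(v, k1=1.0, k2=10.0, k3=-9.0, xi_lt=1.0):
    # Piecewise sparse penalty (22): linear near zero, quadratic beyond
    # xi_lt (these placeholder values keep the two pieces continuous at 1).
    a = np.abs(v)
    return np.where(a <= xi_lt, k1 * a, k2 * v ** 2 + k3)

def objective(Z, F, h, lam1=0.01, lam2=0.01):
    # Blind-deconvolution cost (20): data term plus the sparse gradient
    # prior (21) on F and the L1 prior (23) on h.
    data = np.sum((Z - fftconvolve(F, h, mode='same')) ** 2)
    Fx = np.diff(F, axis=1)               # horizontal derivative
    Fy = np.diff(F, axis=0)               # vertical derivative
    Rf = np.sum(phi(Fx)) + np.sum(phi(Fy))
    Rh = np.sum(np.abs(h))
    return data + lam1 * Rf + lam2 * Rh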
3 EXPERIMENTS

In this section, we illustrate the performance of the proposed approach using both simulated and real image sequences. Throughout all the experiments, the intervals of the control points in the registration step are set to Δx = Δy = 16 pixels and the symmetry constraint parameter to λ = 5,000. In implementing the restoration step of Section 2.2, we set the patch size to L = 9. Results of the lucky-region algorithm from [17], the multiframe reconstruction approach from [9], and the EFF approach from [10] are also shown for comparison.^5

3.1 Simulated Experiments

To quantitatively evaluate the algorithm's performance, a set of image sequences with different degrees of turbulence and noise was generated. The latent sharp image^6 (300 x 300) is shown in Fig. 7.

[Fig. 7: Latent sharp image used for simulation.]

The sequences are produced using the imaging model described in (1) and (9), which requires motion fields, spatially variant PSFs, and a spatially invariant diffraction-limited PSF. To produce the motion fields, we first randomly generate a set of deformation vectors from a Gaussian distribution, and then calculate a motion field for each frame through the B-spline model (2)-(5). The turbulence strength is determined by the variance of the Gaussian distribution. The spatially variant PSFs are also simulated using a Gaussian function, where the variance of a local PSF is proportional to the magnitude of the corresponding local motion vector. The diffraction-limited PSF is generated using a disc function. Three degrees of turbulence (weak, medium, and strong) are produced (see Figs. 8a, 8b, and 8c), and different levels of white Gaussian noise (with variance σ_n^2 = 1, 9, 25) are added to the sequences. Each sequence contains 100 frames.

    ^5 In this section, the outputs of method [9] are generated using the original code. The EFF outputs are directly from the authors of [10]. The outputs of method [17] are produced by our own implementation.
    ^6 The pixel intensity range here is [0, 255].
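The generation procedure just described can be sketched as follows (our illustration; the exact parameter values used in the paper are not specified, so those below are arbitrary, and the spatially variant Gaussian blur is approximated by blending a small stack of uniformly blurred copies).

import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, zoom
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)

def disc_psf(radius=2):
    # Spatially invariant diffraction-limited PSF: a normalized disc.
    r = np.hypot(*np.mgrid[-radius:radius + 1, -radius:radius + 1])
    h = (r <= radius).astype(float)
    return h / h.sum()

def simulate_frame(F, turb_std=2.0, grid=16, noise_std=1.0):
    H, W = F.shape
    gh, gw = H // grid, W // grid
    # Gaussian-distributed deformation vectors on the control grid, smoothly
    # interpolated to a dense motion field (standing in for (2)-(5)).
    dx = zoom(rng.normal(0.0, turb_std, (gh, gw)), (H / gh, W / gw), order=3)
    dy = zoom(rng.normal(0.0, turb_std, (gh, gw)), (H / gh, W / gw), order=3)
    yy, xx = np.mgrid[0:H, 0:W].astype(float)
    warped = map_coordinates(F, [yy + dy, xx + dx], order=1, mode='reflect')
    # Spatially variant Gaussian blur, with sigma tied to the local motion
    # magnitude, realized by picking per pixel from pre-blurred copies.
    mag = np.hypot(dx, dy)
    sigmas = np.linspace(0.5, 0.5 + mag.max(), 4)
    stack = np.stack([gaussian_filter(warped, s) for s in sigmas])
    idx = np.clip(np.searchsorted(sigmas, 0.5 + mag), 0, len(sigmas) - 1)
    blurred = np.take_along_axis(stack, idx[None], axis=0)[0]
    # Diffraction-limited disc blur plus white Gaussian sensor noise.
    out = fftconvolve(blurred, disc_psf(), mode='same')
    return out + rng.normal(0.0, noise_std, (H, W))

# e.g., a 100-frame sequence from a 300 x 300 latent image F:
# frames = [simulate_frame(F) for _ in range(100)]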
Due to space limits, we only provide the results of three sequences, with noise variance σ_n^2 = 1, in Fig. 8. It is clear that the proposed approach significantly improves the visual quality and recovers many high-frequency details of the image content.

[Fig. 8: Simulated experiments. (a)-(c) Simulated frames with three turbulence levels (weak, medium, strong). (d)-(f) Near-diffraction-limited images from the three test sequences with noise variance σ_n^2 = 1 and turbulence level weak, medium, and strong, respectively. (g)-(i) Final outputs deblurred from (d)-(f).]

Table 1 gives the Peak Signal-to-Noise Ratio (PSNR) values for the outputs of four different restoration algorithms, along with the averaged PSNR value of each input sequence. The proposed approach outperforms the others on all test sequences in terms of PSNR. Results for the near-diffraction-limited images are also provided in this table, from which we can tell that the final blind deconvolution step generally increases the PSNR by 1-1.5 dB.

[Table 1: Performance of the restoration approaches evaluated in PSNR values (dB).]
3.2 Real Video Experiments

Several real videos are tested to illustrate the performance of the proposed restoration framework in practical situations.^7 We first show results on data taken under controlled but real conditions. Two sets of videos (see Chimney and Building in Figs. 9, 10, and 11) of size 237 x 237 were captured through hot air exhausted by a building's vent, which could be closed to take "ideal" images of the same scene [10]. Each sequence consists of 100 frames degraded by strong turbulence effects (see Figs. 9a, 9b, 9c, and 11b). In the Chimney experiment, some registration results are also given in Figs. 9d, 9e, and 9f, where we can see that geometric deformation with respect to the averaged image (Fig. 10a) has effectively been removed. From the output images in Figs. 10c, 10d, and 10f we can observe that EFF (c) and the proposed method (f) provide the best restoration results and faithfully recover details of the object. The proposed result (f) looks even sharper, while the EFF output (c) contains some halo artifacts in the edge regions. Similar restoration results can be observed in the Building experiment shown in Fig. 11. The near-diffraction-limited image generated from the Chimney sequence is also provided in Fig. 10e.

[Fig. 9: Image registration results using 100 frames taken from the video Chimney distorted by real atmospheric turbulence.]

[Fig. 10: Image reconstruction results using 100 frames taken from the video Chimney distorted by real atmospheric turbulence. Panels: (a) averaged image; (b) ground truth; (c) [10]; (d) [9]; (e) near-diffraction-limited; (f) proposed approach; (g)-(j) zoomed versions of (b), (c), (d), and (f).]

[Fig. 11: Image reconstruction results using 100 frames taken from the video Building distorted by real atmospheric turbulence.]

Additional sets of video data taken from long-distance imaging systems are processed next. The first set of images (410 x 380 x 80) shows the moon's surface taken from a ground-based telescope (see Fig. 12a). From Fig. 12b we can see that though the output image of [17] looks slightly sharper than one of the observed frames, it is still quite blurry, probably due to the diffraction-limited blur and the limited number of frames. The method in [9] provides a better result, but with some details (like small craters) vanished (Fig. 12c). The proposed method gives a significant improvement in visual quality (Fig. 12d). It successfully removed blur while recovering many small craters on the surface (Fig. 12h) that can hardly be seen in either the original frame (Fig. 12e) or the outputs of the other two methods (Figs. 12f and 12g).

The scene of the next video stream consists of a water tower located above the ground, imaged at a (horizontal) distance of 2.4 kilometers. The 300 x 220 video is quite noisy and highly blurred (due to long exposure time). Eighty frames were taken from the video to produce the result shown in Fig. 13. Again, the lucky-region method did not provide much improvement and slightly changed the object's shape (Fig. 13b). The reconstruction method in [9] slightly increased the sharpness (Fig. 13c). The output of the proposed method looks much sharper and cleaner, with many details well restored (Fig. 13d).

    ^7 Videos and experimental results are given on the webpage: http://users.soe.ucsc.edu/~xzhu/doc/turbulence.html.
[Fig. 12: Image reconstruction result using 80 frames taken from the video Moon Surface distorted by real atmospheric turbulence. Panels: (a) one observed frame; (b) [17]; (c) [9]; (d) proposed approach; (e)-(h) zoomed parts of (a)-(d); (i)-(l) further zoomed parts of (a)-(d).]

[Fig. 13: Image reconstruction result using 80 frames taken from the video Water Tower distorted by real atmospheric turbulence.]

4 DISCUSSION AND CONCLUSIONS

In this paper, we proposed a new approach for restoring a single high-quality image from an image sequence distorted by air turbulence. The proposed algorithm first registers the frames to suppress geometric deformation using B-spline-based nonrigid image registration, which incorporates a symmetry constraint to effectively improve the estimation accuracy. Next, a regression-based process is carried out to produce an image convolved with a near-diffraction-limited PSF, which can be viewed as spatially invariant. Finally, a blind deconvolution algorithm is implemented to remove the diffraction-limited blur from the fused image and generate the final output. Experiments using controlled and real data illustrate that this approach is capable of alleviating geometric deformation and space-time varying blur caused by turbulence, recovering unprecedented details of the scene, and significantly improving visual quality.

Another technique that may be related to the present work is seeing through water [34], [35]. Compared with air turbulence, in most cases the geometric warping effect induced by water is much stronger, but the blur effect is relatively milder. In the following experiments, we replace the image registration step of Section 2.1 with a robust iterative registration algorithm from a seeing-through-water approach [34]. The sequence Chimney, with severe turbulence motion, is tested. Results are given in Fig. 14, where (a) shows the direct output of method [34], which is highly blurry since it is just the temporal mean of the registered sequence. We then take the registered sequence and generate a near-diffraction-limited image (b) using the temporal regression step described in Section 2.2; this intermediate result is much sharper than (a). The final deblurred output is given in (c), which is visually very close to the proposed result in Fig. 10f, except for a mild shape change at the top part of the chimney. Similar experiments using the test video Building are illustrated in Fig. 15. The robust registration algorithm, designed for water distortions, is probably capable of correcting strong geometric deformations quite well, but it seems that the registration step of Section 2.1 is sufficient for handling most air-induced motion.

[Fig. 14: Experiments using the seeing through water method [34] on the video Chimney distorted by real atmospheric turbulence. (a) Output of method [34]. (b) Near-diffraction-limited image generated using the sequence registered by [34]. (c) Image deblurred from (b). (d) Zoomed ground truth. (e) Zoomed proposed result (same as Fig. 10j). (f) Zoomed (c).]

[Fig. 15: Experiments using the seeing through water method [34] on the video Building distorted by real atmospheric turbulence. (a) Output of method [34]. (b) Near-diffraction-limited image generated using the sequence registered by Oreifej et al. [34]. (c) Image deblurred from (b). (d) Zoomed ground truth. (e) Zoomed proposed result (same as Fig. 11i). (f) Zoomed (c).]

One limitation of the proposed framework is that the near-diffraction-limited patch reconstruction step requires the existence of a diffraction-limited isoplanatic patch for every local region. This requirement may be hard to satisfy if we do not have sufficiently many frames. Even in that case, however, it is still possible to deconvolve the diffraction-limited patch z. We can view the local imaging model

    g_k = z * h_k + n_k

as a single-input multioutput (SIMO) system [36]. Such systems have been thoroughly studied, and one group of algorithms for restoring z is multichannel deconvolution filtering [36], [37], defined as

    ẑ = Σ_k t_k * g_k,    (24)

where t_k is a frame-wise filter of size N x N. In [12], the EFF approach attempts to estimate z and {h_k} simultaneously given {g_k}, which can be viewed as a special case of multichannel deconvolution filtering: assuming that {h_k} are exactly estimated in the EFF process, the EFF reconstruction can be interpreted as calculating each t_k as the inverse filter of h_k and averaging all the filtered frames to get the final output. In this case t_k needs a large support N in general. Giannakis and Heath showed that, given sufficient observations, it is also possible (in the absence of noise) to perfectly reconstruct z with filters {t_k} of very small support [36]. In other words, we can "mildly" deblur each frame and obtain a perfectly restored image through fusion. Furthermore, if the filter size N is reduced to 1, the filtering process becomes a pixel-wise fusion, which is very similar to what we presented in Section 2.2. An accurate estimation of z in this more general scenario requires an accurate PSF estimation procedure, which is difficult to achieve, especially in noisy cases. Further study should be carried out in this direction.
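As a final illustration (ours), the filtering-and-fusion of (24) is a one-liner once the filters {t_k} are given; obtaining good small-support filters is the hard part, e.g., via the blind identification of [36].

import numpy as np
from scipy.signal import fftconvolve

def multichannel_fuse(frames, filters):
    # (24): filter each observed frame g_k with its own small t_k and sum.
    # With 1 x 1 filters this degenerates to pixel-wise fusion (Section 2.2).
    return sum(fftconvolve(g, t, mode='same') for g, t in zip(frames, filters))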
ACKNOWLEDGMENTS

The authors would like to thank Prof. Mikhail A. Vorontsov of the Intelligent Optics Lab at the University of Maryland for allowing them to use the video data Water Tower, and Mr. Faisal A. Salem of the University of Michigan and Dr. Joseph M. Zawodny of the NASA Langley Research Center for providing them with the video Moon Surface. The authors also thank Mr. M. Hirsch and Dr. S. Harmeling of the Max Planck Institute for Biological Cybernetics for sharing with them the sequences Chimney and Building. This work was supported by US Air Force Office of Scientific Research Grant FA9550-07-1-0365 and US National Science Foundation Grant CCF-1016018.
REFERENCES

[1] W.E.K. Middleton, Vision through the Atmosphere. Univ. of Toronto Press, 1958.
[2] M.C. Roggemann and B.M. Welsh, Imaging through Turbulence. CRC Press, 1996.
[3] D. Li, R.M. Mersereau, and S. Simske, "Atmospheric Turbulence-Degraded Image Restoration Using Principal Components Analysis," IEEE Geoscience and Remote Sensing Letters, vol. 4, no. 3, pp. 340-344, July 2007.
[4] M. Shimizu, S. Yoshimura, M. Tanaka, and M. Okutomi, "Super-Resolution from Image Sequence under Influence of Hot-Air Optical Turbulence," Proc. IEEE Conf. Computer Vision and Pattern Recognition, June 2008.
[5] R.N. Tubbs, "Lucky Exposures: Diffraction Limited Astronomical Imaging through the Atmosphere," PhD thesis, Cambridge Univ., Sept. 2003.
[6] R.K. Tyson, Principles of Adaptive Optics. Academic Press, 1998.
[7] M.A. Vorontsov, "Parallel Image Processing Based on an Evolution Equation with Anisotropic Gain: Integrated Optoelectronic Architectures," J. Optical Soc. Am. A, vol. 16, pp. 1623-1637, 1999.
[8] M.A. Vorontsov and G.W. Carhart, "Anisoplanatic Imaging through Turbulent Media: Image Recovery by Local Information Fusion from a Set of Short-Exposure Images," J. Optical Soc. Am. A, vol. 18, no. 6, pp. 1312-1324, June 2001.
[9] X. Zhu and P. Milanfar, "Image Reconstruction from Videos Distorted by Atmospheric Turbulence," Proc. SPIE Electronic Imaging Conf. Visual Information Processing and Comm., Jan. 2010.
[10] M. Hirsch, S. Sra, B. Schölkopf, and S. Harmeling, "Efficient Filter Flow for Space-Variant Multiframe Blind Deconvolution," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 607-614, June 2010.
[11] N.M. Law, "Lucky Imaging: Diffraction-Limited Astronomy from the Ground in the Visible," PhD thesis, Cambridge Univ., May 2003.
[12] S. Harmeling, M. Hirsch, S. Sra, and B. Schölkopf, "Online Blind Deconvolution for Astronomical Imaging," Proc. IEEE Int'l Conf. Computational Photography, Apr. 2009.
[13] M.C. Roggemann, C.A. Stoudt, and B.M. Welsh, "Image-Spectrum Signal-to-Noise-Ratio Improvements by Statistical Frame Selection for Adaptive-Optics Imaging through Atmospheric Turbulence," Optical Eng., vol. 33, no. 10, pp. 3254-3264, Oct. 1994.
[14] D.L. Fried, "Probability of Getting a Lucky Short-Exposure Image through Turbulence," J. Optical Soc. Am., vol. 68, pp. 1651-1658, 1978.
[15] M. Aubailly, M.A. Vorontsov, G.W. Carhart, and M.T. Valley, "Image Enhancement by Local Information Fusion with Pre-Processing and Composed Metric," Proc. SPIE, vol. 7090, 2008.
[16] S. John and M.A. Vorontsov, "Multiframe Selective Information Fusion from Robust Error Estimation Theory," IEEE Trans. Image Processing, vol. 14, no. 5, pp. 577-584, May 2005.
[17] M. Aubailly, M.A. Vorontsov, G.W. Carhart, and M.T. Valley, "Automated Video Enhancement from a Stream of Atmospherically-Distorted Images: The Lucky-Region Fusion Approach," Proc. SPIE, vol. 7463, 2009.
[18] N. Joshi and M. Cohen, "Seeing Mt. Rainier: Lucky Imaging for Multi-Image Denoising, Sharpening, and Haze Removal," Proc. IEEE Int'l Conf. Computational Photography, Mar. 2010.
[19] A. Levin, Y. Weiss, F. Durand, and W.T. Freeman, "Understanding and Evaluating Blind Deconvolution Algorithms," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2009.
[20] Q. Shan, J. Jia, and A. Agarwala, "High-Quality Motion Deblurring from a Single Image," ACM Trans. Graphics, vol. 27, article 73, 2008.
[21] X. Zhu and P. Milanfar, "Stabilizing and Deblurring Atmospheric Turbulence," Proc. IEEE Int'l Conf. Computational Photography, Apr. 2011.
[22] R. Szeliski and J. Coughlan, "Spline-Based Image Registration," Int'l J. Computer Vision, vol. 22, no. 93, pp. 199-218, 1997.
[23] S. Farsiu, M. Elad, and P. Milanfar, "Constrained, Globally Optimal, Multi-Frame Motion Estimation," Proc. IEEE/SP 13th Workshop Statistical Signal Processing, pp. 1396-1401, July 2005.
[24] M.F. Beg and A. Khan, "Symmetric Data Attachment Terms for Large Deformation Image Registration," IEEE Trans. Medical Imaging, vol. 26, no. 9, pp. 1179-1189, Sept. 2007.
[25] R.F. Dantowitz, S.W. Teare, and M.J. Kozubal, "Ground-Based High-Resolution Imaging of Mercury," The Astronomical J., vol. 119, pp. 2455-2457, May 2000.
[26] J.C. Christou, K.J. Mighell, and R.B. Makidon, "Strehl Ratio and Image Sharpness for Adaptive Optics," Proc. SPIE Advances in Adaptive Optics II, vol. 6272, p. 62721Y, 2006.
[27] L. Davies and U. Gather, "The Identification of Multiple Outliers," J. Am. Statistical Assoc., vol. 88, pp. 782-792, Sept. 1993.
[28] F.R. Hampel, "The Influence Curve and Its Role in Robust Estimation," J. Am. Statistical Assoc., vol. 69, pp. 383-393, 1974.
[29] M.P. Wand and M.C. Jones, Kernel Smoothing, Monographs on Statistics and Applied Probability. Chapman and Hall, 1995.
[30] H. Takeda, S. Farsiu, and P. Milanfar, "Kernel Regression for Image Processing and Reconstruction," IEEE Trans. Image Processing, vol. 16, no. 2, pp. 349-366, Feb. 2007.
[31] A. Buades, B. Coll, and J.M. Morel, "A Review of Image Denoising Algorithms, with a New One," Multiscale Modeling and Simulation (SIAM Interdisciplinary J.), vol. 4, no. 2, pp. 490-530, 2005.
[32] R. Fergus, B. Singh, A. Hertzmann, S.T. Roweis, and W.T. Freeman, "Removing Camera Shake from a Single Image," Proc. ACM SIGGRAPH, 2006.
[33] A. Levin, R. Fergus, F. Durand, and W.T. Freeman, "Image and Depth from a Conventional Camera with a Coded Aperture," ACM Trans. Graphics, vol. 26, article 70, 2007.
[34] O. Oreifej, G. Shu, T. Pace, and M. Shah, "A Two-Stage Reconstruction Approach for Seeing through Water Clearly," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2011.
[35] Y. Tian and S. Narasimhan, "Seeing through Water: Image Restoration Using Model-Based Tracking," Proc. 12th IEEE Int'l Conf. Computer Vision, 2009.
[36] G.B. Giannakis and R.W. Heath, "Blind Identification of Multichannel FIR Blurs and Perfect Image Restoration," IEEE Trans. Image Processing, vol. 9, no. 11, pp. 1877-1896, Nov. 2000.
[37] W. Souidene, K. Abed-Meraim, and A. Beghdadi, "A New Look to Multichannel Blind Image Deconvolution," IEEE Trans. Image Processing, vol. 18, no. 7, pp. 1487-1500, July 2009.

Xiang Zhu received the BS and MS degrees in electrical engineering from Nanjing University, Nanjing, China, in 2005 and 2008, respectively. He is currently working toward the PhD degree in electrical engineering at the University of California, Santa Cruz. His research interests are in the domain of image processing (denoising, deblurring, super-resolution, and image quality assessment) and computer vision. He is a student member of the IEEE.

Peyman Milanfar received the BS degree in electrical engineering and mathematics from the University of California, Berkeley, and the MS and PhD degrees in electrical engineering from the Massachusetts Institute of Technology. Until 1999, he was a senior research engineer at SRI International, Menlo Park, California. He is a professor of electrical engineering and an associate dean for research in the Baskin School of Engineering at the University of California, Santa Cruz. From 1998 to 2000, he was a consulting assistant professor of computer science at Stanford University, where he was also a visiting associate professor in 2002. His research interests include statistical signal, image, and video processing, and computational photography and vision. He won a US National Science Foundation (NSF) CAREER award in 2000, and the best paper award from the IEEE Signal Processing Society in 2010. He is a member of the Signal Processing Society's Awards Board, and the Image, Video and Multidimensional Signal Processing technical committee. He is a fellow of the IEEE.
