Weighted Schatten P-Norm Minimization For Image Denoising and Background Subtraction
Abstract— Low rank matrix approximation (LRMA), which aims to recover the underlying low rank matrix from its degraded observation, has a wide range of applications in computer vision. The latest LRMA methods resort to using the nuclear norm minimization (NNM) as a convex relaxation of the nonconvex rank minimization. However, NNM tends to over-shrink the rank components and treats the different rank components equally, limiting its flexibility in practical applications. We propose a more flexible model, namely, the weighted Schatten p-norm minimization (WSNM), to generalize the NNM to the Schatten p-norm minimization with weights assigned to different singular values. The proposed WSNM not only gives better approximation to the original low-rank assumption, but also considers the importance of different rank components. We analyze the solution of WSNM and prove that, under certain weights permutation, WSNM can be equivalently transformed into independent non-convex l_p-norm subproblems, whose global optimum can be efficiently solved by the generalized iterated shrinkage algorithm. We apply WSNM to typical low-level vision problems, e.g., image denoising and background subtraction. Extensive experimental results show, both qualitatively and quantitatively, that the proposed WSNM can more effectively remove noise and model complex and dynamic scenes compared with state-of-the-art methods.

Index Terms— Low rank, weighted Schatten p-norm, low-level vision.

… namely low rank matrix approximation (LRMA). For example, a video clip captured by a static camera satisfies the "low rank + sparse" structure, so that background modeling can be conducted via LRMA [1], [2]. Also, occluded or corrupted faces can be recovered by exploiting the low rank nature of the matrix constructed from human facial images [3]–[5]. The success of recent image restoration methods [6]–[9] partly stems from nonlocal self-similarities in natural images, which can also be considered as a low rank prior. Thanks to current convex/non-convex optimization techniques, a large number of modified models and improved algorithms have been proposed for LRMA [1], [10]–[12], [28], [30], [31].

Generally speaking, LRMA can be achieved by factorization based models [13]–[18] and regularization based models. We focus on the latter category in this work. One of the most representative low rank regularizers is the nuclear norm, which is defined as the sum of the singular values of a given matrix X ∈ R^{m×n}, i.e., ‖X‖_* = Σ_i |σ_i(X)|_1. According to [19], the nuclear norm is the tightest convex relaxation of the original rank minimization problem. Given a matrix Y, the aim of nuclear norm minimization (NNM) is to find a low rank matrix X which satisfies the following objective function:
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR. Downloaded on April 18,2020 at 05:03:57 UTC from IEEE Xplore. Restrictions apply.
XIE et al.: WEIGHTED SCHATTEN p-NORM MINIMIZATION FOR IMAGE DENOISING AND BACKGROUND SUBTRACTION 4843
also show that the Schatten p-norm based model outperforms the standard NNM. However, most of the Schatten p-norm based models treat all singular values equally, and they are not flexible enough to deal with many real problems where different rank components have different importance.

Another way to improve the performance of low rank approximation is to treat each rank component differently, rather than treating the singular values equally as in NNM. The truncated nuclear norm regularization (TNNR) [24] and the partial sum minimization (PSM) [25] resort to minimizing only the smallest N − r singular values while keeping the largest r ones unchanged, where N is the number of singular values and r is the rank of the matrix. Nevertheless, the rank r is hard to estimate, and could vary with the content of the data matrix. To incorporate the prior knowledge of different singular values more reasonably, Gu et al. [22] recently proposed the weighted nuclear norm, which is defined as ‖X‖_{w,*} = Σ_i |w_i σ_i(X)|_1, where w = [w_1, …, w_n] and w_i ≥ 0 is a non-negative weight assigned to σ_i(X). The weighted nuclear norm is flexible enough to deal with many real problems, e.g., image restoration, in which larger singular values need to be penalized less than the smaller ones so that the major data components can be preserved. Compared with traditional NNM, the so-called weighted nuclear norm minimization (WNNM) scheme assigns different weights to different singular values such that the values of the soft thresholds become more reasonable.

Inspired by the Schatten p-norm minimization [28] and WNNM [22], in this paper we propose a new low rank regularizer, namely Weighted Schatten p-Norm Minimization (WSNM), for LRMA. WSNM is flexible in dealing with different rank components, and gives a better approximation to the original LRMA problem. As can be seen later, WSNM generalizes WNNM to a unified low rank based framework, while WNNM is a special case of the proposed WSNM.

Nonetheless, introducing weights for the Schatten p-norm makes the problem much more difficult than one containing only the weighting scheme or only the Schatten p-norm, since the order of the weights plays a critical role in the optimization. Without the weighting scheme, the problem can be directly decomposed into independent l_p-norm subproblems [23]. But such a solution does not fit our case due to the auxiliary weight variables. Choosing p = 1, the optimal solution can be achieved by the widely used soft-thresholding operator when the weights satisfy a non-descending order [22]. To solve our problem, we first present the general solution of WSNM, and then show that under a certain weights permutation, WSNM can be equivalently transformed into independent non-convex l_p-norm subproblems, whose global optimum can be efficiently solved by the recently proposed generalized soft-thresholding algorithm (GST) [33]. Meanwhile, a rigorous mathematical proof of the equivalence is presented by analyzing the properties of GST. We highlight the main contributions of this paper as follows:

• We propose a new model of LRMA, i.e., WSNM, and present its general solution.
• We present an efficient optimization algorithm to solve WSNM with non-descending weights.
• We apply the proposed WSNM to image denoising and background subtraction, and achieve state-of-the-art performance, which demonstrates the great potential of WSNM in low level vision applications.

The remainder of this paper is organized as follows. In Section II, we describe our proposed WSNM model in detail and analyze the optimization scheme to solve it. In Section III, WSNM is applied to image denoising and background subtraction, respectively. The experimental results are presented in Section IV, and Section V concludes the paper.

II. WEIGHTED SCHATTEN p-NORM MINIMIZATION

A. Problem Formulation

The proposed weighted Schatten p-norm of a matrix X ∈ R^{m×n} is defined as

  ‖X‖_{w,Sp} = ( Σ_{i=1}^{min{n,m}} w_i σ_i^p )^{1/p},  (2)

where w = [w_1, …, w_{min{n,m}}] is a non-negative vector, and σ_i is the i-th singular value of X. Then the weighted Schatten p-norm of a matrix X with power p is

  ‖X‖_{w,Sp}^p = Σ_{i=1}^{min{n,m}} w_i σ_i^p = tr(WΔ^p),  (3)

where both W and Δ are diagonal matrices whose diagonal entries are composed of w_i and σ_i, respectively.

Given a matrix Y, our proposed LRMA model aims to find a matrix X which is as close to Y as possible under the F-norm data fidelity and the weighted Schatten p-norm regularization:

  X̂ = arg min_X ‖X − Y‖_F^2 + λ‖X‖_{w,Sp}^p,  (4)

where λ is a tradeoff parameter to balance the data fidelity and the regularization. The solution of the above weighted Schatten p-norm minimization (WSNM) problem is discussed in detail in the next subsection. Note that WNNM [22] is a special case of WSNM when the power p is set to 1.

B. General Solution of WSNM

As discussed in [22], the convexity of the optimization problem cannot be preserved because of the weights added to NNM. Furthermore, the nonconvex relaxation brought by the Schatten p-norm makes the above problem even more difficult to optimize. We first give the following theorem and lemma before analyzing the optimization of WSNM:

Theorem 1 (Von Neumann [32]): For any m × n matrices A and B, let σ(A) = [σ_1(A), …, σ_r(A)]^T and σ(B) = [σ_1(B), …, σ_r(B)]^T, where r = min(m, n), be the singular values of A and B, respectively. Then tr(A^T B) ≤ tr(σ(A)^T σ(B)). Equality occurs if and only if it is possible to find unitaries U and V that simultaneously singular value decompose A and B, in the sense that

  A = U Σ_A V^T, and B = U Σ_B V^T,  (5)

where Σ_A and Σ_B denote ordered singular value matrices with the singular values σ(A) and σ(B) along the diagonal in the same order, respectively.
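Von Neumann's trace inequality (Theorem 1), on which the solution of WSNM rests, is easy to sanity-check numerically; the following sketch (our illustration) also constructs the equality case of Eq. (5) by reusing the singular vectors of A:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))
B = rng.standard_normal((4, 6))

# np.linalg.svd returns singular values in non-ascending order,
# matching the ordering assumed in Theorem 1.
sa = np.linalg.svd(A, compute_uv=False)
sb = np.linalg.svd(B, compute_uv=False)

# von Neumann: tr(A^T B) <= sum_i sigma_i(A) * sigma_i(B)
assert np.trace(A.T @ B) <= sa @ sb + 1e-10

# Equality case of Eq. (5): give B the same singular vectors as A,
# with its singular values in the same (non-ascending) order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
d = np.array([3.0, 2.0, 1.0, 0.5])     # chosen non-ascending
B_eq = U @ np.diag(d) @ Vt
assert np.isclose(np.trace(A.T @ B_eq), s @ d)
```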
4844 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 25, NO. 10, OCTOBER 2016
Fig. 1. Plots of the function f_i(δ) with different orders of w_i. In each subfigure, points #1, #2 and #3 denote the global optima of f_1(δ), f_2(δ) and f_3(δ), respectively. (a) σ_1 = 2.3, σ_2 = 2.25, σ_3 = 2.0, and w_1 = 1.85, w_2 = 1.48, w_3 = 1.45, #3 ≤ #1 ≤ #2 (compare the horizontal coordinates). (b) σ_1 = 2.3, σ_2 = 2.25, σ_3 = 2.1, and w_1 = 1.8, w_2 = 1.4, w_3 = 1.5, #3 ≤ #1 ≤ #2. (c) σ_1 = 2.3, σ_2 = 2.2, σ_3 = 2.1, and w_1 = 1.3, w_2 = 1.45, w_3 = 1.6, #3 ≤ #2 ≤ #1.
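The behavior shown in Fig. 1(a) can be reproduced by brute-force minimizing f_i(δ) = (δ − σ_i)² + w_i δ^p over a grid. The caption does not state the power used in the plots, so p = 0.7 below is our assumption; the qualitative ordering of the minimizers is what the sketch checks:

```python
import numpy as np

def argmin_f(sigma, w, p, grid=np.linspace(0.0, 5.0, 500001)):
    """Grid minimizer of f(d) = (d - sigma)^2 + w * d^p over d >= 0."""
    vals = (grid - sigma) ** 2 + w * grid ** p
    return grid[np.argmin(vals)]

p = 0.7  # assumed; Fig. 1 does not specify p
# Fig. 1(a): sigma_1 > sigma_2 but w_1 > w_2, so the heavier weight
# pushes minimizer #1 below minimizer #2, giving #3 <= #1 <= #2.
d1 = argmin_f(2.3, 1.85, p)
d2 = argmin_f(2.25, 1.48, p)
d3 = argmin_f(2.0, 1.45, p)
assert d3 <= d1 <= d2
```

This is why the ordering of the weights, not only the ordering of the singular values, decides whether the order constraint in (6b) is automatically satisfied.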
Lemma 1: Let the SVD of Y be Y = UΣV^T with Σ = diag(σ_1, …, σ_r). Suppose that all the singular values are in non-ascending order; then the optimal solution of (4) will be X = UΔV^T with Δ = diag(δ_1, …, δ_r), where δ_i is given by solving the problem below:

  min_{δ_1,…,δ_r} Σ_{i=1}^r (δ_i − σ_i)^2 + w_i δ_i^p,  i = 1, …, r  (6a)
  s.t. δ_i ≥ 0, and δ_i ≥ δ_j, for i ≤ j.  (6b)

Proof: The proof can be found in the appendix.

Solving problem (6) is still non-trivial because of the non-convexity and non-smoothness of the objective function with the additional order constraint (i.e., δ_i ≥ δ_j, i ≤ j). Intuitively, if the order constraint in (6b) can be discarded, problem (6a) can consequently be decoupled into r independent subproblems:

  min_{δ_i ≥ 0} f_i(δ) = (δ_i − σ_i)^2 + w_i δ_i^p,  i = 1, …, r,  (7)

Algorithm 1 Generalized Soft-Thresholding (GST) [33]

… an implicit function w.r.t. variables σ and w. By analyzing the properties of S_p^GST(σ; w), we can achieve an efficient solution for WSNM under a certain weights permutation, which will be discussed in the next subsection.
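The pseudocode box of Algorithm 1 was lost in extraction. A sketch of the GST iteration from [33], for the scalar problem in the form used in the appendix (Eq. (23), i.e., min_{δ≥0} ½(σ − δ)² + wδ^p), is given below; variable names are ours, and with p = 1 it reduces to plain soft-thresholding:

```python
import numpy as np

def gst(sigma, w, p, iters=50):
    """Generalized soft-thresholding (sketch of Algorithm 1 from [33]):
    solves min_{d >= 0} 0.5*(d - sigma)^2 + w*d^p for scalar sigma >= 0."""
    # Below this threshold the global minimum is exactly zero.
    tau = (2.0 * w * (1.0 - p)) ** (1.0 / (2.0 - p)) \
          + w * p * (2.0 * w * (1.0 - p)) ** ((p - 1.0) / (2.0 - p))
    if sigma <= tau:
        return 0.0
    d = sigma  # start at sigma and iterate the stationarity condition (24)
    for _ in range(iters):
        d = sigma - w * p * d ** (p - 1.0)
    return d
```

For example, gst(3.0, 1.0, 1.0) returns 2.0, the usual soft-threshold; for p < 1 the fixed point satisfies d − σ + w·p·d^(p−1) = 0, which is exactly Eq. (24) in the appendix.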
Algorithm 4 WSNM-RPCA

where Z is the Lagrange multiplier, μ is a positive scalar, and the weights are set as:

  w_i = C√(mn) / (σ_i(Y) + ε).  (18)

Minimizing Eq. (17) directly is still rather challenging. According to the recently developed alternating direction minimization technique [35], the Lagrangian function can be solved by minimizing each variable alternately while fixing the other variables. The optimization procedure is described in Algorithm 4.

Here, we briefly analyze the convergence of the proposed algorithm for WSNM-RPCA. Since the weighted Schatten p-norm is not convex and has no general form of subgradient, the convergence analysis of Algorithm 4 is much more difficult. The following theorem gives the convergence analysis:

Theorem 3: If the weights satisfy 0 ≤ w_1 ≤ w_2 ≤ … ≤ w_r, the sequences E_k and X_k obtained by Algorithm 4 (WSNM-RPCA) have at least one accumulation point, and the point sequence satisfies:

  lim_{k→∞} ‖E_{k+1} − E_k‖_F^2 + ‖X_{k+1} − X_k‖_F^2 = 0.  (19)

Moreover, the accumulation points form a compact continuum C.

Proof: The proof can be found in the appendix.

Although Theorem 3 only ensures that the change of the variables in consecutive iterations tends to zero, there is no guarantee that the sequence will converge to a local minimum. However, in our experiments the proposed method converges fast, which confirms the effectiveness of the proposed optimization.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

A. Image Denoising

First, we test the performance of the proposed WSNM in image denoising, and compare it with six representative algorithms: block-matching 3D filtering [6] (BM3D), patch-based near-optimal image denoising [37] (PBNO), spatially adaptive
TABLE I
DENOISING RESULTS (PSNR) BY DIFFERENT METHODS

iterative singular-value thresholding [7] (SAIST), expected patch log likelihood for image denoising [38] (EPLL), global image denoising [39] (GID), and weighted nuclear norm minimization [22] (WNNM). The denoising results of all methods are generated from the source codes or executables provided by their authors, and we keep the parameter settings
TABLE II
DENOISING RESULTS (SSIM) BY DIFFERENT METHODS

mentioned in the original papers for all the test images. The code and data of the proposed method are available on the website https://sites.google.com/site/yuanxiehomepage/.

Several parameters need to be set in the proposed algorithm. According to the analysis of the power p (discussed in Section IV-A.1), we choose p = {1.0, 0.85, 0.75, 0.7, 0.1, 0.05}
Fig. 5. Denoising results on image Lolly by different methods (noise level σ_n = 50). (a) Ground Truth. (b) Noisy Image. (c) BM3D, PSNR = 28.94dB. (d) PBNO, PSNR = 28.72dB. (e) EPLL, PSNR = 28.52dB. (f) SAIST, PSNR = 28.82dB. (g) WNNM, PSNR = 28.95dB. (h) WSNM, PSNR = 29.00dB. The figure is better viewed in zoomed PDF.

Fig. 6. Denoising results on image Boat by different methods (noise level σ_n = 50). (a) Ground Truth. (b) Noisy Image. (c) BM3D, PSNR = 26.78dB. (d) PBNO, PSNR = 26.67dB. (e) EPLL, PSNR = 26.65dB. (f) SAIST, PSNR = 26.63dB. (g) WNNM, PSNR = 26.97dB. (h) WSNM, PSNR = 27.01dB. The figure is better viewed in zoomed PDF.

for σ_n = 20, 30, 50, 60, 75, and 100 in the proposed WSNM. The other parameter settings are the same as in WNNM. All experiments are implemented in Matlab on a PC with a 3.5GHz CPU and 16GB RAM.

1) Advantages of the Weighted Schatten p-Norm: This subsection illustrates the advantages of the proposed weighted Schatten p-norm. Here, we use a test to point out that the weighted nuclear norm suffers from a problem: the obtained singular values can be over-shrunk, which leads to solution deviation. In Fig. 2, we use both WSNM and WNNM to perform low rank approximation on the two patches (marked by the red and green boxes) randomly cropped from the noisy image (Fig. 2(b)). Let {δ_i} be the singular values of the matrix of similar patches in the clean image Fig. 2(a), and {δ_i^(p)} be the singular values of the optimal solution X*_p of model (11). We show the solution {δ_i^(p)} in Fig. 2(c) and Fig. 2(d) for patches #1 and #2, respectively. From Fig. 2(c), we can see that {δ_i^(p=1)} (denoted by the cyan line) deviates far from {δ_i} (denoted by the magenta line), meaning that the over-shrinkage is serious. As p decreases, more high rank components of {δ_i^(p)} become zeros, while the low rank components of {δ_i^(p)} get closer to {δ_i}.
Fig. 7. Denoising results on image Parrot by different methods (noise level σ_n = 50). (a) Ground Truth. (b) Noisy Image. (c) BM3D, PSNR = 25.89dB. (d) PBNO, PSNR = 25.37dB. (e) EPLL, PSNR = 25.83dB. (f) SAIST, PSNR = 25.76dB. (g) WNNM, PSNR = 26.00dB. (h) WSNM, PSNR = 26.10dB. The figure is better viewed in zoomed PDF.

Fig. 8. Denoising results on image Monarch by different methods (noise level σ_n = 100). (a) Ground Truth. (b) Noisy Image. (c) BM3D, PSNR = 22.51dB. (d) PBNO, PSNR = 22.19dB. (e) EPLL, PSNR = 22.23dB. (f) SAIST, PSNR = 22.63dB. (g) WNNM, PSNR = 22.95dB. (h) WSNM, PSNR = 23.00dB. The figure is better viewed in zoomed PDF.

In addition, it is necessary for us to analyze the suitable setting of the power p for each noise level σ_n. So, we randomly select 40 images from the Berkeley Segmentation Dataset [41], add noise to them, and test the proposed WSNM with different power p under different noise levels. In each subfigure of Fig. 3, the horizontal coordinate denotes the values
Fig. 9. Denoising results on image House by different methods (noise level σ_n = 100). (a) Ground Truth. (b) Noisy Image. (c) BM3D, PSNR = 25.87dB. (d) PBNO, PSNR = 25.42dB. (e) EPLL, PSNR = 25.19dB. (f) SAIST, PSNR = 26.45dB. (g) WNNM, PSNR = 26.67dB. (h) WSNM, PSNR = 26.80dB. The figure is better viewed in zoomed PDF.

Fig. 10. The log-scale relative error log(‖X̂ − X‖_F^2 / ‖X‖_F^2) of NNM-RPCA, WNNM-RPCA and WSNM-RPCA with different ranks and outliers.

of the power p changing from 0.05 to 1 with interval 0.05, and the vertical coordinate represents the averaged PSNR under a certain noise level. In this test, six noise levels σ_n = {20, 30, 50, 60, 75, 100} are used. When handling low and medium noise levels (20, 30 and 50), as shown in the first three subfigures of Fig. 3, the best values of p are 1.0, 0.85 and 0.75, respectively. As the noise level becomes stronger, more rank components of the data are contaminated. Consequently, the high rank parts will be penalized heavily, which means that small values of p are preferred (0.7, 0.1 and 0.05 for noise levels 60, 75 and 100, respectively), as demonstrated in the last three subfigures of Fig. 3. To sum up, the best value of the power p is inversely proportional to the noise level, and these empirical values will be directly applied to the test images in the next subsection.

2) Comparison With State-of-the-Art Methods: In this subsection, we compare the proposed WSNM with BM3D, PBNO, SAIST, EPLL, GID and WNNM on 20 widely used test images displayed in Fig. 4. Zero mean additive white Gaussian noise (with variance σ_n = 20, 30, 50, 60, 75, 100) is added to those test images to generate the noisy observations.

The PSNR and SSIM performance of the seven competing denoising algorithms is reported in Table I and Table II, respectively (the highest PSNR and SSIM values are marked in bold). An overall impression from Table I is that the proposed WSNM achieves the highest PSNR in almost all cases. When the noise level increases from 20 to 50 and to 100, the improvement of WSNM over WNNM increases from 0.03dB to 0.1dB and to 0.12dB on average, respectively. To sum up, on average our proposed WSNM outperforms all the other competing methods at all noise levels, and the improvement becomes more significant as the noise increases.

In terms of visual quality, as shown in Fig. 5 ∼ Fig. 9, our method also outperforms the other state-of-the-art denoising algorithms. In the highlighted red window of Fig. 5, one can see that the proposed WSNM recovers the face structure well, while WNNM generates more artifacts. When we increase the noise level to 100, it can be seen in the zoom-in window of Fig. 8 that the proposed WSNM can well reconstruct the wing veins of the butterfly, while many artifacts are produced by the other methods. Similar observations can be made from Figs. 6, 7, and 9. In summary, WSNM presents strong
TABLE III
LOW RANK MATRIX RECOVERY RESULTS BY NNM-RPCA, WNNM-RPCA AND WSNM-RPCA. 300 × 300 LOW RANK DATA WITH RANK FROM 15 TO 150; 5% ENTRIES ARE CORRUPTED WITH SPARSE NOISE
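The synthetic setup behind Tables III and IV (a 300 × 300 matrix of given rank plus sparse corruption on a fixed fraction of entries) can be generated as follows. The chunk does not specify the distribution or magnitude of the outliers, so the uniform range here is our assumption:

```python
import numpy as np

def make_lowrank_plus_sparse(n=300, rank=15, corrupt_frac=0.05, seed=0):
    """Generate X = L + E: a rank-`rank` matrix L plus sparse corruption E
    on a corrupt_frac fraction of entries (outlier range assumed)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, rank))
    B = rng.standard_normal((n, rank))
    L = A @ B.T                                   # ground-truth low rank part
    E = np.zeros((n, n))
    idx = rng.choice(n * n, int(corrupt_frac * n * n), replace=False)
    E.flat[idx] = rng.uniform(-50, 50, idx.size)  # sparse outliers (assumed range)
    return L + E, L, E

X, L, E = make_lowrank_plus_sparse()
```

Recovery quality is then scored against the known L, e.g., via the relative error ‖X̂ − L‖_F² / ‖L‖_F² used in Fig. 10.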
TABLE IV
LOW RANK MATRIX RECOVERY RESULTS BY NNM-RPCA, WNNM-RPCA AND WSNM-RPCA. 300 × 300 LOW RANK DATA WITH RANK FROM 15 TO 150; 10% ENTRIES ARE CORRUPTED WITH SPARSE NOISE

Fig. 12. The background subtraction results by different methods on Curtain (left), Airport (middle) and Bootstrap (right) datasets. The first row shows the original frames and their ground truth segmentations.

where X̂ denotes the recovered matrix. The log-scale relative error maps of the matrices recovered by the three models are presented in Fig. 10. From Fig. 10 one can clearly observe that the success areas of WSNM-RPCA (both p = 0.7 and p = 0.4) are larger than those of NNM-RPCA and WNNM-RPCA, which means that WSNM-RPCA is able to recover the low rank matrix with sparse noise in more challenging cases.

Let us discuss the computational complexity of the proposed WSNM-RPCA model. For a matrix Y with a size of h × B, in each iteration of Algorithm 4, the SVD dominates the computational cost of the subproblem in step 5, and it needs O(min{hB^2, h^2 B} + J B) flops, where J denotes the number of iterations in the GST algorithm. The computational demand of the subproblem in step 4 is O(hB). Therefore, a total of T × O(min{hB^2, h^2 B} + J B + hB) flops is required for the
proposed method, where T denotes the number of iterations in ALM. In practice, the proposed method takes about 2 seconds for a matrix of size 300 × 300 to pursue a low rank approximation.

2) Background Subtraction: In this subsection, we test the proposed WSNM-RPCA and other competing methods, including NNM-RPCA [35], RegL1ALM [5], MoG-RPCA [40] and WNNM based RPCA, on all nine video sequences provided by Li et al. [36], with all frames involved. For our WSNM-RPCA, we set the parameter C in Eq. (18) to 2 max(m², n²) and the power p = 0.7. The reason for choosing p = 0.7 is based on the experimental results of WSNM-RPCA on synthetic data, as illustrated in Fig. 10. It is a tradeoff between enforcing low rank and separating outliers.

To measure the background modeling output quantitatively, we use S(A, B) = (A ∩ B)/(A ∪ B) to calculate the similarity between the estimated foreground regions and the ground truths. To generate the binary foreground map, the MRF model is used to label the absolute value of the estimated sparse error. The quantitative results by the different methods are listed in Table V. On all nine test sequences, the proposed WSNM-RPCA model achieves the best results. Moreover, the visual results on challenging frames of five sequences are shown in Fig. 11 and Fig. 12, which demonstrate that our approach can extract a clear background and separate the foreground region with high accuracy. In contrast, the results estimated by the other methods exhibit various degrees of ghost shadow in the background, leading to incomplete foreground segmentation.

V. CONCLUSIONS

In this paper, a weighted Schatten p-norm minimization (WSNM) model was proposed for low rank matrix approximation. WSNM has two major merits: on one hand, it is flexible enough to fit practical applications by providing different treatments for different rank components; on the other hand, the Schatten p-norm promotes the reconstructed low rank matrix to be closer to the latent low rank data matrix. We showed that, when the weights are in non-descending order, the solution of WSNM has a global optimum which can be efficiently obtained by the generalized iterated shrinkage algorithm. The proposed WSNM was then applied

APPENDIX

A. Proof of Lemma 1

Proof: Let the optimal solution of problem (4) have the compact SVD X = QΛR^T, and let the SVD of the matrix Y be Y = UΣV^T, where both Λ and Σ are diagonal matrices with the same (here non-ascending) order. According to Theorem 1, we have

  ‖X − Y‖_F^2 = tr(Λ^T Λ) + tr(Σ^T Σ) − 2 tr(X^T Y)
              ≥ tr(Λ^T Λ) + tr(Σ^T Σ) − 2 tr(Λ^T Σ)
              = ‖Λ − Σ‖_F^2.  (21)

This implies that

  ‖X − Y‖_F^2 + tr(WΛ^p) ≥ ‖Λ − Σ‖_F^2 + tr(WΛ^p).  (22)

Note that the equality holds if and only if Q = U and R = V according to (5). Therefore, minimizing (4) can be reduced to minimizing the problem in (6).

B. Proof of Lemma 2

Proof: As mentioned in the main paper, for each subproblem we only need to solve:

  f_i(δ) = ½(σ_i − δ)^2 + w_i δ^p,  0 < p ≤ 1, i ∈ {1, …, r}.  (23)

According to [33], for any σ_i ∈ (τ_p^GST(w_i), +∞), f_i(δ) has one unique minimum S_p^GST(σ_i; w_i), which can be obtained by solving the following equation:

  S_p^GST(σ_i; w_i) − σ_i + w_i p (S_p^GST(σ_i; w_i))^{p−1} = 0.  (24)

However, solving Eq. (24) directly is non-trivial, and an iterative algorithm was proposed in [33], which is shown in Algorithm 1. The following analysis is based on this algorithm.

When |σ| ≤ τ_p^GST(w_i) and |σ| ≤ τ_p^GST(w_j), since τ_p^GST(w) is a monotonically increasing function, we have τ_p^GST(w_i) ≤ τ_p^GST(w_j). Then, according to Algorithm 1, we get S_p^GST(σ; w_i) = S_p^GST(σ; w_j) = 0. Hence, inequality (10) holds.

When |σ| > τ_p^GST(w_i) and |σ| ≤ τ_p^GST(w_j), with Algorithm 1 we have S_p^GST(σ; w_j) = 0. Moreover, the objective function (23) indicates that S_p^GST(σ; w_i) ≥ 0 if σ is no less than zero, and hence inequality (10) still holds.

When |σ| > τ_p^GST(w_i) and |σ| > τ_p^GST(w_j), we use mathematical induction to prove that inequality (10) holds. Referring to Algorithm 1, let S_{p,k}^GST(σ; w) denote δ^(k) with respect to w. When k = 0, we have S_{p,k}^GST(σ; w_i) = S_{p,k}^GST(σ; w_j) = |σ|, meaning that S_{p,k}^GST(σ; w_i) ≥ S_{p,k}^GST(σ; w_j) holds. Suppose that inequality
[8] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in Proc. IEEE Int. Conf. Comput. Vis., Sep./Oct. 2009, pp. 2272–2279.
[9] R. Wang and D. Tao, "Non-local auto-encoder with collaborative stabilization for image restoration," IEEE Trans. Image Process., vol. 25, no. 5, pp. 2117–2129, May 2016.
[10] N. Srebro and T. Jaakkola, "Weighted low-rank approximations," in Proc. IEEE Int. Conf. Mach. Learn., 2003.
[11] A. Eriksson and A. van den Hengel, "Efficient computation of robust low-rank matrix approximations in the presence of missing data using the L1 norm," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2010, pp. 771–778.
[12] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" J. ACM, vol. 58, no. 1, pp. 1–37, 2009.
[13] A. M. Buchanan and A. W. Fitzgibbon, "Damped Newton algorithms for matrix factorization with missing data," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, pp. 316–322.
[14] Q. Ke and T. Kanade, "Robust L1 norm factorization in the presence of outliers and missing data by alternative convex programming," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2005, pp. 739–746.
[15] N. Wang, T. Yao, J. Wang, and D. Yeung, "A probabilistic approach to robust matrix factorization," in Proc. Eur. Conf. Comput. Vis., 2012, pp. 126–139.
[16] R. Cabral, F. De la Torre, J. Costeira, and A. Bernardino, "Unifying nuclear norm and bilinear factorization approaches for low-rank matrix decomposition," in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 2488–2495.
[17] T. Liu, M. Gong, and D. Tao, "Large-cone nonnegative matrix factorization," IEEE Trans. Neural Netw. Learn. Syst., to be published.
[18] T. Liu and D. Tao, "On the performance of Manhattan nonnegative matrix factorization," IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 9, pp. 1851–1863, Sep. 2016.
[19] M. Fazel, "Matrix rank minimization with applications," Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, USA, 2002.
[20] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Found. Comput. Math., vol. 9, no. 6, pp. 717–772, 2009.
[21] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM J. Optim., vol. 20, no. 4, pp. 1956–1982, 2010.
[22] S. Gu, L. Zhang, W. Zuo, and X. Feng, "Weighted nuclear norm minimization with application to image denoising," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2014, pp. 2862–2869.
[23] C. Lu, C. Zhu, C. Xu, S. Yan, and Z. Lin, "Generalized singular value thresholding," in Proc. AAAI Conf. Artif. Intell., 2015, pp. 1805–1811.
[24] D. Zhang, Y. Hu, J. Ye, X. Li, and X. He, "Matrix completion by truncated nuclear norm regularization," in Proc. IEEE Comput. Vis. Pattern Recognit., Jun. 2012, pp. 2192–2199.
[25] T. H. Oh, H. Kim, Y. W. Tai, J. C. Bazin, and I. S. Kweon, "Partial sum minimization of singular values in RPCA for low-level vision," in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 145–152.
[26] R. Chartrand, "Exact reconstruction of sparse signals via nonconvex minimization," IEEE Signal Process. Lett., vol. 14, no. 10, pp. 707–710, Oct. 2007.
[27] E. J. Candès, M. B. Wakin, and S. P. Boyd, "Enhancing sparsity by reweighted l1 minimization," J. Fourier Anal. Appl., vol. 14, nos. 5–6, pp. 877–905, 2008.
[28] F. Nie, H. Huang, and C. Ding, "Low-rank matrix recovery via efficient
[35] Z. Lin, M. Chen, and Y. Ma, "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices," Dept. Elect. Comput. Eng., UIUC, Champaign, IL, USA, Tech. Rep. UILU-ENG-09-2215, 2009.
[36] L. Li, W. Huang, I. Y.-H. Gu, and Q. Tian, "Statistical modeling of complex backgrounds for foreground object detection," IEEE Trans. Image Process., vol. 13, no. 11, pp. 1459–1472, Nov. 2004.
[37] P. Chatterjee and P. Milanfar, "Patch-based near-optimal image denoising," IEEE Trans. Image Process., vol. 21, no. 4, pp. 1635–1649, Apr. 2012.
[38] D. Zoran and Y. Weiss, "From learning models of natural image patches to whole image restoration," in Proc. IEEE Int. Conf. Comput. Vis., Nov. 2011, pp. 479–486.
[39] H. Talebi and P. Milanfar, "Global image denoising," IEEE Trans. Image Process., vol. 23, no. 2, pp. 755–768, Feb. 2014.
[40] Q. Zhao, D. Meng, Z. Xu, W. Zuo, and L. Zhang, "Robust principal component analysis with complex noise," in Proc. IEEE Int. Conf. Mach. Learn., 2014.
[41] The Berkeley Segmentation Dataset and Benchmark, accessed on Jun. 1, 2007. [Online]. Available: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
[42] N. Wang, D. Tao, X. Gao, X. Li, and J. Li, "A comprehensive survey to face hallucination," Int. J. Comput. Vis., vol. 106, no. 1, pp. 9–30, 2014.
[43] D. Tian and D. Tao, "Coupled learning for facial deblur," IEEE Trans. Image Process., vol. 25, no. 2, pp. 961–972, Feb. 2016.
[44] N. Wang, D. Tao, X. Gao, X. Li, and J. Li, "Transductive face sketch-photo synthesis," IEEE Trans. Neural Netw. Learn. Syst., vol. 24, no. 9, pp. 1364–1376, Sep. 2013.

Yuan Xie (M'12) received the master's degree from the School of Information Science and Technology, Xiamen University, China, in 2010, and the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences (CAS), in 2013. He is currently with the Visual Computing Laboratory, Department of Computing, The Hong Kong Polytechnic University, Hong Kong, and also with the Research Center of Precision Sensing and Control, Institute of Automation, CAS. His research interests include image processing, computer vision, machine learning, and pattern recognition.

Dr. Xie received the Hong Kong Scholar Award from the Society of Hong Kong Scholars and the China National Postdoctoral Council in 2014.

Shuhang Gu received the B.E. degree from the School of Astronautics, Beijing University of Aeronautics and Astronautics, China, in 2010, and the M.E. degree from the Institute of Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, China, in 2013. He is currently pursuing the Ph.D. degree with the Department of Computing, The Hong Kong Polytechnic University. His research interests include
schatten p-norm minimization,” in Proc. AAAI Conf. Artif. Intell., 2012, image restoration, sparse, and low rank models.
pp. 655–661.
[29] L. Liu, W. Huang, and D. Chen, “Exact minimum rank approximation
via Schatten p-norm minimization,” J. Comput. Appl. Math., vol. 267,
no. 1, pp. 218–227, 2014.
Yan Liu received the B.Eng. degree from the
[30] C. Lu, J. Tang, S. Yan, and Z. Lin, “Generalized nonconvex nonsmooth
low-rank minimization,” in Proc. IEEE Comput. Vis. Pattern Recognit., Department of Electronic Engineering, Southeast
University, the M.Sc. degree from the School
Jun. 2014, pp. 4130–4137.
of Business, Nanjing University, China, and the
[31] K. Mohan and M. Fazel, “Iterative reweighted algorithms for matrix
rank minimization,” J. Mach. Learn. Res., vol. 13, no. 1, pp. 3441–3473, Ph.D. degree from the Department of Computer
Science, Columbia University, USA. She is an Asso-
2012.
ciate Professor with the Department of Comput-
[32] L. Mirsky, “A trace inequality of John von Neumann,” Monatshefte
ing, The Hong Kong Polytechnic University. As a
Math., vol. 79, no. 4, pp. 303–306, 1975.
[33] W. Zuo, D. Meng, L. Zhang, X. Feng, and D. Zhang, “A generalized Director of the Cognitive Computing Laboratory,
she focuses her research in brain modeling, ranging
iterated shrinkage algorithm for non-convex sparse coding,” in Proc.
from image/video content analysis, music therapy,
IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 217–224.
[34] M. Nikolova, “Analysis of the recovery of edges in images and signals manifold learning, and deep learning. She is a Principle Investigator for
by minimizing nonconvex regularized least-squares,” SIAM Multiscale more than ten projects and published about 20 papers in various international
conferences and journals.
Model Simul., vol. 4, no. 3, pp. 960–991, 2005.
Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR. Downloaded on April 18,2020 at 05:03:57 UTC from IEEE Xplore. Restrictions apply.
XIE et al.: WEIGHTED SCHATTEN p-NORM MINIMIZATION FOR IMAGE DENOISING AND BACKGROUND SUBTRACTION 4857
Wangmeng Zuo (M'09–SM'15) received the Ph.D. degree in computer application technology from the Harbin Institute of Technology, Harbin, China, in 2007. In 2004, from 2005 to 2006, and from 2007 to 2008, he was a Research Assistant with the Department of Computing, The Hong Kong Polytechnic University, Hong Kong. From 2009 to 2010, he was a Visiting Professor with Microsoft Research Asia. He is currently a Professor with the School of Computer Science and Technology, Harbin Institute of Technology. His current research interests include image modeling and blind restoration, discriminative learning, biometrics, and computer vision. He is an Associate Editor of the IET Biometrics.

Wensheng Zhang received the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences (CAS), in 2000. He is a Professor of Machine Learning and Data Mining and the Director of the Research and Development Department with the Institute of Automation, CAS. His research interests include computer vision, pattern recognition, artificial intelligence, and human-computer interaction.

Lei Zhang (M'04–SM'14) received the B.Sc. degree from the Shenyang Institute of Aeronautical Engineering, Shenyang, China, in 1995, and the M.Sc. and Ph.D. degrees in control theory and engineering from Northwestern Polytechnical University, Xi'an, China, in 1998 and 2001, respectively. From 2001 to 2002, he was a Research Associate with the Department of Computing, The Hong Kong Polytechnic University. From 2003 to 2006, he was a Post-Doctoral Fellow with the Department of Electrical and Computer Engineering, McMaster University, Canada. In 2006, he joined the Department of Computing, The Hong Kong Polytechnic University, as an Assistant Professor, where he has been a Full Professor since 2015. His research interests include computer vision, pattern recognition, image and video processing, and biometrics. He has published more than 200 papers in those areas. As of 2016, his publications have been cited more than 19 000 times in the literature. He is an Associate Editor of the IEEE TRANSACTIONS ON IMAGE PROCESSING, the Journal of Electronic Imaging, and Image and Vision Computing. He was selected as a "Highly Cited Researcher" by Thomson Reuters, 2015.