AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks

Jin Li1, Ziqiang He1, Anwei Luo1, Jian-Fang Hu1, Z. Jane Wang2, Xiangui Kang1
1Guangdong Key Lab of Information Security,
School of Computer Science and Engineering, Sun Yat-Sen University
2Electrical and Computer Engineering Dept, University of British Columbia
Corresponding author.
Abstract

Imperceptible adversarial attacks aim to fool DNNs by adding imperceptible perturbation to the input data. Previous methods typically improve the imperceptibility of attacks by integrating common attack paradigms with specifically designed perception-based losses or the capabilities of generative models. In this paper, we propose Adversarial Attacks in Diffusion (AdvAD), a novel modeling framework distinct from existing attack paradigms. AdvAD innovatively conceptualizes attacking as a non-parametric diffusion process by theoretically exploring the basic modeling approach rather than using the denoising or generation abilities of regular diffusion models requiring neural networks. At each step, much subtler yet effective adversarial guidance is crafted using only the attacked model without any additional network, which gradually leads the end of the diffusion process from the original image to a desired imperceptible adversarial example. Grounded in a solid theoretical foundation of the proposed non-parametric diffusion process, AdvAD achieves high attack efficacy and imperceptibility with intrinsically lower overall perturbation strength. Additionally, an enhanced version, AdvAD-X, is proposed to evaluate the extreme of our novel framework under an ideal scenario. Extensive experiments demonstrate the effectiveness of the proposed AdvAD and AdvAD-X. Compared with state-of-the-art imperceptible attacks, AdvAD achieves an average of 99.9% (+17.3%) ASR with 1.34 (-0.97) $l_2$ distance, 49.74 (+4.76) PSNR and 0.9971 (+0.0043) SSIM against four prevalent DNNs with three different architectures on the ImageNet-compatible dataset. Code is available at https://github.com/XianguiKang/AdvAD.

1 Introduction

Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial attacks [1, 2] (i.e., adding maliciously crafted perturbations to the input data), posing serious security concerns to real-world applications [3]. Research on adversarial attacks also plays an important role in proactively exposing potential threats, as well as in promoting model robustness and corresponding defense methods [4, 5, 6, 7, 8, 9, 10, 11]. Many attacks [12, 13, 14, 15] focus on maximizing the attack success rate and transferability under relatively lenient restrictions (i.e., $l_\infty$ or $l_2$ norm) on the adversarial perturbation, but they can have poor stealthiness and imperceptibility since the crafted adversarial examples can be easily detected by the Human Visual System (HVS) [16]. Therefore, imperceptible adversarial attacks [17, 18, 19, 20, 21, 22, 23], which aim to maintain attack efficacy while improving imperceptibility, have attracted considerable attention.

Current imperceptible adversarial attacks can be grouped into two categories: 1) perturbation-based attacks devised on perceptual characteristics, and 2) unrestricted attacks. The first category is motivated by the fact that adding adversarial perturbations to different components of an image affects the quality perceived by the HVS to varying degrees. By studying components such as image color [19], texture complexity [21], frequency spectrum [23, 24], etc., these methods design corresponding perception-based loss functions and incorporate them into the optimization process to craft adversarial examples in which the adversarial perturbation is constrained and hidden within specific image regions. Instead of injecting noise-like adversarial perturbations, unrestricted attacks heavily but reasonably modify attributes of images, such as semantic content, to perform attacks. Apart from early work that adopts GANs [25], some recent methods combine the prevalent diffusion models [26, 27, 28] with the adversarial optimization process, either in an image-editing-like manner of repeatedly adding noise and denoising to eliminate the noise pattern within the final adversarial examples [29, 30], or by optimizing the embedding of latent diffusion models [31, 32]. However, due to the uncertainty of generative models and the unrestricted setting itself, some unrestricted adversarial examples inevitably exhibit obviously unnatural texture or semantic changes and lose imperceptibility, especially for images with complex content. Although previous methods have equipped attacks with imperceptibility using the various designs mentioned above, an essential challenge in achieving imperceptible adversarial attacks remains: How to attack with inherently minimal perturbation strength from a modeling perspective?

To address this fundamental challenge, we propose Adversarial Attacks in Diffusion (AdvAD), a brand new modeling framework distinct from the common attack paradigms of gradient ascent [2] or optimization with adversarial losses [17]. The proposed AdvAD explores a novel non-parametric diffusion process for attacks, which fully inherits two key merits of diffusion models: i) the modeling philosophy of converting a difficult task into a series of simple sub-tasks, and ii) a solid theoretical foundation. Specifically, AdvAD achieves high attack efficacy with intrinsically lower perturbation strength by innovatively modeling the attack process as a decomposed diffusion trajectory from an initialized noise to an adversarial example. At each step, a much subtler (for imperceptibility) yet more effective (for attack performance) adversarial guidance is calculated and injected with two cooperating, theoretically grounded non-parametric modules called Attacked Model Guidance (AMG) and Pixel-level Constraint (PC), which gradually leads the end of this trajectory from the original image distribution to a desired adversarially conditioned distribution based on the theory of diffusion models (e.g., deterministic diffusion [27], conditional sampling [33, 34], etc.).

Here, we would like to clarify that the proposed diffusion process for attacks is considered non-parametric since it does not require the additional networks that regular diffusion models need for noise estimation. AdvAD first initializes a fixed diffusion noise, which is then manipulated at each step via the adversarial guidance crafted by the proposed AMG and PC modules using only the attacked model with theoretically derived equations. In this way, the proposed AdvAD draws on the modeling approach of diffusion models rather than their denoising or generative capabilities, which avoids negative impacts such as semantic content changes caused by the uncertainty of generative models and also promises relatively low computational complexity. Based on AdvAD, we further propose an enhanced version, AdvAD-X ('X' for 'eXtreme'), with two extra strategies to squeeze out the extreme performance of the proposed new modeling framework in an ideal scenario. With its unique properties, AdvAD-X also possesses theoretical significance and provides new insights for revealing the robustness of DNNs. In summary, our main contributions are:

  • Addressing the essential challenge of imperceptible adversarial attacks from a novel modeling perspective for the first time, we theoretically explore and derive the basic modeling of diffusion models to perform attacks with inherently lower perturbation strength through a non-parametric diffusion process that requires no additional networks.

  • We propose two attack versions, AdvAD and AdvAD-X. For the basic AdvAD, the AMG and PC modules cooperate to craft much subtler yet effective adversarial guidance which is progressively injected via initialized diffusion noise at each step, and AdvAD-X further reduces the perturbation strength to an extreme level in an ideal scenario with theoretical significance.

  • Extensive experiments are conducted to evaluate the effectiveness of our methods in terms of attack success rate, imperceptibility, and robustness. Experimental results demonstrate the superiority of the novel modeling approach for imperceptible adversarial attacks.

2 Preliminaries

Adversarial Attacks. Given an original image $\boldsymbol{x}_{ori}$ with ground-truth label $y_{gt}$ and a classifier $f(\cdot)$ satisfying $f(\boldsymbol{x}_{ori}) = y_{gt}$, normal untargeted attacks aim to craft the adversarial example $\boldsymbol{x}_{adv}$ that misleads the classifier, formulated as:

$$f(\boldsymbol{x}_{adv}) \neq y_{gt}, \quad s.t.\ \|\boldsymbol{x}_{adv} - \boldsymbol{x}_{ori}\|_{p} \leq \zeta, \tag{1}$$

where $\|\cdot\|_{p}$ represents the $l_p$-norm, usually implemented as the $l_\infty$-norm, to limit the distance between $\boldsymbol{x}_{adv}$ and $\boldsymbol{x}_{ori}$ within an upper bound of budget $\zeta$. In this paper, we focus on the more general setting of untargeted attacks. More information on related works is provided in Appendix A.
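As a minimal sketch of the condition in Eq. (1), the check below tests whether a candidate image both fools a classifier and stays within the $l_p$ budget. The names `lp_distance`, `is_valid_untargeted_attack`, and `toy_classify` are hypothetical and introduced only for illustration; the toy classifier stands in for the attacked model $f(\cdot)$ and is not a model from this paper.

```python
import math

def lp_distance(x_adv, x_ori, p):
    """l_p distance between two flattened images; p may be math.inf."""
    diffs = [abs(a - b) for a, b in zip(x_adv, x_ori)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

def is_valid_untargeted_attack(classify, x_adv, x_ori, y_gt, budget, p=math.inf):
    """True when f(x_adv) != y_gt and ||x_adv - x_ori||_p <= budget, as in Eq. (1)."""
    fooled = classify(x_adv) != y_gt
    within_budget = lp_distance(x_adv, x_ori, p) <= budget
    return fooled and within_budget

# Hypothetical toy classifier: class 1 if the mean pixel exceeds 0.5, else class 0.
def toy_classify(x):
    return 1 if sum(x) / len(x) > 0.5 else 0

x_ori = [0.52, 0.50, 0.51, 0.49]  # mean 0.505, classified as 1
x_adv = [0.50, 0.48, 0.49, 0.47]  # every pixel lowered by 0.02, classified as 0

ok = is_valid_untargeted_attack(toy_classify, x_adv, x_ori, y_gt=1, budget=0.03)
```

With the $l_\infty$ budget set to 0.03 the candidate satisfies Eq. (1); tightening the budget to 0.01 makes the same candidate invalid even though it still changes the prediction.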

Deterministic Diffusion Process. In the deterministic situation of DDIM [27] with $\sigma_t = 0$, for an image $\boldsymbol{x}_0$ and pre-defined diffusion coefficients $\alpha_{0:T} \in (0,1]^{T}$ for step $t \in [0:T]$, $\boldsymbol{x}_t$ in the Forward process of adding noise to $\boldsymbol{x}_0$ is given by $\boldsymbol{x}_t = \sqrt{\alpha_t}\,\boldsymbol{x}_0 + \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}$, where $\boldsymbol{\epsilon} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I})$ represents Gaussian noise. For the Backward denoising steps, unlike DDPM [26], which is based on Markov chains in which each state directly depends on the previous one, DDIM employs a non-Markovian approach. In the backward process, each step first involves calculating a "prediction" of the final step $\boldsymbol{x}_t^0$ from the current $\boldsymbol{x}_t$, then adding noise to it again to obtain $\boldsymbol{x}_{t-1}$, expressed as:

$$\boldsymbol{x}_{t-1} = \sqrt{\alpha_{t-1}}\left(\frac{\boldsymbol{x}_t - \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_\theta(\boldsymbol{x}_t)}{\sqrt{\alpha_t}}\right) + \sqrt{1-\alpha_{t-1}}\,\boldsymbol{\epsilon}_\theta(\boldsymbol{x}_t), \tag{2}$$

where $\boldsymbol{\epsilon}_\theta(\boldsymbol{x}_t)$ is an estimated diffusion noise using a pretrained neural network $\theta$ for the current step, and the term in the first parenthesis represents the predicted $\boldsymbol{x}_t^0$, derived by a simple variation of Eq. (2).
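The two-stage structure of Eq. (2), predict $\boldsymbol{x}_t^0$ and then re-noise to step $t-1$, can be sketched on flat lists of floats as follows. This is only an illustration under stated assumptions: the function names `predict_x0` and `ddim_step` are hypothetical, and the noise estimate $\boldsymbol{\epsilon}_\theta(\boldsymbol{x}_t)$ is passed in as a plain vector rather than produced by a trained network.

```python
import math

def predict_x0(x_t, eps, alpha_t):
    """Predicted clean image x_t^0: the term in the first parenthesis of Eq. (2)."""
    return [(x - math.sqrt(1.0 - alpha_t) * e) / math.sqrt(alpha_t)
            for x, e in zip(x_t, eps)]

def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """One deterministic DDIM backward step (Eq. (2), sigma_t = 0)."""
    x0_pred = predict_x0(x_t, eps, alpha_t)
    return [math.sqrt(alpha_prev) * x0 + math.sqrt(1.0 - alpha_prev) * e
            for x0, e in zip(x0_pred, eps)]

# Illustrative values (hypothetical): a 3-pixel "image" and a known noise vector.
x0 = [0.2, -0.4, 0.9]
eps = [0.5, -1.0, 0.3]
alpha_t, alpha_prev = 0.5, 0.8

# Forward relation: x_t = sqrt(alpha_t) * x0 + sqrt(1 - alpha_t) * eps.
x_t = [math.sqrt(alpha_t) * a + math.sqrt(1.0 - alpha_t) * e
       for a, e in zip(x0, eps)]
x_prev = ddim_step(x_t, eps, alpha_t, alpha_prev)
```

When the supplied noise equals the noise actually used in the forward relation, `predict_x0` recovers `x0` up to floating-point error and `x_prev` equals the forward relation evaluated at `alpha_prev`.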

Conditional sampling. Song et al. [34] propose the conditional sampling technique for score-based generative models with score function $\nabla_{\boldsymbol{x}_t}\log p(\boldsymbol{x}_t)$ [35], a kind of generative model closely related to diffusion models. Without loss of generality, for a condition $y$ (e.g., class label, mask, etc.) and the corresponding conditional distribution $p(\boldsymbol{x}|y)$, a score-based model can sample from $p(\boldsymbol{x}|y)$ by modifying the score function at each step $t$ to $\nabla_{\boldsymbol{x}_t}\log\big(p(\boldsymbol{x}_t)\,p(y|\boldsymbol{x}_t)\big)$ if $p(y|\boldsymbol{x}_t)$ is known. Subsequently, with the connection between the score function and the noise $\boldsymbol{\epsilon}_t$ of diffusion models as $\nabla_{\boldsymbol{x}_t}\log p(\boldsymbol{x}_t) = -\frac{1}{\sqrt{1-\alpha_t}}\boldsymbol{\epsilon}_t$ [34], this joint distribution can be extended to the deterministic process of DDIM, achieved by updating the noise $\boldsymbol{\epsilon}_t$ to $\boldsymbol{\epsilon}_t'$ at each step as [33]:

$$\boldsymbol{\epsilon}_t' = \boldsymbol{\epsilon}_t - \sqrt{1-\alpha_t}\,\nabla_{\boldsymbol{x}_t}\log p(y|\boldsymbol{x}_t). \tag{3}$$
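The step from the modified score function to Eq. (3) can be spelled out using only the two relations stated above: the log of a product splits into a sum, and the unconditional score equals $-\boldsymbol{\epsilon}_t/\sqrt{1-\alpha_t}$. The following is a short worked derivation, not an additional result.

```latex
\begin{aligned}
\nabla_{\boldsymbol{x}_t}\log\big(p(\boldsymbol{x}_t)\,p(y|\boldsymbol{x}_t)\big)
  &= \nabla_{\boldsymbol{x}_t}\log p(\boldsymbol{x}_t)
   + \nabla_{\boldsymbol{x}_t}\log p(y|\boldsymbol{x}_t) \\
  &= -\frac{1}{\sqrt{1-\alpha_t}}\,\boldsymbol{\epsilon}_t
   + \nabla_{\boldsymbol{x}_t}\log p(y|\boldsymbol{x}_t) \\
  &= -\frac{1}{\sqrt{1-\alpha_t}}
     \Big(\boldsymbol{\epsilon}_t
     - \sqrt{1-\alpha_t}\,\nabla_{\boldsymbol{x}_t}\log p(y|\boldsymbol{x}_t)\Big)
   = -\frac{1}{\sqrt{1-\alpha_t}}\,\boldsymbol{\epsilon}_t' .
\end{aligned}
```

Reading the last line through the same score-noise relation identifies the bracketed term as the updated noise $\boldsymbol{\epsilon}_t'$ of Eq. (3).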

3 Proposed Adversarial Attacks in Diffusion

3.1 Overview

From a novel modeling perspective, we propose Adversarial Attacks in Diffusion (AdvAD) to attack with inherently smaller perturbation strength through a non-parametric diffusion process for the first time. As shown in Figure 1, different from previous attack paradigms that employ gradient ascent or optimization with varying kinds of adversarial losses, AdvAD innovatively performs the attack within a decomposed non-parametric diffusion trajectory starting from an initialized noise, in which very subtle yet effective adversarial guidance is crafted and injected to gradually push the end of this trajectory from the original image to a desired adversarially conditioned distribution.

Intuitively, given the original image $\boldsymbol{x}_{ori}$ with an initialized Gaussian noise $\boldsymbol{\epsilon}_0 \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I})$, a fixed diffusion trajectory from $\boldsymbol{\bar{x}}_T$ to $\boldsymbol{\bar{x}}_0$ ($\boldsymbol{\bar{x}}_0 = \boldsymbol{x}_{ori}$) can be easily obtained using DDIM Backward for the deterministic diffusion process as:

$$\boldsymbol{\bar{x}}_T = \sqrt{\alpha_T}\,\boldsymbol{x}_{ori} + \sqrt{1-\alpha_T}\,\boldsymbol{\epsilon}_0, \tag{4}$$
$$\boldsymbol{\bar{x}}_{t-1} = \sqrt{\alpha_{t-1}}\left(\frac{\boldsymbol{\bar{x}}_t - \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_0}{\sqrt{\alpha_t}}\right) + \sqrt{1-\alpha_{t-1}}\,\boldsymbol{\epsilon}_0. \tag{5}$$
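A numerical sketch of the fixed trajectory in Eqs. (4) and (5) is given below. It assumes a hypothetical linear schedule for the coefficients with $\alpha_0 = 1$ (the usual DDIM convention, assumed here rather than stated in the text), and the function name `fixed_trajectory` is introduced only for this illustration.

```python
import math
import random

def fixed_trajectory(x_ori, eps0, alphas):
    """Return [x_bar_T, ..., x_bar_0] from Eqs. (4)-(5) with a fixed noise eps0.

    alphas[t] is the diffusion coefficient alpha_t for t = 0..T.
    """
    T = len(alphas) - 1
    # Eq. (4): noise the original image once to obtain the start point x_bar_T.
    x = [math.sqrt(alphas[T]) * xo + math.sqrt(1.0 - alphas[T]) * e
         for xo, e in zip(x_ori, eps0)]
    states = [x]
    for t in range(T, 0, -1):
        # Eq. (5): predict x_bar_t^0 with the fixed noise, then re-noise to t-1.
        a_t, a_prev = alphas[t], alphas[t - 1]
        x = [math.sqrt(a_prev) * (xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
             + math.sqrt(1.0 - a_prev) * e
             for xi, e in zip(x, eps0)]
        states.append(x)
    return states

random.seed(0)
x_ori = [0.1, 0.7, -0.3, 0.5]
eps0 = [random.gauss(0.0, 1.0) for _ in x_ori]

# Hypothetical linear schedule from alpha_0 = 1 (assumed) down to alpha_T = 0.1.
T = 10
alphas = [1.0 - 0.9 * t / T for t in range(T + 1)]
states = fixed_trajectory(x_ori, eps0, alphas)
x_bar_0 = states[-1]
```

Because the same noise `eps0` is reused at every step and no network is involved, the last state `x_bar_0` matches `x_ori` up to floating-point error under these assumptions.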

With this deterministic diffusion trajectory of the original image, performing adversarial attacks within it requires solving two main problems: i) directing the final result of this diffusion process to a desired adversarial example rather than the original image; ii) ensuring that the modified trajectory (denoted as $\boldsymbol{\hat{x}}_t$, $\boldsymbol{\hat{\epsilon}}_t$ for step $t$) stays close to the original trajectory ($\boldsymbol{\bar{x}}_t$, $\boldsymbol{\epsilon}_0$ for step $t$) of the clean image to achieve the imperceptibility of attacks. To fulfill these dual purposes, we propose two theoretically grounded modules, called Attacked Model Guidance (AMG) and Pixel-level Constraint (PC), that work together. At each step, AMG utilizes only the attacked model $f(\cdot)$ to produce the adversarial guidance without requiring any additional networks, and collaborates with PC to constrain and streamline the diffusion process injected with the guidance.

Figure 1: Overview of the proposed Adversarial Attacks in Diffusion (AdvAD) that models the attack as a non-parametric diffusion process. At each step, the Attacked Model Guidance (AMG) module adopts the non-Markovian process for approximating $\boldsymbol{x}_{adv}$ using $\boldsymbol{\hat{x}}_t^0$ to craft adversarial guidance and injects it into the initialized diffusion noise, then the Pixel-level Constraint (PC) module imposes a restriction to produce the noise for the next step and serves to control the whole process precisely.

3.2 Attacked Model Guidance Module

By viewing the attack process as a distribution-to-distribution transformation through a non-parametric diffusion process, the proposed AMG module theoretically integrates the conditional sampling technique of diffusion models to craft the adversarial guidance using only the attacked model $f(\cdot)$. For untargeted attacks, the ultimate goal is modifying $\boldsymbol{x}_{ori}$ with $f(\boldsymbol{x}_{ori}) = y_{gt}$ to $\boldsymbol{x}_{adv}$ so that $f(\boldsymbol{x}_{adv}) \neq y_{gt}$, which can be regarded as directing the determined distribution $p(\boldsymbol{x}_{ori})$ of the original diffusion trajectory to a distribution of $\boldsymbol{x}_{adv}$ with the attacked model, written as $p(\boldsymbol{x}_{adv} \,|\, f(\boldsymbol{x}_{adv}) \neq y_{gt})$. Thus, we regard $f(\boldsymbol{x}_{adv}) \neq y_{gt}$ as an adversarial condition, and apply the conditional sampling technique to the original trajectory by manipulating the diffusion noise to achieve this, expressed as:

$$\begin{aligned}
\boldsymbol{\hat{\epsilon}}_t' &= \boldsymbol{\epsilon}_0 - \sqrt{1-\alpha_t}\,\nabla_{\boldsymbol{\hat{x}}_t}\log p\big(f(\boldsymbol{x}_{adv}) \neq y_{gt} \,|\, \boldsymbol{\hat{x}}_t\big) \\
&= \boldsymbol{\epsilon}_0 - \sqrt{1-\alpha_t}\,\nabla_{\boldsymbol{\hat{x}}_t}\log\big(1 - p(f(\boldsymbol{x}_{adv}) = y_{gt} \,|\, \boldsymbol{\hat{x}}_t)\big).
\end{aligned} \tag{6}$$

However, Eq. (6) is unsolvable since $\boldsymbol{x}_{adv}$ is unknown during the diffusion process. To address this, inspired by the property of the deterministic non-Markovian DDIM that a final diffusion result is first predicted at each step, we calculate $\boldsymbol{\hat{x}}_t^0$ via the equation of the DDIM non-Markovian process with $\boldsymbol{\hat{\epsilon}}_{t+1}$ from the previous step, and use it to approximate $\boldsymbol{x}_{adv}$, expressed as:

$$\boldsymbol{x}_{adv} \approx \boldsymbol{\hat{x}}_t^0 = \frac{\boldsymbol{\hat{x}}_t - \sqrt{1-\alpha_t}\;\boldsymbol{\hat{\epsilon}}_{t+1}}{\sqrt{\alpha_t}}. \tag{7}$$

The accurate error upper bound and convergence of this approximation are given in Proposition 2 in conjunction with the proposed PC module, and the validity of this approximation can also be explained intuitively from the premise of our method. That is, we have $\boldsymbol{\bar{x}}_t^0 = \boldsymbol{x}_{ori}$ for all steps $t$ in the original diffusion trajectory, and the modified trajectory should be very close to the original one, so that the relationship between $\boldsymbol{\hat{x}}_t^0$ and $\boldsymbol{x}_{adv}$ should satisfy $\boldsymbol{\hat{x}}_t^0 \approx \boldsymbol{x}_{adv}$.
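The statement that the predicted final image of the original trajectory equals $\boldsymbol{x}_{ori}$ at every step follows by induction from Eqs. (4) and (5). The short derivation below makes this explicit; it uses only those two equations and the fixed noise $\boldsymbol{\epsilon}_0$.

```latex
\begin{aligned}
&\text{Hypothesis: } \boldsymbol{\bar{x}}_t
   = \sqrt{\alpha_t}\,\boldsymbol{x}_{ori} + \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_0
   \quad (\text{true for } t = T \text{ by Eq. (4)}). \\
&\text{Prediction: } \boldsymbol{\bar{x}}_t^0
   = \frac{\boldsymbol{\bar{x}}_t - \sqrt{1-\alpha_t}\,\boldsymbol{\epsilon}_0}{\sqrt{\alpha_t}}
   = \boldsymbol{x}_{ori}. \\
&\text{Step (Eq. (5)): } \boldsymbol{\bar{x}}_{t-1}
   = \sqrt{\alpha_{t-1}}\,\boldsymbol{\bar{x}}_t^0 + \sqrt{1-\alpha_{t-1}}\,\boldsymbol{\epsilon}_0
   = \sqrt{\alpha_{t-1}}\,\boldsymbol{x}_{ori} + \sqrt{1-\alpha_{t-1}}\,\boldsymbol{\epsilon}_0 .
\end{aligned}
```

The hypothesis therefore carries over from step $t$ to step $t-1$, so the prediction equals $\boldsymbol{x}_{ori}$ throughout the original trajectory. Under the usual DDIM convention $\alpha_0 = 1$ (an assumption here), the same expression also gives $\boldsymbol{\bar{x}}_0 = \boldsymbol{x}_{ori}$.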

With Eq. (7), the term $p(f(\boldsymbol{x}_{adv}) = y_{gt} \mid \boldsymbol{\hat{x}}_{t})$ in Eq. (6) can be written as $p(f(\boldsymbol{\hat{x}}_{t}^{0}) = y_{gt} \mid \boldsymbol{\hat{x}}_{t}) = p(f(\boldsymbol{\hat{x}}_{t}^{0}) = y_{gt})$, since $\boldsymbol{\hat{x}}_{t}^{0}$ is calculated from $\boldsymbol{\hat{x}}_{t}$. This is exactly the output of $f(\boldsymbol{\hat{x}}_{t}^{0})$ after the $Softmax(\cdot)$ function for the class $y_{gt}$. Denoting this classification probability of the attacked model as $p_{f}(y_{gt} \mid \boldsymbol{\hat{x}}_{t}^{0})$, we obtain the solvable equation of the AMG module, which injects adversarial guidance into the initialized diffusion noise using only $f(\cdot)$ without any additional network:

$$\boldsymbol{\hat{\epsilon}}^{\prime}_{t} = \text{AMG}(\boldsymbol{\epsilon}_{0}, \boldsymbol{\hat{x}}_{t}^{0}, f(\cdot), y_{gt}) = \boldsymbol{\epsilon}_{0} - \sqrt{1-\alpha_{t}}\, \nabla_{\boldsymbol{\hat{x}}_{t}} \log\bigl(1 - p_{f}(y_{gt} \mid \boldsymbol{\hat{x}}_{t}^{0})\bigr). \tag{8}$$

At this point, in addition to the benefits from modeling, the calculation process of AMG also plays a role in endowing AdvAD with imperceptibility. As the attack progresses, the probability $p_{f}$, the term $\log(1-p_{f})$, and the coefficient $\sqrt{1-\alpha_{t}}$ all gradually approach 0, which means that the strength of the injected adversarial guidance gradually converges to 0 in AdvAD, whereas common classification losses (e.g., Cross-Entropy, Log Loss, etc.) used in other attack paradigms may instead increase. Further analysis and experiments on this property are provided in Proposition 1 and Sec. 4.5.
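The vanishing-guidance argument can be made concrete with a short worked derivation. This is a sketch rather than a result taken from the paper: it assumes $p_{f}$ is a Softmax over logits $\boldsymbol{z}$, writes $\boldsymbol{p}$ for the vector of class probabilities and $\boldsymbol{e}_{y_{gt}}$ for the one-hot vector of the true class (both symbols are introduced here for illustration), and compares gradients in logit space only.

```latex
\nabla_{\boldsymbol{z}} \log\bigl(1 - p_{y_{gt}}\bigr)
  = -\frac{\nabla_{\boldsymbol{z}}\, p_{y_{gt}}}{1 - p_{y_{gt}}}
  = -\frac{p_{y_{gt}}}{1 - p_{y_{gt}}}\,\bigl(\boldsymbol{e}_{y_{gt}} - \boldsymbol{p}\bigr),
\qquad
\nabla_{\boldsymbol{z}} \bigl(-\log p_{y_{gt}}\bigr)
  = \boldsymbol{p} - \boldsymbol{e}_{y_{gt}} .
```

As $p_{y_{gt}} \to 0$, the prefactor $p_{y_{gt}} / (1 - p_{y_{gt}})$ drives the first gradient to zero, whereas the $y_{gt}$-th component of the cross-entropy gradient tends to $-1$, so that gradient does not vanish. The input-space gradients follow by multiplying with the Jacobian of the logits with respect to $\boldsymbol{\hat{x}}_{t}$.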

Algorithm 1 AdvAD

Input: Attacked model $f(\cdot)$, image $\boldsymbol{x}_{ori}$ with label $y_{gt}$, budget $\xi$, step $T$;
Output: Adversarial example $\boldsymbol{x}_{adv}$

1:  Initialize pre-defined diffusion coefficients $\alpha_{0:T} \in (0,1]^{T+1}$;
2:  Initialize $\boldsymbol{\epsilon}_{0} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I})$; ▷ Initialize and fix diffusion noise $\boldsymbol{\epsilon}_{0}$.
3:  Transform the range of $\boldsymbol{x}_{ori}$ to [-1, 1]; ▷ Align with data range of diffusion process.
4:  Calculate $\boldsymbol{\bar{x}}_{T}$ via Eq. (4); ▷ Forward process of adding noise $\boldsymbol{\epsilon}_{0}$ to $\boldsymbol{x}_{ori}$.
5:  Set $\boldsymbol{\hat{x}}_{T} := \boldsymbol{\bar{x}}_{T}$, $\boldsymbol{\hat{\epsilon}}_{T+1} := \boldsymbol{\epsilon}_{0}$; ▷ Non-parametric diffusion process.
6:  for $t = T$ to $1$ do
7:     Calculate $\boldsymbol{\hat{x}}_{t}^{0}$ via Eq. (7); ▷ Approximation of $\boldsymbol{\hat{x}}_{t}^{0} \approx \boldsymbol{x}_{adv}$.
8:     Transform the range of $\boldsymbol{\hat{x}}_{t}^{0}$ to [0, 255]; ▷ Align with data range of image.
9:     Calculate $\boldsymbol{\hat{\epsilon}}^{\prime}_{t}$ with AMG via Eq. (8); ▷ Inject adversarial guidance.
10:     Calculate $\boldsymbol{\hat{\epsilon}}_{t}$ with PC via Eq. (10); ▷ Constrain modified diffusion noise.
11:     Calculate $\boldsymbol{\hat{x}}_{t-1}$ via Eq. (11); ▷ One step backward from $t$ to $t-1$.
12:  Transform the range of $\boldsymbol{\hat{x}}_{0}$ to [0, 255]; ▷ Endpoint of the process.
13:  return $\boldsymbol{x}_{adv} = \text{int8}(\text{round}(\boldsymbol{\hat{x}}_{0}))$; ▷ Return actual 8-bit image $\boldsymbol{x}_{adv}$.

3.3 Pixel-level Constraint Module

Collaborating with AMG, the PC module is introduced to impose precise control and streamline the modified diffusion trajectory for attacks. A straightforward choice is to design PC for $\boldsymbol{\hat{x}}_{t}$ that constrains each $\boldsymbol{\hat{x}}_{t}$ using $\boldsymbol{\bar{x}}_{t}$, thus ensuring $\boldsymbol{\hat{x}}_{t}^{0}$ close to $\boldsymbol{\bar{x}}_{t}^{0}$ and the final $\boldsymbol{x}_{adv}$ close to $\boldsymbol{x}_{ori}$. However, such a "hard" constraint directly applied to $\boldsymbol{\hat{x}}_{t}$ will impair the effectiveness of AMG and disrupt coherence of the transforming trajectory. Therefore, we formulate a more suitable PC for $\boldsymbol{\hat{\epsilon}}_{t}$ as in Theorem 1.

Theorem 1

Given diffusion coefficients $\alpha_{T:0} \in (0,1]^{T}$, the $\boldsymbol{x}_{ori}$, $\boldsymbol{\bar{x}}_{t}$, $\boldsymbol{\epsilon}_{0}$ from the original trajectory, $\boldsymbol{\hat{x}}_{t}$, $\boldsymbol{\hat{\epsilon}}_{t}$ from the modified trajectory, and a variable $\xi$, if $\boldsymbol{\hat{\epsilon}}_{t}$ and $\boldsymbol{\epsilon}_{0}$ satisfy

$$\|\boldsymbol{\hat{\epsilon}}_{t} - \boldsymbol{\epsilon}_{0}\|_{\infty} \leq \frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi, \tag{9}$$

for all $t \in [T:1]$, then it follows that $\|\boldsymbol{\hat{x}}_{t} - \boldsymbol{\bar{x}}_{t}\|_{\infty} \leq \bigl(\sqrt{\alpha_{t}} - \sqrt{1-\alpha_{t}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\bigr)\xi$, $\|\boldsymbol{\hat{x}}_{t}^{0} - \boldsymbol{x}_{ori}\|_{\infty} \leq \xi$, and $\|\boldsymbol{\hat{x}}_{0} - \boldsymbol{x}_{ori}\|_{\infty} \leq \xi$ hold true.

According to Theorem 1, the PC for $\boldsymbol{\hat{\epsilon}}_{t}$ is implemented as:

$$\boldsymbol{\hat{\epsilon}}_{t} = \text{PC}(\boldsymbol{\hat{\epsilon}}^{\prime}_{t}) = \mathcal{P}_{l_{\infty}\left(\boldsymbol{\epsilon}_{0},\, \frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi\right)}(\boldsymbol{\hat{\epsilon}}^{\prime}_{t}), \tag{10}$$

where $\mathcal{P}_{l_{\infty}(\boldsymbol{\epsilon}, \xi)}(\cdot)$ is a projection operation that maps the output $\boldsymbol{\hat{\epsilon}}^{\prime}_{t}$ of AMG$(\cdot)$ onto the $l_{\infty}$-norm ball centered at $\boldsymbol{\epsilon}_{0}$, yielding $\boldsymbol{\hat{\epsilon}}_{t}$ that satisfies Eq. (9). After PC, the diffusion noise $\boldsymbol{\hat{\epsilon}}_{t}$ for the next step is obtained, and $\boldsymbol{\hat{x}}_{t-1}$ can be calculated using the deterministic DDIM backward equation as:

$$\boldsymbol{\hat{x}}_{t-1} = \sqrt{\alpha_{t-1}} \left( \frac{\boldsymbol{\hat{x}}_{t} - \sqrt{1-\alpha_{t}}\, \boldsymbol{\hat{\epsilon}}_{t}}{\sqrt{\alpha_{t}}} \right) + \sqrt{1-\alpha_{t-1}}\, \boldsymbol{\hat{\epsilon}}_{t}. \tag{11}$$

The elaborate PC for $\boldsymbol{\hat{\epsilon}}_{t}$ directly cooperates with AMG to constrain the diffusion noise, which streamlines the whole diffusion process and can serve to simultaneously control the terms of $\boldsymbol{\hat{x}}_{t}$, $\boldsymbol{\hat{x}}_{t}^{0}$, and $\boldsymbol{\hat{x}}_{0}$, satisfying the premise that two trajectories are close and ensuring the effectiveness of AdvAD. The complete pseudo code of AdvAD is provided in Algorithm 1.
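To illustrate how AMG, PC, and the DDIM step interact, the loop of Algorithm 1 can be sketched on a toy problem. This is a minimal, dependency-free sketch under several assumptions that are not taken from the paper: a tiny linear Softmax classifier (`probs`) stands in for the attacked DNN so that the gradient is analytic; the schedule in `make_alphas` is a hypothetical linear-beta schedule with $\alpha_0 = 1$; Eq. (4) is assumed to be the usual forward process $\boldsymbol{\bar{x}}_{T} = \sqrt{\alpha_{T}}\boldsymbol{x}_{ori} + \sqrt{1-\alpha_{T}}\boldsymbol{\epsilon}_{0}$; Eq. (7) is assumed to be $\boldsymbol{\hat{x}}_{t}^{0} = (\boldsymbol{\hat{x}}_{t} - \sqrt{1-\alpha_{t}}\boldsymbol{\hat{\epsilon}}_{t+1})/\sqrt{\alpha_{t}}$ with $\boldsymbol{\hat{\epsilon}}_{t+1}$ treated as a constant; and the range conversions and final quantization are omitted.

```python
import math
import random

def make_alphas(T, beta_start=1e-4, beta_end=2e-2):
    """alpha_0..alpha_T with alpha_0 = 1 (hypothetical linear-beta schedule)."""
    alphas = [1.0]
    for s in range(1, T + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(T - 1, 1)
        alphas.append(alphas[-1] * (1.0 - beta))
    return alphas

def probs(W, x):
    """Toy linear Softmax classifier standing in for the attacked model f."""
    z = [sum(w * v for w, v in zip(row, x)) for row in W]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def grad_log_one_minus_p(W, x0, y):
    """Analytic gradient of log(1 - p_y(x0)) w.r.t. x0 for the toy classifier."""
    p = probs(W, x0)
    mean_w = [sum(p[k] * W[k][i] for k in range(len(W))) for i in range(len(x0))]
    scale = -p[y] / max(1.0 - p[y], 1e-12)
    return [scale * (W[y][i] - mean_w[i]) for i in range(len(x0))]

def advad_toy(W, x_ori, y, xi, T=200, seed=0):
    rng = random.Random(seed)
    a = make_alphas(T)
    eps0 = [rng.gauss(0.0, 1.0) for _ in x_ori]                # fixed diffusion noise
    radius = math.sqrt(a[T]) / math.sqrt(1.0 - a[T]) * xi      # PC radius, Eq. (9)
    x_hat = [math.sqrt(a[T]) * x + math.sqrt(1.0 - a[T]) * e   # assumed Eq. (4)
             for x, e in zip(x_ori, eps0)]
    eps_prev = eps0                                            # eps_hat_{T+1} := eps_0
    for t in range(T, 0, -1):
        sa, s1 = math.sqrt(a[t]), math.sqrt(1.0 - a[t])
        x0 = [(xh - s1 * e) / sa for xh, e in zip(x_hat, eps_prev)]   # assumed Eq. (7)
        # Chain rule: d x0 / d x_hat = 1 / sqrt(alpha_t).
        g = [gi / sa for gi in grad_log_one_minus_p(W, x0, y)]
        eps_amg = [e - s1 * gi for e, gi in zip(eps0, g)]             # AMG, Eq. (8)
        eps_hat = [e + min(max(ea - e, -radius), radius)              # PC, Eq. (10)
                   for e, ea in zip(eps0, eps_amg)]
        sa_p, s1_p = math.sqrt(a[t - 1]), math.sqrt(1.0 - a[t - 1])
        x_hat = [sa_p * (xh - s1 * eh) / sa + s1_p * eh               # DDIM, Eq. (11)
                 for xh, eh in zip(x_hat, eps_hat)]
        eps_prev = eps_hat
    return x_hat
```

Calling `advad_toy(W, x_ori, y, xi)` on a small two-class weight matrix returns a point that stays inside the $l_{\infty}$ ball of radius $\xi$ around `x_ori`, as Theorem 1 predicts, while lowering the toy classifier's probability for class `y`. Whether the label actually flips depends on the toy weights and budget, so this sketch makes no claim about attack success.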

Subsequently, based on Theorem 1, we further give two propositions about AdvAD as:

Proposition 1

Under the conditions of Theorem 1, by denoting the constrained $\boldsymbol{\hat{\epsilon}}_{t} = \boldsymbol{\epsilon}_{0} - \boldsymbol{\delta}_{t}$, we have

$$\boldsymbol{x}_{adv} = \boldsymbol{x}_{ori} + \sum_{t=1}^{T} \lambda_{t} \boldsymbol{\delta}_{t}, \tag{12}$$

where $\lambda_{t} = \frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}} - \frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{\alpha_{t-1}}}$, and $\|\boldsymbol{\delta}_{t}\|_{\infty} \leq \frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi$.
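A brief sketch of why Eq. (12) holds may help; the full proof is in the appendix, and this outline rests on assumptions not restated in this section: $\alpha_{0} = 1$, a forward process $\boldsymbol{\hat{x}}_{T} = \boldsymbol{\bar{x}}_{T} = \sqrt{\alpha_{T}}\boldsymbol{x}_{ori} + \sqrt{1-\alpha_{T}}\boldsymbol{\epsilon}_{0}$, and quantization ignored. The shorthand $\sigma_{t} := \sqrt{1-\alpha_{t}}/\sqrt{\alpha_{t}}$ is introduced here only for readability, so that $\lambda_{t} = \sigma_{t} - \sigma_{t-1}$.

```latex
\begin{align*}
\frac{\boldsymbol{\hat{x}}_{t-1}}{\sqrt{\alpha_{t-1}}}
  &= \frac{\boldsymbol{\hat{x}}_{t}}{\sqrt{\alpha_{t}}} - \lambda_{t}\,\boldsymbol{\hat{\epsilon}}_{t}
   = \frac{\boldsymbol{\hat{x}}_{t}}{\sqrt{\alpha_{t}}} - \lambda_{t}\,\boldsymbol{\epsilon}_{0} + \lambda_{t}\,\boldsymbol{\delta}_{t}
   && \text{(Eq. (11) divided by } \sqrt{\alpha_{t-1}}\text{)} \\
\boldsymbol{\hat{x}}_{0}
  &= \frac{\boldsymbol{\hat{x}}_{T}}{\sqrt{\alpha_{T}}} - \sigma_{T}\,\boldsymbol{\epsilon}_{0} + \sum_{t=1}^{T} \lambda_{t}\,\boldsymbol{\delta}_{t}
   = \boldsymbol{x}_{ori} + \sum_{t=1}^{T} \lambda_{t}\,\boldsymbol{\delta}_{t}
   && \text{(sum over } t = T, \dots, 1;\ \textstyle\sum_{t} \lambda_{t} = \sigma_{T}\text{)} \\
\Bigl\| \sum_{t=1}^{T} \lambda_{t}\,\boldsymbol{\delta}_{t} \Bigr\|_{\infty}
  &\leq \sigma_{T} \cdot \frac{\xi}{\sigma_{T}} = \xi
   && \text{(using } \lambda_{t} \geq 0 \text{ and the bound on } \boldsymbol{\delta}_{t}\text{)}
\end{align*}
```

The last line recovers the third bound of Theorem 1, and it shows that each step contributes at most $\lambda_{t}\,\xi/\sigma_{T}$ to the final perturbation.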

Proposition 2

Under the conditions of Theorem 1, the upper bound on the error of the approximation in Eq. (7) can be expressed as

$$\left\| \boldsymbol{x}_{adv} - \boldsymbol{\hat{x}}_{t}^{0} \right\|_{\infty} \leq 2 \cdot \frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}} \frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi. \tag{13}$$

Proposition 1 explicitly states the much subtler and decreasing strength of the adversarial guidance injected at each step of AdvAD's non-parametric diffusion process, and also allows for a quantitative analysis (as in Sec. 5.5). Proposition 2 indicates the validity and convergence of the approximation $\boldsymbol{x}_{adv} \approx \boldsymbol{\hat{x}}_{t}^{0}$ in AMG (Eq. (7)). As $t$ goes from $T$ to $1$, $\alpha_{t}$ increases from $0$ to $1$, and the upper bound on the approximation error rapidly converges from $2\xi$ to $0$. The detailed derivation of the mentioned PC for $\boldsymbol{\hat{x}}_{t}$ and the proofs of Theorem 1 and Propositions 1 and 2 are provided in Appendix B.
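The quantities in Propositions 1 and 2 are easy to tabulate for a concrete schedule. The sketch below is illustrative only: `make_alphas` is a hypothetical linear-beta schedule with $\alpha_0 = 1$ (the paper's actual coefficients are not given in this section), and `sigma` is a shorthand introduced here for $\sqrt{1-\alpha}/\sqrt{\alpha}$. It computes the step weights $\lambda_{t}$, checks that the total per-step budget telescopes to $\xi$, and evaluates the right-hand side of Eq. (13) at a few steps.

```python
import math

def make_alphas(T, beta_start=1e-4, beta_end=2e-2):
    """alpha_0..alpha_T with alpha_0 = 1 (hypothetical linear-beta schedule)."""
    alphas = [1.0]
    for s in range(1, T + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / max(T - 1, 1)
        alphas.append(alphas[-1] * (1.0 - beta))
    return alphas

def sigma(alpha):
    """Shorthand (introduced here): sqrt(1 - alpha) / sqrt(alpha)."""
    return math.sqrt(1.0 - alpha) / math.sqrt(alpha)

def step_weights(alphas):
    """lambda_t from Proposition 1 for t = 1..T (result index 0 is t = 1)."""
    return [sigma(alphas[t]) - sigma(alphas[t - 1]) for t in range(1, len(alphas))]

def approx_error_bound(alphas, t, xi):
    """Right-hand side of Eq. (13) at step t."""
    T = len(alphas) - 1
    return 2.0 * sigma(alphas[t]) / sigma(alphas[T]) * xi

alphas = make_alphas(1000)
xi = 8 / 255
lam = step_weights(alphas)
radius = xi / sigma(alphas[-1])      # per-step noise budget from Eq. (9)
total_budget = sum(lam) * radius     # telescopes to xi
bounds = [approx_error_bound(alphas, t, xi) for t in (1000, 500, 100, 1)]
```

Under this assumed schedule, `bounds` starts at $2\xi$ for $t = T$ and shrinks monotonically toward zero as $t$ approaches 1, which is the convergence behaviour described above.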

3.4 AdvAD to AdvAD-X: Extreme Version

Building upon AdvAD, we further propose a scheme called AdvAD-X (‘X’ for ‘eXtreme’) with two extra strategies called Dynamic Guidance Injection (DGI) and CAM Assistance (CA), aiming to squeeze the extreme performance of our novel modeling framework in an ideal scenario that is usually overlooked but has theoretical significance.

DGI and CA Strategies.

As mentioned above, the attack capability of AdvAD comes from the very subtle yet effective adversarial guidance crafted by AMG and PC, and the intensity of the guidance decreases to 0 as the process progresses. DGI therefore emerges naturally as a dynamic skipping strategy that omits unnecessary calculation and injection of adversarial guidance, especially for steps late in the process. With DGI, AdvAD-X dynamically avoids the execution of AMG and PC and adopts the original $\boldsymbol{\epsilon}_{0}$ as the diffusion noise for the steps where $\boldsymbol{\hat{x}}_{t}^{0} \approx \boldsymbol{x}_{adv}$ is already able to mislead the attacked model, reducing the accumulated guidance strength as well as the computational complexity. On the other hand, inspired by Class Activation Mapping (CAM) [36], which identifies the image regions critical to a classifier's decision, our CA strategy calculates a mask (if available) $\boldsymbol{m}$ ranging from $0$ to $1$ of $\boldsymbol{x}_{ori}$ with $f(\cdot)$ and $y_{gt}$ using GradCAM [37] to further suppress the strength of adversarial guidance within the non-critical image regions in those steps that are not skipped. The equation of AMG with the CA strategy can be modified as:

$$\boldsymbol{\hat{\epsilon}}^{\prime}_{t} = \boldsymbol{\epsilon}_{0} - \boldsymbol{m} \cdot \sqrt{1-\alpha_{t}}\, \nabla_{\boldsymbol{\hat{x}}_{t}} \log\bigl(1 - p_{f}(y_{gt} \mid \boldsymbol{\hat{x}}_{t}^{0})\bigr). \tag{14}$$
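The two strategies can be summarized in a few lines. The following is a hedged sketch rather than the authors' implementation: the function name `advadx_guided_noise` and its arguments are hypothetical, the gradient `grad` and the predicted label at $\boldsymbol{\hat{x}}_{t}^{0}$ are assumed to be computed elsewhere, and the mask is applied element-wise as in Eq. (14).

```python
import math

def advadx_guided_noise(eps0, grad, alpha_t, pred_label, y_gt, mask=None):
    """Return (noise, skipped) for one AdvAD-X step.

    DGI: if the model already mispredicts x_hat_t^0, skip guidance and reuse eps0.
    CA:  otherwise scale the guidance element-wise by a mask in [0, 1] (Eq. (14)).
    """
    if pred_label != y_gt:
        # Guidance is unnecessary at this step; AMG and PC are both skipped.
        return list(eps0), True
    if mask is None:
        # No CAM mask available: fall back to the unmasked AMG of Eq. (8).
        mask = [1.0] * len(eps0)
    coeff = math.sqrt(1.0 - alpha_t)
    noise = [e - m * coeff * g for e, m, g in zip(eps0, mask, grad)]
    # In a non-skipped step, the PC projection of Eq. (10) would follow here.
    return noise, False
```

A mask entry of 0 leaves the corresponding noise component equal to $\boldsymbol{\epsilon}_{0}$, so no guidance reaches that pixel, while an entry of 1 reproduces the unmasked update.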

Ideal Scenario.

Equipped with DGI, AdvAD-X omits a large amount of the adversarial guidance that is injected by default in AdvAD, while the absolute strength of the guidance in each of the remaining steps is also suppressed by CA, successfully reducing the final adversarial perturbation to an extreme level. This extreme case leads to a problem: in the default setting of attacking with 8-bit RGB images, the adversarial perturbation of pixels where its intensity is less than $0.5$ will be erased by quantization. However, in practice, the input of DNNs is normalized to a floating-point data type to avoid gradient problems during training [38, 39], and a white-box attack allows access to the entire DNN. Therefore, for AdvAD-X, we specifically consider an ideal scenario that directly inputs the raw final adversarial example as floating-point data to DNNs without quantization, in order to evaluate the extreme performance of AdvAD-X. The pseudo code of AdvAD-X is provided in Appendix C.
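The quantization effect is easy to reproduce on a handful of pixel values. The numbers below are made up for illustration, and `to_uint8` is a hypothetical helper that rounds to the nearest integer and clamps to the 8-bit range.

```python
import math

def to_uint8(x):
    """Round to the nearest integer and clamp to the 8-bit range [0, 255]."""
    return [min(255, max(0, math.floor(v + 0.5))) for v in x]

x_ori = [100, 37, 200]        # original 8-bit pixel values (illustrative)
delta = [0.3, -0.4, 0.6]      # floating-point perturbation on the [0, 255] scale

x_adv_float = [p + d for p, d in zip(x_ori, delta)]
x_adv_8bit = to_uint8(x_adv_float)

# Perturbation that survives quantization, per pixel.
surviving = [q - p for q, p in zip(x_adv_8bit, x_ori)]
```

Here `surviving` is `[0, 0, 1]`: the two perturbations with magnitude below 0.5 are erased, and the third is rounded up to a full intensity level. Feeding `x_adv_float` to the model directly, as in the ideal scenario, preserves all three.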

4 Experiments

4.1 Experimental Setup

Dataset. In line with prior studies [19, 40, 15, 32], our experiments are conducted on the ImageNet-compatible Dataset (https://github.com/cleverhans-lab/cleverhans/tree/master/cleverhans_v3.1.0/examples/nips17_adversarial_competition/dataset), which contains 1,000 images of ImageNet [41] classes with a size of $299 \times 299$; the images are resized to the standard input size of $224 \times 224$ in all experiments. Models. We select the widely used CNNs ResNet-50 [42] and the enhanced ConvNeXt-Base [43], Swin Transformer-Base [44] with the Transformer [45] architecture, and VisionMamba-Small [46] with the recently emerged Mamba [47] architecture. Attacks. We choose the classic PGD [12] and seven attacks that claim imperceptibility as comparison methods, including the normal imperceptible attacks AdvDrop [21], PerC-AL [19], and SSAH [24], and the unrestricted attacks NCF [40], ACA [32], DiffAttack [31], and Diff-PGD [30]; the last three utilize the generative capability of diffusion models. For our proposed AdvAD and AdvAD-X, we set $\xi = 8/255$ and $T = 1000$ for all experiments unless specifically mentioned. All the other comparison methods are evaluated using their official open-source code with the default hyper-parameters. The results of AdvAD-X are obtained in the ideal scenario with floating-point raw data as described in Sec. 3.4. Evaluation Metrics. Attack success rate (ASR) is used to evaluate the attack efficacy, and seven metrics are adopted to comprehensively assess the imperceptibility: $l_{2}$ and $l_{\infty}$ distances for absolute perturbation strength; Peak Signal-to-Noise Ratio (PSNR) and Structure Similarity (SSIM) [48]; and three network-based metrics, i.e., Learned Perceptual Image Patch Similarity (LPIPS) [49], Fréchet Inception Distance (FID) [50], and a non-reference metric MUSIQ [51] for image quality.
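For reference, the three closed-form metrics can be sketched as below. This is a minimal illustration, not the evaluation code used in the paper: it assumes images are flattened to lists of floats in $[0, 1]$ (an assumption that appears consistent with PGD's reported $l_{\infty}$ of 0.031, roughly $8/255$), and the function names are hypothetical. SSIM, LPIPS, FID, and MUSIQ need dedicated implementations and are not sketched here.

```python
import math

def linf_distance(a, b):
    """Largest absolute per-pixel difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean norm of the perturbation over the whole image."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def psnr(a, b, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB; infinite for identical images."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

Lower $l_{\infty}$ and $l_{2}$ values and higher PSNR indicate a less perceptible perturbation, matching the arrows in the header of Table 1.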

Table 1: Results of untargeted white-box attack success rate (ASR) and other evaluation metrics for imperceptibility when employing different attacks and attacked models. The reported running times are obtained using an RTX 3090 GPU on the same machine. † indicates that the results of AdvAD-X are obtained with the floating-point data type in the ideal scenario described in Sec. 3.4.
Model Attack Method Time (s) \downarrow ASR (%percent\%%) \uparrow lsubscript𝑙l_{\infty}italic_l start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT \downarrow l2subscript𝑙2l_{2}italic_l start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT \downarrow  PSNR \uparrow  SSIM \uparrow  FID \downarrow  LPIPS \downarrow  MUSIQ \uparrow
ResNet-50 [42] PGD [12] 25 98.6 0.031 8.17 33.53 0.8830 35.25 0.0517 52.24
NCF [40] 2739 89.9 0.783 75.16 14.79 0.6374 58.99 0.3052 49.12
ACA [32] 82239 89.8 0.839 52.42 18.00 0.5659 69.57 0.3381 55.47
DiffAttack [31] 34954 96.6 0.743 30.51 22.63 0.6750 55.29 0.1130 55.67
DiffPGD [30] 6057 92.1 0.246 11.43 30.95 0.8902 22.18 0.0315 55.05
AdvDrop [21] 193 96.8 0.062 3.17 41.91 0.9872 5.57 0.0061 54.96
PerC-AL [19] 4085 98.8 0.131 2.05 46.35 0.9894 8.62 0.0029 55.84
SSAH [24] 428 99.7 0.033 2.65 43.73 0.9911 4.48 0.0021 55.49
AdvAD (ours) 2201 99.7 0.010 1.06 51.84 0.9980 2.42 0.0005 56.35
AdvAD-X(ours)superscriptAdvAD-Xbold-†(ours)\text{AdvAD-X}^{{\color[rgb]{0,0,1}\boldsymbol{{\dagger}}}}\text{(ours)}AdvAD-X start_POSTSUPERSCRIPT bold_† end_POSTSUPERSCRIPT (ours) 806 100.0 0.002 0.34 63.62 0.9997 0.23 0.0001 56.59
ConvNeXt -Base [43] PGD [12] 127 99.9 0.031 7.98 33.74 0.8845 32.03 0.0386 51.85
NCF [40] 5222 59.4 0.750 72.89 15.10 0.6616 50.52 0.2846 49.70
ACA [32] 83149 82.2 0.835 52.16 18.05 0.5676 68.45 0.3421 55.11
DiffAttack [31] 35417 97.8 0.754 31.70 22.28 0.6610 72.22 0.1277 54.80
DiffPGD [30] 6325 76.9 0.245 11.45 30.94 0.8908 21.05 0.0306 54.75
AdvDrop [21] 838 96.9 0.057 3.26 41.69 0.9864 6.42 0.0055 54.80
PerC-AL [19] 18271 10.3 - - - - - - -
SSAH [24] 3423 84.6 0.026 2.24 45.19 0.9928 3.04 0.0011 55.78
AdvAD (ours) 15240 100.0 0.016 1.49 48.61 0.9964 5.07 0.0009 55.97
AdvAD-X(ours)superscriptAdvAD-Xbold-†(ours)\text{AdvAD-X}^{{\color[rgb]{0,0,1}\boldsymbol{{\dagger}}}}\text{(ours)}AdvAD-X start_POSTSUPERSCRIPT bold_† end_POSTSUPERSCRIPT (ours) 5245 99.8 0.004 0.64 58.01 0.9993 0.62 0.0001 56.43
Swin Trans. -Base [44] PGD [12] 93 98.5 0.031 7.85 33.88 0.8861 21.34 0.0378 51.91
NCF [40] 4690 63.7 0.733 69.92 15.48 0.6822 47.17 0.2709 49.77
ACA [32] 83706 79.6 0.831 50.70 18.31 0.5757 64.83 0.3341 55.65
DiffAttack [31] 36736 89.7 0.741 30.45 22.67 0.6727 53.32 0.1143 55.72
DiffPGD [30] 6499 69.1 0.244 11.26 31.10 0.8945 16.19 0.0276 55.25
AdvDrop [21] 673 97.2 0.063 3.37 41.43 0.9853 5.22 0.0065 54.73
PerC-AL [19] 15258 95.6 0.144 2.15 45.93 0.9882 3.53 0.0015 55.66
SSAH [24] 1737 96.3 0.035 2.41 44.60 0.9927 2.57 0.0010 55.53
AdvAD (ours) 9729 100.0 0.013 1.19 50.57 0.9978 1.70 0.0004 56.17
AdvAD-X† (ours) 5243 99.7 0.005 0.52 60.29 0.9995 0.25 0.0001 56.47
VisionMamba-Small [46] PGD [12] 63 95.7 0.031 7.99 33.73 0.8884 26.09 0.0503 52.37
NCF [40] 3919 71.7 0.738 68.71 15.68 0.6876 46.07 0.2629 50.05
ACA [32] 96851 84.2 0.831 50.88 18.28 0.5753 65.77 0.3329 55.28
DiffAttack [31] 43043 90.9 0.749 30.94 22.52 0.6693 52.16 0.1179 55.66
DiffPGD [30] 7638 83.4 0.248 11.75 30.68 0.8845 21.02 0.0378 54.19
AdvDrop [21] 1311 97.0 0.076 4.42 39.30 0.9761 8.02 0.0086 54.34
PerC-AL [19] 10400 6.5 - - - - - - -
SSAH [24] 1204 49.8 0.028 1.95 46.41 0.9946 2.08 0.0018 55.96
AdvAD (ours) 6154 99.7 0.016 1.62 47.94 0.9960 3.67 0.0017 56.17
AdvAD-X† (ours) 4021 99.4 0.005 0.69 58.90 0.9989 0.51 0.0004 56.50
Figure 2: Visualizations of adversarial examples and corresponding perturbations crafted by nine imperceptible attacks. Perturbations are amplified as marked in the top-right corner for ease of observation. Please zoom in to observe the details of the images, whose original resolution is $224 \times 224$.

4.2 Comparison with State-of-the-art Methods

White-Box Attacks.

Table 1 reports the untargeted attack performance and imperceptibility of ten methods against four attacked models. The proposed AdvAD, with its novel modeling framework, consistently demonstrates superior performance in terms of both ASR and imperceptibility. Among the normal imperceptible adversarial attacks, the absolute adversarial perturbation strength of AdvAD in $l_\infty$ and $l_2$ distance is only 0.014 and 1.34 on average, about half that of the state-of-the-art restricted imperceptible attack SSAH, while AdvAD maintains almost 99.9% ASR. This supports our key idea that AdvAD inherently reduces the perturbation strength required for attacks from a modeling perspective. When attacking more advanced models, from ResNet to VisionMamba, AdvAD always demonstrates the best ASR and imperceptibility, whereas other methods tend to suffer some performance degradation (e.g., PerC-AL and SSAH for ConvNeXt and VisionMamba). Unrestricted attacks are expected to perform poorly on the quantitative metrics, but if the results are poor for all image quality metrics, it usually indicates that the images are damaged. Meanwhile, since the optimizer may not find the globally optimal solution, the optimization-based methods tend to show sub-optimal ASR. For AdvAD-X, surprisingly, the perturbation strength is reduced to an extremely low level while attack efficacy remains high in the ideal scenario with floating-point raw data.
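To make the perturbation-strength columns concrete, the following is a minimal sketch of how $l_\infty$, $l_2$ and PSNR could be computed for one image pair. It assumes images are float arrays with values in $[0, 1]$ and that $l_2$ is taken over the whole image; the function name `perturbation_metrics` is hypothetical, and the exact conventions used for Table 1 may differ.

```python
import numpy as np

def perturbation_metrics(x_ori, x_adv):
    """Return (l_inf, l_2, PSNR in dB) for two images with values in [0, 1].

    Assumption: l_2 is the Euclidean norm over all pixels and channels,
    and PSNR uses a peak value of 1.0.
    """
    delta = x_adv.astype(np.float64) - x_ori.astype(np.float64)
    l_inf = float(np.abs(delta).max())        # largest per-pixel change
    l_2 = float(np.sqrt(np.sum(delta ** 2)))  # overall perturbation energy
    mse = float(np.mean(delta ** 2))
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(1.0 / mse))
    return l_inf, l_2, psnr

# Toy input (not data from the paper): a uniform offset of 0.01
# on a 224 x 224 x 3 image.
x_ori = np.full((224, 224, 3), 0.5)
x_adv = x_ori + 0.01
l_inf, l_2, psnr = perturbation_metrics(x_ori, x_adv)
```

For this toy input the perturbation is the same at every pixel, so $l_2 = 0.01\sqrt{224 \cdot 224 \cdot 3} \approx 3.88$ and the PSNR is 40 dB. Under the same assumptions, an attack with a lower $l_2$ at a comparable $l_\infty$ spreads less total energy over the image.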

Visualization.

The visualizations of adversarial examples against ResNet-50 in Figure 2 clearly show the characteristics of the different imperceptible attacks. For the first image, which contains a relatively simple and clear object, the unrestricted attacks NCF, DiffAttack and ACA perform attacks by modifying the semantics in a fairly reasonable way, while DiffPGD uses denoising to avoid significant semantic modifications but often has lower ASR, as shown in Table 1. However, for the image with complex content, the unrestricted attacks result in obviously unnatural colors, textures, artifacts and semantic changes. For the normal attacks with perception-based restrictions, the amplified noise shows that AdvDrop has an obvious gridding effect due to the blocking step of the DCT operation, and that the perturbation strength of PerC-AL and SSAH is related to the edge or texture components of the image. In contrast, our AdvAD consistently maintains a uniform and lower perturbation that is very difficult to see even in the adversarial examples with $\times 5$ noise. For AdvAD-X, the perturbations are very slight modifications to the decimal places of the floating-point raw data of each pixel, so they remain difficult to see even after $\times 100$ magnification. More quantitative comparisons and visualizations are provided in Appendix D.1 and D.2.

Table 2: Results of ASR against defenses for robustness evaluation, including three post-processing purification methods and four adversarial training white-box robust models.
Attack Method Post Purifications (Normal Res-50) Attack Adversarial Training Model All Avg.
NRP [4]  DS [52] Diffusion [5] Avg. Inc-V3 [8] Res-50 [9] Swin-B [53] ConvNeXt-B [53] Avg.
AdvDrop [21] 50.2 30.1 37.1 39.1 93.7 72.4 31.2 37.3 58.7 50.3
PerC-AL [19] 30.3  28.8 25.4 28.2 99.9 46.1 8.2 7.0 40.3 35.1
SSAH [24] 25.6  28.0 11.0 21.5 91.2 84.6 16.8 47.4 60.0 43.5
AdvAD (ours) 51.5 29.5 31.2 37.4 98.9 79.3 60.2 62.7 75.3 59.0
AdvAD-X† (ours) 13.4 27.6 10.2 17.1 57.2 45.2 18.0 16.2 34.2 26.8
Figure 3: Robustness on JPEG compression and Bit-depth reduction with different factors.
Table 4: Results of imperceptible attacks against random smoothing defense. Adversarial examples are crafted using only the base model, and then 100 rounds of random smoothing are applied to obtain the final ASR. $\sigma$ is the variance of smoothing noise.
$\sigma = 0.25$ $\sigma = 0.50$ $\sigma = 1.00$
Attack ASR ↑ $l_2$ ↓ ASR ↑ $l_2$ ↓ ASR ↑ $l_2$ ↓
clean 17.3 - 30.3 - 46.8 -
AdvDrop 25.2 5.97 33.5 6.21 48.7 5.61
SSAH 21.8 13.84 32.4 14.82 46.9 13.68
AdvAD (ours) 28.2 2.41 36.8 2.51 50.4 2.08

4.3 Robustness

The robustness of attacks is also evaluated against defense methods, including the purification methods NRP [4], DS [52] and diffusion-based purification [5], and the adversarially trained robust models Inc-V3 [8], Res-50 [9], Swin-B [53] and ConvNeXt-B [53]. Two classic image transformation defenses, JPEG compression [54] and Bit-depth reduction [55], and another type of defense, random smoothing [11], are also included. Since the robustness and transferability of attacks are comparable only under a similar perturbation budget, the unrestricted attacks are not included in this and the next section.
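As an illustration of why image transformation defenses can erase very small perturbations, here is a minimal sketch of bit-depth reduction. It assumes images in $[0, 1]$ and uniform quantization to $2^{\text{bits}}$ levels; the function name `reduce_bit_depth` and the toy values are hypothetical, and the exact defense configuration used in the experiments may differ.

```python
import numpy as np

def reduce_bit_depth(x, bits):
    """Quantize an image with values in [0, 1] to 2**bits levels per channel."""
    levels = 2 ** bits - 1
    return np.round(np.clip(x, 0.0, 1.0) * levels) / levels

# Toy pixel values (not data from the paper).
x = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
x_q = reduce_bit_depth(x, bits=3)

# A perturbation much smaller than one quantization step is removed
# whenever it does not push a pixel across a rounding boundary.
x_q_perturbed = reduce_bit_depth(x + 0.001, bits=3)
```

In this toy case both calls yield the same quantized values, which is the mechanism such defenses rely on. An attack survives only if enough pixels still cross a rounding boundary after quantization.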

As shown in Table 2, the proposed AdvAD demonstrates the best robustness in overall average compared with the other imperceptible attacks AdvDrop, PerC-AL and SSAH. Specifically, when attacking robust models, AdvAD achieves a much higher average ASR of 75.3%. For post-processing purifications that aim to eliminate adversarial perturbations, despite its inherently lower perturbation strength, AdvAD still maintains the best or second-best ASR against the different purifications, comparable to AdvDrop, which has much higher perturbation strength. Similarly, for the results of the classic image transformation defenses in Figure 3, AdvAD also exhibits advantages for most of the factors. In addition, since random smoothing is not a truly end-to-end method but one that uses the base model to make multiple predictions on noise-augmented images, we adopt a semi-white-box setup to fully test the attack performance, as described in the caption. Table 4 shows the experimental results; PerC-AL is not included because it fails to attack in this setting. For all $\sigma$, our AdvAD consistently achieves the best ASR with smaller perturbation strength.
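The random smoothing prediction described above (repeated predictions of the base model on noise-augmented copies of the input) can be sketched as follows. This is an illustrative majority-vote version under stated assumptions, not the evaluation code of the paper: `smoothed_predict` and `toy_classifier` are hypothetical names, the classifier is a stand-in for a DNN, and `sigma` is passed to the generator as the standard deviation of the Gaussian noise.

```python
import numpy as np

def smoothed_predict(base_predict, x, sigma, n_rounds=100, seed=0):
    """Majority vote of base_predict over Gaussian-noised copies of x.

    Assumption: sigma is used as the standard deviation of the noise.
    """
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n_rounds):
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        label = int(base_predict(noisy))
        votes[label] = votes.get(label, 0) + 1
    # Return the label predicted most often across the noisy copies.
    return max(votes, key=votes.get)

def toy_classifier(img):
    """Hypothetical stand-in for a DNN: class 1 if the mean intensity > 0.5."""
    return int(img.mean() > 0.5)

x = np.full((8, 8), 0.9)  # toy "image", not data from the paper
pred = smoothed_predict(toy_classifier, x, sigma=0.25)
```

In the semi-white-box setup, the adversarial example would be crafted against the base model alone and then passed through such a voting procedure. The attack counts as successful only if the voted label is still wrong.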

Figure 5: More results of (a) effect of step $T$ on AdvAD and (b) transferability-imperceptibility relationship of attacks.

We suppose that the robustness of AdvAD mainly benefits from two aspects. First, AdvAD performs attacks during a unique non-parametric diffusion process with adversarial guidance, which may more easily break through existing adversarially trained models that were trained with common attack paradigms. Second, the inherently lower perturbation crafted by AdvAD is spread across the image more uniformly rather than gathering in certain areas, as can be seen in the visualization, making it more difficult to eliminate. AdvAD-X is expected to exhibit weak robustness, since its extremely low perturbation in the ideal scenario is easy to defend against.

Table 3: Transferability and effect of $T$ of the proposed AdvAD. * means white-box ASR.
Model Attack Method Res-50 Mob-V2 Inc-V3 VGG-19 $l_2$ ↓ PSNR ↑ SSIM ↑ FID ↓ LPIPS ↓
Res-50 [42] SSAH [24] 99.7* 15.5 20.4 12.7 2.65 43.73 0.9911 4.48 0.0021
AdvAD ($T$=1000) 99.7* 18.3 22.6 15.1 1.06 51.84 0.9980 2.42 0.0005
AdvDrop [21] 96.8* 17.3 23.1 15.8 3.17 41.91 0.9872 5.57 0.0061
PerC-AL [19] 98.8* 22.4 23.8 17.4 2.05 46.35 0.9894 8.62 0.0029
AdvAD ($T$=100) 100.0* 23.5 24.9 19.9 1.97 46.04 0.9912 7.15 0.0026
PGD [12] 98.6* 41.4 36.7 36.0 8.17 33.53 0.8830 35.25 0.0517
AdvAD ($T$=10) 100.0* 44.3 37.6 42.9 7.21 34.63 0.9015 30.84 0.0547
Mob-V2 [56] SSAH [24] 7.7 97.8* 19.8 11.6 2.18 45.24 0.9930 2.95 0.0016
AdvAD ($T$=1000) 9.7 99.7* 21.3 14.8 0.94 53.08 0.9982 1.46 0.0004
AdvDrop [21] 9.7 97.7* 22.7 15.0 3.16 41.94 0.9873 4.88 0.0064
PerC-AL [19] 12.7 99.8* 23.3 17.8 2.16 45.67 0.9879 8.77 0.0032
AdvAD ($T$=100) 12.2 100.0* 23.4 17.9 1.83 46.68 0.9919 4.73 0.0020
PGD [12] 29.9 99.9* 35.3 37.9 8.29 33.41 0.8803 34.57 0.0500
AdvAD ($T$=10) 30.6 100.0* 35.3 38.5 7.23 34.60 0.9006 27.25 0.0480

4.4 Transferability and Effect of Step $T$ on AdvAD

Table 3 reports the ASRs of black-box attacks and the corresponding imperceptibility results. We also test AdvAD with different numbers of steps $T$ for a comprehensive evaluation. Consistent with diffusion models, a larger $T$ denotes a finer decomposition granularity of the entire process, which in turn determines the strength of the adversarial guidance at each step. Thus, AdvAD with a larger $T$ exhibits better imperceptibility, while a smaller $T$ implies stronger black-box transferability. Notably, though there is a clear negative correlation between imperceptibility and transferability, our AdvAD exceeds all comparison attacks in both transferability and imperceptibility at different comparable levels, demonstrating the effectiveness of the proposed novel modeling framework.

To further elaborate on the relationship between transferability and imperceptibility of AdvAD, as well as the optimal trade-off in practice, we plot two line graphs in Figure 5 under more values of $T$. As shown in Figure 5 (a), as the value of $T$ on the horizontal axis changes, the relationship between imperceptibility and transferability follows the clear trade-off mentioned above, consistently across different surrogate models. For the optimal trade-off, we consider that the intersection point of the two curves represents a balance between imperceptibility and transferability. Accordingly, for the ResNet-50 and MobileNetV2 models, the optimal values of $T$ are 50 and 25, respectively. Moreover, Figure 5 (b) illustrates this relationship more directly, together with the positions of the other comparison methods within it. Note that all the other comparison methods are located to the lower left of the curve of AdvAD. This indicates that our method consistently achieves the best results in both transferability and imperceptibility compared with other state-of-the-art restricted imperceptible attacks, demonstrating the effectiveness of AdvAD as a new and flexible attack framework based on the proposed non-parametric diffusion process.

Figure 6: Values of $\lambda_t$ (left) and $\|\boldsymbol{\delta}_t\|_\infty$ (right) of Eq. (12) throughout the diffusion process.
Table 7: Results of AdvAD and AdvAD-X with smaller $\xi$ against Res-50. Step $T$ is fixed as 1000.
$\xi$ Attack ASR ↑ $l_2$ ↓ PSNR ↑ SSIM ↑ FID ↓
4/255 AdvAD 98.6 0.93 53.27 0.9986 1.78
AdvAD-X† 100.0 0.29 65.07 0.9998 0.18
2/255 AdvAD 96.1 0.82 54.85 0.9989 1.33
AdvAD-X† 99.4 0.27 65.95 0.9998 0.15
1/255 AdvAD 87.4 0.66 57.87 0.9993 0.77
AdvAD-X† 94.8 0.26 66.42 0.9998 0.14

4.5 Analysis

Eq. (12) in Practice. With the derived analytical formulation of Proposition 1, Figure 6 illustrates the actual values of $\lambda_t$ and $\|\boldsymbol{\delta}_t\|_\infty$ of Eq. (12) using 100 randomly selected images. While Proposition 1 indicates that the upper bound of $\|\boldsymbol{\delta}_t\|_\infty$ is invariant with respect to step $t$, the actual strength of the adversarial guidance produced by AMG rapidly decreases as the process progresses, which validates the unique property given at the end of Sec. 3.2. With the similarly decreasing $\lambda_t$, the whole term $\lambda_t\|\boldsymbol{\delta}_t\|_\infty$, representing the $l_\infty$ distance of the guidance at step $t$, also decreases from about 0.0008 to 0, supporting that the proposed modeling framework performs imperceptible attacks with inherently small perturbation strength.

Performance with Smaller $\xi$. The results of AdvAD and AdvAD-X with smaller $\xi$ for the PC module are shown in Table 7. As $\xi$ decreases from 8/255 to 2/255, the imperceptibility naturally improves because the upper bound of the perturbation becomes lower, yet the ASR drops only slightly (AdvAD still reaches 96.1% at $\xi = 2/255$). When $\xi = 1/255$, AdvAD still holds 87.4% ASR with 57.87 PSNR and 0.9993 SSIM, which means a large number of examples can still fool the DNN with a maximum modification of $\pm 1$ for each pixel, demonstrating the effectiveness of the adversarial guidance injected into the proposed diffusion process for attacks. Moreover, we provide the ablation study of AdvAD-X and additional discussions in Appendix D.3 and D.4.
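The $\xi = 1/255$ case can be checked with a short calculation. The sketch below assumes pixel values in $[0, 1]$ and PSNR with a peak value of 1; the helper names are hypothetical. Because an $l_\infty$ budget of $\xi$ bounds the mean squared error by $\xi^2$, it implies a PSNR of at least $20\log_{10}(1/\xi)$.

```python
import math

def psnr_lower_bound(xi):
    """Worst-case PSNR (dB) when every pixel changes by exactly xi (MSE = xi**2)."""
    return 20.0 * math.log10(1.0 / xi)

def rms_from_psnr(psnr_db):
    """Root-mean-square perturbation implied by a PSNR value (peak = 1)."""
    return 10.0 ** (-psnr_db / 20.0)

xi = 1 / 255
max_change_8bit = round(xi * 255)         # at most one 8-bit intensity level
bound = psnr_lower_bound(xi)              # guaranteed by the l_inf budget alone
rms_levels = rms_from_psnr(57.87) * 255   # reported PSNR, in 8-bit levels
```

The bound evaluates to about 48.13 dB, so the reported 57.87 dB lies well above the worst case permitted by the budget. The corresponding root-mean-square change is roughly a third of one 8-bit level. This conversion is only approximate, since the reported PSNR is an average over images.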

5 Conclusion and Outlook

In this paper, we propose a novel, fundamental modeling framework distinct from existing paradigms to tackle the challenge of imperceptible attacks. By exploring and deriving basic theory of diffusion models, the proposed AdvAD performs attacks through a non-parametric diffusion process with adversarial guidance, achieving inherently lower overall perturbation strength with high attack efficacy from a modeling perspective. Besides, the proposed AdvAD-X evaluates the extreme of this novel modeling framework and further reduces the perturbation strength to an extremely low level in an ideal scenario. Extensive experimental results support the effectiveness and progressiveness of the proposed methods. Beyond imperceptibility, AdvAD holds the potential to become a general and extensible attack paradigm thanks to the solid theoretical foundation and the innovative, controllable diffusion-based process for attacks. In addition, we also hope the new observation that AdvAD-X can successfully attack with extremely small perturbation using floating-point raw data can bring inspiration for revealing the robustness and interpretability (e.g., decision boundaries) of DNNs.

Acknowledgments and Disclosure of Funding

This work was supported by NSFC (Grant No. 62072484), Natural Science Foundation of Guangdong Province (Grant No. 2514050000889) and Guangdong Key Laboratory of Information Security (No. 2023B1212060026).

References

  • Szegedy et al. [2014] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
  • Goodfellow et al. [2015] I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
  • Yuan et al. [2019] X. Yuan, P. He, Q. Zhu, and X. Li. Adversarial examples: Attacks and defenses for deep learning. IEEE Transactions on Neural Networks and Learning Systems, 30(9):2805–2824, 2019.
  • Naseer et al. [2020] M. Naseer, S. Khan, M. Hayat, F. S. Khan, and F. Porikli. A self-supervised approach for adversarial robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 262–271, 2020.
  • Lee and Kim [2023] M. Lee and D. Kim. Robust evaluation of diffusion-based adversarial purification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 134–144, 2023.
  • Singh et al. [2024] N. D. Singh, F. Croce, and M. Hein. Revisiting adversarial training for imagenet: Architectures, training and generalization across threat models. Advances in Neural Information Processing Systems, 36, 2024.
  • Luo et al. [2023] A. Luo, C. Kong, J. Huang, Y. Hu, X. Kang, and A. C Kot. Beyond the prior forgery knowledge: Mining critical clues for general face forgery detection. IEEE Transactions on Information Forensics and Security, 19:1168–1182, 2023.
  • Tramèr et al. [2018] F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel. Ensemble adversarial training: Attacks and defenses. In International Conference on Learning Representations, 2018.
  • Salman et al. [2020a] H. Salman, A. Ilyas, L. Engstrom, A. Kapoor, and A. Madry. Do adversarially robust imagenet models transfer better? Advances in Neural Information Processing Systems, 33:3533–3545, 2020a.
  • Luo et al. [2021] A. Luo, E. Li, Y. Liu, X. Kang, and Z J. Wang. A capsule network based approach for detection of audio spoofing attacks. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6359–6363. IEEE, 2021.
  • Cohen et al. [2019] J. Cohen, E. Rosenfeld, and Z. Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pages 1310–1320. PMLR, 2019.
  • Madry et al. [2018] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
  • Dong et al. [2018] Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li. Boosting adversarial attacks with momentum. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9185–9193, 2018.
  • Zhang et al. [2022] Y. Zhang, Y. Tan, T. Chen, X. Liu, Q. Zhang, and Y. Li. Enhancing the transferability of adversarial examples with random patch. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pages 1672–1678, 2022.
  • Wei et al. [2023] Z. Wei, J. Chen, Z. Wu, and Y. Jiang. Enhancing the self-universality for transferable targeted attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12281–12290, 2023.
  • Sharif et al. [2018] M. Sharif, L. Bauer, and M. Reiter. On the suitability of lp-norms for creating and preventing adversarial examples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1605–1613, 2018.
  • Carlini and Wagner [2017] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, pages 39–57. IEEE, 2017.
  • Luo et al. [2018] B. Luo, Y. Liu, L. Wei, and Q. Xu. Towards imperceptible and robust adversarial example attacks against neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
  • Zhao et al. [2020] Z. Zhao, Z. Liu, and M. Larson. Towards large yet imperceptible adversarial image perturbations with perceptual color distance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1039–1048, 2020.
  • Laidlaw et al. [2021] C. Laidlaw, S. Singla, and S. Feizi. Perceptual adversarial robustness: Defense against unseen threat models. In International Conference on Learning Representations, 2021.
  • Duan et al. [2021] R. Duan, Y. Chen, D. Niu, Y. Yang, A. Qin, and Y. He. Advdrop: Adversarial attack to dnns by dropping information. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7506–7515, 2021.
  • Chen et al. [2023a] Z. Chen, Z. Wang, J. Huang, W. Zhao, X. Liu, and D. Guan. Imperceptible adversarial attack via invertible neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 414–424, 2023a.
  • Jia et al. [2022] S. Jia, C. Ma, T. Yao, B. Yin, S. Ding, and X. Yang. Exploring frequency adversarial attacks for face forgery detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4103–4112, 2022.
  • Luo et al. [2022] C. Luo, Q. Lin, W. Xie, B. Wu, J. Xie, and L. Shen. Frequency-driven imperceptible adversarial attack on semantic similarity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15315–15324, 2022.
  • Song et al. [2018] Y. Song, R. Shu, N. Kushman, and S. Ermon. Constructing unrestricted adversarial examples with generative models. Advances in neural information processing systems, 31, 2018.
  • Ho et al. [2020] J. Ho, A. Jain, and P. Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
  • Song et al. [2020a] J. Song, C. Meng, and S. Ermon. Denoising diffusion implicit models. In International Conference on Learning Representations, 2020a.
  • Rombach et al. [2022] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10684–10695, 2022.
  • Chen et al. [2023b] X. Chen, X. Gao, J. Zhao, K. Ye, and C. Xu. Advdiffuser: Natural adversarial example synthesis with diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4562–4572, 2023b.
  • Xue et al. [2023] H. Xue, A. Araujo, B. Hu, and Y. Chen. Diffusion-based adversarial sample generation for improved stealthiness and controllability. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  • Chen et al. [2023c] J. Chen, H. Chen, K. Chen, Y. Zhang, Z. Zou, and Z. Shi. Diffusion models for imperceptible and transferable adversarial attack. arXiv preprint arXiv:2305.08192, 2023c.
  • Chen et al. [2024] Z. Chen, B. Li, S. Wu, K. Jiang, S. Ding, and W. Zhang. Content-based unrestricted adversarial attack. Advances in Neural Information Processing Systems, 36, 2024.
  • Dhariwal and Nichol [2021] P. Dhariwal and A. Nichol. Diffusion models beat gans on image synthesis. Advances in neural information processing systems, 34:8780–8794, 2021.
  • Song et al. [2020b] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole. Score-based generative modeling through stochastic differential equations. In International Conference on Learning Representations, 2020b.
  • Song and Ermon [2019] Y. Song and S. Ermon. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019.
  • Zhou et al. [2016] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2921–2929, 2016.
  • Selvaraju et al. [2017] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
  • Krizhevsky et al. [2012] A. Krizhevsky, I. Sutskever, and G. E Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems, 25, 2012.
  • Ioffe and Szegedy [2015] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456. PMLR, 2015.
  • Yuan et al. [2022] S. Yuan, Q. Zhang, L. Gao, Y. Cheng, and J. Song. Natural color fool: Towards boosting black-box unrestricted attacks. Advances in Neural Information Processing Systems, 35:7546–7560, 2022.
  • Russakovsky et al. [2015] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. Imagenet large scale visual recognition challenge. International journal of computer vision, 115:211–252, 2015.
  • He et al. [2016] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • Liu et al. [2022] Z. Liu, H. Mao, C. Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie. A convnet for the 2020s. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11976–11986, 2022.
  • Liu et al. [2021] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10012–10022, 2021.
  • Vaswani et al. [2017] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
  • Zhu et al. [2024] L. Zhu, B. Liao, Q. Zhang, X. Wang, W. Liu, and X. Wang. Vision mamba: Efficient visual representation learning with bidirectional state space model. In International conference on machine learning, 2024.
  • Gu and Dao [2023] A. Gu and T. Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
  • Wang et al. [2004] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
  • Zhang et al. [2018] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 586–595, 2018.
  • Heusel et al. [2017] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in neural information processing systems, 30, 2017.
  • Ke et al. [2021] J. Ke, Q. Wang, Y. Wang, P. Milanfar, and F. Yang. Musiq: Multi-scale image quality transformer. In Proceedings of the IEEE/CVF international conference on computer vision, pages 5148–5157, 2021.
  • Salman et al. [2020b] H. Salman, M. Sun, G. Yang, A. Kapoor, and J Z. Kolter. Denoised smoothing: A provable defense for pretrained classifiers. Advances in Neural Information Processing Systems, 33:21945–21957, 2020b.
  • Liu et al. [2023] C. Liu, Y. Dong, W. Xiang, X. Yang, H. Su, J. Zhu, Y. Chen, Y. He, H. Xue, and S. Zheng. A comprehensive study on robustness of image classification models: Benchmarking and rethinking. arXiv preprint arXiv:2302.14301, 2023.
  • Das et al. [2018] N. Das, M. Shanbhogue, S. Chen, F. Hohman, S. Li, L. Chen, M. E. Kounavis, and D. Chau. Shield: Fast, practical defense and vaccination for deep learning using jpeg compression. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 196–204, 2018.
  • Guo et al. [2018] C. Guo, M. Rana, M. Cisse, and L. van der Maaten. Countering adversarial images using input transformations. In International Conference on Learning Representations, 2018.
  • Sandler et al. [2018] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.

Appendix A Related Work

Beginning with the attack paradigm of the Fast Gradient Sign Method (FGSM) [2], numerous great works focusing on the imperceptibility of adversarial attacks have been proposed [17, 18, 19, 20, 21, 22, 23]. In contrast to another line of attacks that aims to improve the attack success rate and the transferability to black-box models under a more lenient limit on perturbation strength [13, 14, 15], imperceptible adversarial attacks are dedicated to accomplishing attacks using as little perturbation as possible while deceiving human perception. Among them, PerC-AL [19] improves imperceptibility by alternating between the classification loss and the perceptual color difference when updating perturbations. AdvDrop [21] uses the Discrete Cosine Transform (DCT) to discard details in images that are imperceptible to humans. SSAH [24] limits the perturbation to high-frequency components using the Discrete Wavelet Transform (DWT) to make it undetectable. Similarly, AdvINN [22] also utilizes the DWT and exploits invertible neural networks specifically to perform targeted attacks. In addition, under an unrestricted setting [25], some recent works have incorporated the capabilities of generative models (e.g., diffusion models [26]) into common attack frameworks to make adversarial examples more natural and enhance imperceptibility. DiffAttack [31] and ACA [32] combine the optimization of adversarial losses with Stable Diffusion [28] to generate unrestricted adversarial examples, while DiffPGD [30] and AdvDiffuser [29] incorporate the classic PGD method [12] into the diffusion steps so that the adversarial examples undergo denoising.

Compared to these traditional restricted imperceptible attacks and the recent unrestricted imperceptible attacks (e.g., diffusion-based ones), the proposed AdvAD is a novel approach distinct from existing attack paradigms. It is the first pilot framework that conceptualizes attacking as a non-parametric diffusion process by theoretically exploring the fundamental modeling approach of diffusion models rather than using their denoising or generative abilities, achieving high attack efficacy and imperceptibility with intrinsically lower perturbation strength. Following the setting of restricted attacks, the modeling of AdvAD is theoretically derived from the conditional sampling of diffusion models, which supports its attack performance and imperceptibility, and it does not require any loss functions, optimizers, or additional neural networks.

Appendix B Derivations and Proofs

In this section, we first introduce the straightforward Pixel-level Constraint (PC) for $\boldsymbol{\hat{x}}_{t}$, which is briefly mentioned at the beginning of Section 3.3 of the main text as an intuitive preliminary. We then provide detailed proofs of Theorem 1 and Propositions 1 and 2, given in the PC for $\boldsymbol{\hat{\epsilon}}_{t}$.

B.1 Straightforward PC for $\boldsymbol{\hat{x}}_{t}$

For each known $\boldsymbol{\bar{x}}_{t}$ in the fixed diffusion trajectory of the original image $\boldsymbol{x}_{ori}$, and each modified $\boldsymbol{\hat{x}}_{t}$ in the attacking trajectory that the proposed Attacked Model Guidance (AMG) leads to the adversarial example $\boldsymbol{x}_{adv}$, the objective of PC is to constrain these two trajectories to be close, ensuring that the final $\boldsymbol{\hat{x}}_{0}$ (i.e., $\boldsymbol{x}_{adv}$) is close to $\boldsymbol{\bar{x}}_{0}$ (i.e., $\boldsymbol{x}_{ori}$). A straightforward way to achieve this goal is to directly constrain every $\boldsymbol{\hat{x}}_{t}$ using $\boldsymbol{\bar{x}}_{t}$.
Thus, in PC for $\boldsymbol{\hat{x}}_{t}$, we can utilize the restriction of adversarial examples and the relationship between $\boldsymbol{x}_{adv}$, $\boldsymbol{\hat{x}}_{t}^{0}$ and $\boldsymbol{\hat{x}}_{t}$ to derive the constraint for $\boldsymbol{\hat{x}}_{t}$. Given the budget $\xi$, the desired restriction of adversarial examples is

$$\left\|\boldsymbol{x}_{adv}-\boldsymbol{x}_{ori}\right\|_{\infty}\leq\xi. \tag{15}$$

Next, since $\boldsymbol{\hat{\epsilon}}_{t}$ is unconstrained in the case of PC for $\boldsymbol{\hat{x}}_{t}$, we adopt the initialized $\boldsymbol{\epsilon}_{0}$ to calculate $\boldsymbol{\hat{x}}_{t}^{0}$, which approximates $\boldsymbol{x}_{adv}$, as:

$$\boldsymbol{x}_{adv}\approx\boldsymbol{\hat{x}}_{t}^{0}(\boldsymbol{\epsilon}_{0})=\frac{\boldsymbol{\hat{x}}_{t}-\sqrt{1-\alpha_{t}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{t}}}, \tag{16}$$

where $\alpha_{0}=1$, and $\alpha_{1:T}\in(0,1]^{T}$ is a pre-defined decreasing scalar sequence. The $\boldsymbol{\bar{x}}_{t}$ of the original trajectory is calculated as:

$$\boldsymbol{\bar{x}}_{t}=\sqrt{\alpha_{t}}\,\boldsymbol{x}_{ori}+\sqrt{1-\alpha_{t}}\,\boldsymbol{\epsilon}_{0}. \tag{17}$$

By substituting Eq. (16) and Eq. (17) into Eq. (15), the constraint for $\boldsymbol{\hat{x}}_{t}$ can be easily derived:

$$\begin{split}
&\left\|\frac{\boldsymbol{\hat{x}}_{t}-\sqrt{1-\alpha_{t}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{t}}}-\frac{\boldsymbol{\bar{x}}_{t}-\sqrt{1-\alpha_{t}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{t}}}\right\|_{\infty}\leq\xi\\
\Leftrightarrow\;&\left\|\frac{\boldsymbol{\hat{x}}_{t}}{\sqrt{\alpha_{t}}}-\frac{\boldsymbol{\bar{x}}_{t}}{\sqrt{\alpha_{t}}}\right\|_{\infty}\leq\xi\\
\Leftrightarrow\;&\left\|\boldsymbol{\hat{x}}_{t}-\boldsymbol{\bar{x}}_{t}\right\|_{\infty}\leq\sqrt{\alpha_{t}}\,\xi
\end{split} \tag{18}$$

In this way, the PC for $\boldsymbol{\hat{x}}_{t}$ can be implemented at the start of each step to achieve the basic restriction by applying a projection operation to $\boldsymbol{\hat{x}}_{t}$ according to Eq. (18). However, this direct modification of $\boldsymbol{\hat{x}}_{t}$ at each step disrupts the entire diffusion trajectory from noise to our desired adversarial distribution, and cannot achieve the final imperceptibility. Additionally, the estimation of $\boldsymbol{\hat{x}}_{t}^{0}$ in each step employs a fixed $\boldsymbol{\epsilon}_{0}$, which impairs the adversarial guidance crafted by AMG.
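The projection implied by Eq. (18) can be illustrated with a minimal Python sketch. It is not taken from the released code: the function names `project_xt` and `predict_x0`, the budget, the value of $\alpha_t$, and the random drift standing in for an unconstrained $\boldsymbol{\hat{x}}_{t}$ are all assumptions made for illustration. The sketch clips $\boldsymbol{\hat{x}}_{t}$ into the box of radius $\sqrt{\alpha_{t}}\,\xi$ around $\boldsymbol{\bar{x}}_{t}$ and then checks that the estimate of Eq. (16) stays within $\xi$ of $\boldsymbol{x}_{ori}$.

```python
import math
import random


def project_xt(x_hat_t, x_bar_t, alpha_t, xi):
    """Clip each entry so that |x_hat_t - x_bar_t| <= sqrt(alpha_t) * xi (Eq. 18)."""
    radius = math.sqrt(alpha_t) * xi
    return [xb + max(-radius, min(radius, xh - xb))
            for xh, xb in zip(x_hat_t, x_bar_t)]


def predict_x0(x_t, eps, alpha_t):
    """Clean-image estimate from x_t and a noise term (the form of Eq. 16)."""
    return [(x - math.sqrt(1.0 - alpha_t) * e) / math.sqrt(alpha_t)
            for x, e in zip(x_t, eps)]


rng = random.Random(0)
n, xi, alpha_t = 8, 8.0 / 255.0, 0.3   # illustrative values only
x_ori = [rng.random() for _ in range(n)]
eps_0 = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Original trajectory at step t, Eq. (17).
x_bar_t = [math.sqrt(alpha_t) * x + math.sqrt(1.0 - alpha_t) * e
           for x, e in zip(x_ori, eps_0)]

# Hypothetical unconstrained x_hat_t that has drifted away from x_bar_t.
x_hat_t = [xb + rng.uniform(-0.5, 0.5) for xb in x_bar_t]

x_proj = project_xt(x_hat_t, x_bar_t, alpha_t, xi)
x0_est = predict_x0(x_proj, eps_0, alpha_t)
max_dev = max(abs(a - b) for a, b in zip(x0_est, x_ori))
print(max_dev <= xi + 1e-12)
```

Because the estimate reuses the fixed $\boldsymbol{\epsilon}_{0}$, the deviation of the estimate from $\boldsymbol{x}_{ori}$ is exactly the clipped difference divided by $\sqrt{\alpha_{t}}$, which is why the bound follows directly from the projection.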

B.2 Proof of Theorem 1

Therefore, we carefully analyze the important noise term in our diffusion-based modeling approach for adversarial attacks, and present Theorem 1 to support our PC for $\boldsymbol{\hat{\epsilon}}_{t}$ as described in the main text. The proof of Theorem 1 is provided as follows.

Theorem 1

Given diffusion coefficients $\alpha_{T:0}\in(0,1]^{T}$, the $\boldsymbol{x}_{ori}$, $\boldsymbol{\bar{x}}_{t}$, $\boldsymbol{\epsilon}_{0}$ from the original trajectory, $\boldsymbol{\hat{x}}_{t}$, $\boldsymbol{\hat{\epsilon}}_{t}$ from the modified trajectory, and a variable $\xi$, if $\boldsymbol{\hat{\epsilon}}_{t}$ and $\boldsymbol{\epsilon}_{0}$ satisfy

$$\left\|\boldsymbol{\hat{\epsilon}}_{t}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi, \tag{19}$$

for all $t\in[T:1]$, then it follows that $\left\|\boldsymbol{\hat{x}}_{t}-\boldsymbol{\bar{x}}_{t}\right\|_{\infty}\leq\left(\sqrt{\alpha_{t}}-\sqrt{1-\alpha_{t}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi$, $\left\|\boldsymbol{\hat{x}}_{t}^{0}-\boldsymbol{x}_{ori}\right\|_{\infty}\leq\xi$, and $\left\|\boldsymbol{\hat{x}}_{0}-\boldsymbol{x}_{ori}\right\|_{\infty}\leq\xi$ hold true.
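As a sanity check on the statement (not a substitute for the proof), the three bounds can be tested numerically. The sketch below is a hedged illustration under several assumptions: the linear schedule for $\alpha_t$, the budget, and the vector size are arbitrary; the per-step update is the deterministic one that appears in Eq. (20) of the proof; and a random $\boldsymbol{\hat{\epsilon}}_{t}$ drawn inside the box of Eq. (19) stands in for the noise that AMG would actually craft. All variable names are hypothetical.

```python
import math
import random

rng = random.Random(0)
T, n, xi = 50, 8, 8.0 / 255.0                       # illustrative values only
# Assumed schedule: alpha_0 = 1, strictly decreasing to alpha_T = 0.01.
alpha = [1.0 - 0.99 * t / T for t in range(T + 1)]
c = math.sqrt(alpha[T]) / math.sqrt(1.0 - alpha[T])  # factor in Eq. (19)

x_ori = [rng.random() for _ in range(n)]
eps_0 = [rng.gauss(0.0, 1.0) for _ in range(n)]


def x_bar(t):
    """Original trajectory at step t, Eq. (17)."""
    return [math.sqrt(alpha[t]) * x + math.sqrt(1.0 - alpha[t]) * e
            for x, e in zip(x_ori, eps_0)]


def linf(a, b):
    """Infinity-norm distance between two vectors."""
    return max(abs(u - v) for u, v in zip(a, b))


x_hat = x_bar(T)   # both trajectories start from the same point
tol = 1e-9         # slack for floating-point rounding
ok = True
for t in range(T, 0, -1):
    # Any eps_hat_t inside the box of Eq. (19); drawn at random here.
    eps_hat = [e + rng.uniform(-c * xi, c * xi) for e in eps_0]
    # Clean-image estimate computed with eps_hat_t, then one deterministic step.
    x0_pred = [(x - math.sqrt(1.0 - alpha[t]) * e) / math.sqrt(alpha[t])
               for x, e in zip(x_hat, eps_hat)]
    x_hat = [math.sqrt(alpha[t - 1]) * x0 + math.sqrt(1.0 - alpha[t - 1]) * e
             for x0, e in zip(x0_pred, eps_hat)]
    bound = (math.sqrt(alpha[t - 1]) - math.sqrt(1.0 - alpha[t - 1]) * c) * xi
    ok = ok and linf(x0_pred, x_ori) <= xi + tol
    ok = ok and linf(x_hat, x_bar(t - 1)) <= bound + tol

final_gap = linf(x_hat, x_ori)   # x_hat now holds x_hat_0
print(ok, final_gap <= xi + tol)
```

Replacing the random draw with the extreme value $+c\,\xi$ at every step makes the trajectory bound tight, which gives a quick way to see that the constant in Eq. (19) cannot be enlarged without violating the budget.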

Proof 1

We prove Theorem 1 using mathematical induction.

Initial case.

For trajectories of the adversarial example $\boldsymbol{x}_{adv}$ and original image $\boldsymbol{x}_{ori}$ that start from $\boldsymbol{\hat{\epsilon}}_{T+1}=\boldsymbol{\epsilon}_{0}$, $\boldsymbol{\hat{x}}_{T}=\boldsymbol{\bar{x}}_{T}$ and $\boldsymbol{\hat{x}}_{T}^{0}=\boldsymbol{\bar{x}}_{T}^{0}$, we can unfold the formula for computing $\boldsymbol{\hat{x}}_{T-1}^{0}$ with the updated $\boldsymbol{\hat{\epsilon}}_{T}$ as:

$$\begin{aligned}
\boldsymbol{\hat{x}}_{T-1}^{0}
&=\frac{\boldsymbol{\hat{x}}_{T-1}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\,\boldsymbol{\hat{\epsilon}}_{T}\\
&=\frac{\sqrt{\alpha_{T-1}}\left(\frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T}}}\right)+\sqrt{1-\alpha_{T-1}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\,\boldsymbol{\hat{\epsilon}}_{T}\\
&=\frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T}}}.
\end{aligned} \tag{20}$$

For $\boldsymbol{\bar{x}}_{T-1}^{0}$ from the fixed trajectory, where $\boldsymbol{\bar{x}}_{t}^{0}$ always equals $\boldsymbol{x}_{ori}$, we have:

$$\boldsymbol{\bar{x}}_{T-1}^{0}=\frac{\boldsymbol{\bar{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}=\boldsymbol{x}_{ori}. \tag{21}$$

With $\boldsymbol{\hat{x}}_{T}=\boldsymbol{\bar{x}}_{T}$, Eq. (20), Eq. (21), and the relationship $\left\|\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi$, we have:

$$\begin{aligned}
&\left\|\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi\\
&\left\|\frac{\sqrt{1-\alpha_{T}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T}}}-\frac{\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}\right\|_{\infty}\leq\xi\\
&\left\|\frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T}}}-\frac{\boldsymbol{\bar{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}\right\|_{\infty}\leq\xi\\
&\left\|\boldsymbol{\hat{x}}_{T-1}^{0}-\boldsymbol{\bar{x}}_{T-1}^{0}\right\|_{\infty}\leq\xi
\end{aligned} \tag{22}$$

Meanwhile, for the relationship between $\boldsymbol{\hat{x}}_{T-1}$ and $\boldsymbol{\bar{x}}_{T-1}$ at the initial step, we have:

\begin{align}
\left\|\boldsymbol{\hat{x}}_{T-1}-\boldsymbol{\bar{x}}_{T-1}\right\|_{\infty}
&=\left\|\sqrt{\alpha_{T-1}}\left(\frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\hat{\epsilon}}_{T}}{\sqrt{\alpha_{T}}}\right)+\sqrt{1-\alpha_{T-1}}\,\boldsymbol{\hat{\epsilon}}_{T}\right.\notag\\
&\quad\left.-\sqrt{\alpha_{T-1}}\left(\frac{\boldsymbol{\bar{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}\right)-\sqrt{1-\alpha_{T-1}}\,\boldsymbol{\epsilon}_{0}\right\|_{\infty}\notag\\
&=\left\|\left(\sqrt{1-\alpha_{T-1}}-\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}\right)\left(\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\right)\right\|_{\infty}\notag\\
&=\left|\sqrt{1-\alpha_{T-1}}-\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}\right|\,\left\|\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\tag{23}\\
&=\left(\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\sqrt{1-\alpha_{T-1}}\right)\left\|\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\tag{24}\\
&\leq\left(\sqrt{\alpha_{T-1}}-\sqrt{1-\alpha_{T-1}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi,\tag{25}
\end{align}

where the transition from Eq. (23) to Eq. (24) resolves the absolute value using the known constant values of $\alpha_{t}$, and Eq. (25) follows from the condition on $\|\boldsymbol{\hat{\epsilon}}_{T}-\boldsymbol{\epsilon}_{0}\|_{\infty}$ imposed in Eq. (19). Theorem 1 therefore holds in the initial case.
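As a brief worked step, the sign used to drop the absolute value between Eq. (23) and Eq. (24) can be seen by factoring out $\sqrt{\alpha_{T-1}}$. This sketch assumes that $\alpha_{t}$ decreases with $t$, so that $\alpha_{T-1}\geq\alpha_{T}$; that monotonicity is an assumption here and is not restated in this section.

```latex
\begin{align*}
\sqrt{1-\alpha_{T-1}}-\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}
&= \sqrt{\alpha_{T-1}}\left(\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}\right)\;\leq\; 0 .
\end{align*}
```

The inequality holds because $u\mapsto\sqrt{1-u}/\sqrt{u}$ is decreasing on $(0,1]$, so the absolute value equals the negated expression.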

Inductive step.

Assuming the theorem holds for some arbitrary step $k$, where $T\geq k>1$, we have:

\begin{align}
\|\boldsymbol{\hat{\epsilon}}_{k+1}-\boldsymbol{\epsilon}_{0}\|_{\infty} &\leq \frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi, \tag{26}\\
\|\boldsymbol{\hat{x}}_{k}^{0}-\boldsymbol{x}_{ori}\|_{\infty} &= \|\boldsymbol{\hat{x}}_{k}^{0}-\boldsymbol{\bar{x}}_{k}^{0}\|_{\infty}\leq\xi, \tag{27}
\end{align}

and

\begin{equation}
\|\boldsymbol{\hat{x}}_{k}-\boldsymbol{\bar{x}}_{k}\|_{\infty}\leq\left(\sqrt{\alpha_{k}}-\sqrt{1-\alpha_{k}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi. \tag{28}
\end{equation}

Based on the inductive hypothesis, we next show the validity of the theorem at step $k-1$. Similar to Eq. (20), we unfold the calculation of $\boldsymbol{\hat{x}}_{k-1}^{0}$ with $\boldsymbol{\hat{x}}_{k}$ and $\boldsymbol{\hat{\epsilon}}_{k}$ as:

\begin{align}
\boldsymbol{\hat{x}}_{k-1}^{0}
&=\; \frac{\boldsymbol{\hat{x}}_{k-1}}{\sqrt{\alpha_{k-1}}}-\frac{\sqrt{1-\alpha_{k-1}}}{\sqrt{\alpha_{k-1}}}\boldsymbol{\hat{\epsilon}}_{k} \notag\\
&=\; \frac{\sqrt{\alpha_{k-1}}\left(\frac{\boldsymbol{\hat{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\hat{\epsilon}}_{k}}{\sqrt{\alpha_{k}}}\right)+\sqrt{1-\alpha_{k-1}}\boldsymbol{\hat{\epsilon}}_{k}}{\sqrt{\alpha_{k-1}}}-\frac{\sqrt{1-\alpha_{k-1}}}{\sqrt{\alpha_{k-1}}}\boldsymbol{\hat{\epsilon}}_{k} \notag\\
&=\; \frac{\boldsymbol{\hat{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\hat{\epsilon}}_{k}}{\sqrt{\alpha_{k}}}. \tag{29}
\end{align}

Likewise, $\boldsymbol{\bar{x}}_{k-1}^{0}$ can be written as:

\begin{equation}
\boldsymbol{\bar{x}}_{k-1}^{0}=\frac{\boldsymbol{\bar{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{k}}}=\boldsymbol{x}_{ori}. \tag{30}
\end{equation}
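The two identities in Eq. (29) and Eq. (30) can be checked numerically on a single scalar pixel. The following is only a minimal sketch: the function names `reverse_step` and `x0_from` and all numeric values are hypothetical choices made for illustration, and the update rule is the one read off from the second line of Eq. (29).

```python
import math

def reverse_step(x_k, eps, a_k, a_prev):
    """One deterministic reverse step: predict x^0 from (x_k, eps), then move to level a_prev."""
    x0 = (x_k - math.sqrt(1.0 - a_k) * eps) / math.sqrt(a_k)
    return math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * eps

def x0_from(x, eps, a):
    """Predicted clean value (x - sqrt(1 - a) * eps) / sqrt(a)."""
    return (x - math.sqrt(1.0 - a) * eps) / math.sqrt(a)

# Hypothetical scalar values, chosen only for illustration.
a_k, a_prev = 0.40, 0.55      # alpha_k < alpha_{k-1}
x_ori, eps0 = 0.30, -1.20     # original pixel and fixed noise
eps_hat = eps0 - 0.01         # slightly shifted noise estimate
x_bar_k = math.sqrt(a_k) * x_ori + math.sqrt(1.0 - a_k) * eps0
x_hat_k = x_bar_k + 0.002     # attacked state at step k

# Eq. (29): x^0 computed from x_{k-1} equals x^0 computed from x_k.
x_hat_prev = reverse_step(x_hat_k, eps_hat, a_k, a_prev)
lhs_29 = x0_from(x_hat_prev, eps_hat, a_prev)
rhs_29 = x0_from(x_hat_k, eps_hat, a_k)

# Eq. (30): the unattacked trajectory always predicts x_ori.
x0_bar = x0_from(x_bar_k, eps0, a_k)
```

Under these assumptions, `lhs_29` and `rhs_29` agree up to floating-point rounding, and `x0_bar` recovers `x_ori`.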

Consequently, by substituting Eq. (29) and Eq. (30) into $\left\|\boldsymbol{\hat{x}}_{k-1}^{0}-\boldsymbol{\bar{x}}_{k-1}^{0}\right\|_{\infty}$, we have:

\begin{align}
\left\|\boldsymbol{\hat{x}}_{k-1}^{0}-\boldsymbol{\bar{x}}_{k-1}^{0}\right\|_{\infty}
&=\; \left\|\frac{\boldsymbol{\hat{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\hat{\epsilon}}_{k}}{\sqrt{\alpha_{k}}}-\frac{\boldsymbol{\bar{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{k}}}\right\|_{\infty} \notag\\
&=\; \left\|\frac{1}{\sqrt{\alpha_{k}}}\left(\boldsymbol{\hat{x}}_{k}-\boldsymbol{\bar{x}}_{k}\right)+\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\hat{\epsilon}}_{k}\right)\right\|_{\infty} \tag{31}\\
&\leq\; \frac{1}{\sqrt{\alpha_{k}}}\left\|\boldsymbol{\hat{x}}_{k}-\boldsymbol{\bar{x}}_{k}\right\|_{\infty}+\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty} \tag{32}\\
&\leq\; \left(1-\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi+\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}, \tag{33}
\end{align}

where the step from Eq. (31) to Eq. (32) uses the triangle inequality of the $l_{p}$-norm, and Eq. (33) is obtained with Eq. (28). Then, given the imposed condition of Eq. (19), we get:

\begin{align}
& \left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi \notag\\
\Longrightarrow\;& \left(1-\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi+\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}\leq\xi \notag\\
\Longrightarrow\;& \left\|\boldsymbol{\hat{x}}_{k-1}^{0}-\boldsymbol{\bar{x}}_{k-1}^{0}\right\|_{\infty}\leq\xi \notag\\
\Longrightarrow\;& \left\|\boldsymbol{\hat{x}}_{k-1}^{0}-\boldsymbol{x}_{ori}\right\|_{\infty}\leq\xi. \tag{34}
\end{align}
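For completeness, the second line of Eq. (34) can be spelled out. The shorthand $c_{k}=\sqrt{1-\alpha_{k}}/\sqrt{\alpha_{k}}$ and $r=\sqrt{\alpha_{T}}/\sqrt{1-\alpha_{T}}$ is introduced here only for brevity and does not appear in the original derivation. Substituting the noise bound into the right-hand side of Eq. (33) gives:

```latex
\begin{align*}
\left(1-c_{k}\,r\right)\xi + c_{k}\left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty}
\;\leq\; \left(1-c_{k}\,r\right)\xi + c_{k}\,r\,\xi \;=\; \xi .
\end{align*}
```

The two $c_{k}\,r\,\xi$ terms cancel exactly, which is why the resulting bound on $\boldsymbol{\hat{x}}_{k-1}^{0}$ does not depend on $k$.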

The relationship between $\boldsymbol{\hat{x}}_{k-1}$ and $\boldsymbol{\bar{x}}_{k-1}$ at step $k-1$ can then be expressed as:

\begin{align}
&\left\|\boldsymbol{\hat{x}}_{k-1}-\boldsymbol{\bar{x}}_{k-1}\right\|_{\infty} \notag\\
&=\; \left\|\sqrt{\alpha_{k-1}}\left(\frac{\boldsymbol{\hat{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\hat{\epsilon}}_{k}}{\sqrt{\alpha_{k}}}\right)+\sqrt{1-\alpha_{k-1}}\boldsymbol{\hat{\epsilon}}_{k}\right. \notag\\
&\qquad\qquad \left.-\sqrt{\alpha_{k-1}}\left(\frac{\boldsymbol{\bar{x}}_{k}-\sqrt{1-\alpha_{k}}\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{k}}}\right)-\sqrt{1-\alpha_{k-1}}\boldsymbol{\epsilon}_{0}\right\|_{\infty} \notag\\
&=\; \left\|\frac{\sqrt{\alpha_{k-1}}}{\sqrt{\alpha_{k}}}\left(\boldsymbol{\hat{x}}_{k}-\boldsymbol{\bar{x}}_{k}\right)+\left(\sqrt{1-\alpha_{k-1}}-\frac{\sqrt{\alpha_{k-1}}\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\right)\left(\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right)\right\|_{\infty} \tag{35}\\
&\leq\; \frac{\sqrt{\alpha_{k-1}}}{\sqrt{\alpha_{k}}}\left\|\boldsymbol{\hat{x}}_{k}-\boldsymbol{\bar{x}}_{k}\right\|_{\infty}+\left|\sqrt{1-\alpha_{k-1}}-\frac{\sqrt{\alpha_{k-1}}\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}\right|\left\|\boldsymbol{\hat{\epsilon}}_{k}-\boldsymbol{\epsilon}_{0}\right\|_{\infty} \tag{36}\\
&\leq\; \frac{\sqrt{\alpha_{k-1}}}{\sqrt{\alpha_{k}}}\left(\sqrt{\alpha_{k}}-\sqrt{1-\alpha_{k}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi+\left(\frac{\sqrt{\alpha_{k-1}}\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}-\sqrt{1-\alpha_{k-1}}\right)\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi \tag{37}\\
&=\; \left(\sqrt{\alpha_{k-1}}-\sqrt{1-\alpha_{k-1}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi, \tag{38}
\end{align}

where the triangle inequality is used again to obtain Eq. (36) from Eq. (35), and Eq. (28) and Eq. (34) are then substituted, with the absolute value resolved as in Eq. (24), to obtain Eq. (37). Hence, the theorem also holds at step $k-1$.
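The inductive argument can also be sanity-checked by simulation. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: a hypothetical linearly decreasing schedule with $\alpha_{0}=1$, noise estimates drawn uniformly within the Eq. (19)-style bound, a shared starting point $\boldsymbol{\hat{x}}_{T}=\boldsymbol{\bar{x}}_{T}$ (as the initial case suggests), and the hypothetical function name `max_bound_violation`. It returns the largest amount by which any observed deviation exceeds its bound from Theorem 1, which should be non-positive up to rounding.

```python
import math
import random

def max_bound_violation(T=50, dim=8, xi=8 / 255, seed=0):
    """Return the largest (observed deviation - theoretical bound) over all steps."""
    rng = random.Random(seed)
    # Assumed schedule: linearly decreasing with alpha_0 = 1 (illustrative only).
    alpha = [1.0 - 0.9 * t / T for t in range(T + 1)]
    r = math.sqrt(alpha[T]) / math.sqrt(1.0 - alpha[T])

    x_ori = [rng.random() for _ in range(dim)]
    eps0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Both trajectories are assumed to start from the same x_T.
    x_hat = [math.sqrt(alpha[T]) * x + math.sqrt(1.0 - alpha[T]) * e
             for x, e in zip(x_ori, eps0)]

    worst = -math.inf
    for k in range(T, 0, -1):
        a_k, a_prev = alpha[k], alpha[k - 1]
        # Noise estimate within r * xi of eps0 in every coordinate.
        eps_hat = [e + rng.uniform(-1.0, 1.0) * r * xi for e in eps0]
        # Predicted clean image (Eq. (29)) and next attacked state.
        x0_hat = [(xh - math.sqrt(1.0 - a_k) * eh) / math.sqrt(a_k)
                  for xh, eh in zip(x_hat, eps_hat)]
        x_hat = [math.sqrt(a_prev) * x0 + math.sqrt(1.0 - a_prev) * eh
                 for x0, eh in zip(x0_hat, eps_hat)]
        # Unattacked reference state at step k-1.
        x_bar = [math.sqrt(a_prev) * x + math.sqrt(1.0 - a_prev) * e
                 for x, e in zip(x_ori, eps0)]

        bound = (math.sqrt(a_prev) - math.sqrt(1.0 - a_prev) * r) * xi
        dev_state = max(abs(a - b) for a, b in zip(x_hat, x_bar))
        dev_clean = max(abs(a - b) for a, b in zip(x0_hat, x_ori))
        worst = max(worst, dev_state - bound, dev_clean - xi)
    return worst
```

A call such as `max_bound_violation(seed=3)` exercises a different random trajectory; under the stated assumptions, every seed should give a value that does not exceed zero beyond floating-point error.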

Conclusion.

Therefore, by extending the validity of the theorem from an arbitrary step $k$ to $k-1$, and given its established validity in the initial case, the principle of mathematical induction allows us to conclude that $\|\boldsymbol{\hat{x}}_{t}-\boldsymbol{\bar{x}}_{t}\|_{\infty}\leq\left(\sqrt{\alpha_{t}}-\sqrt{1-\alpha_{t}}\,\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\right)\xi$ and $\|\boldsymbol{\hat{x}}_{t}^{0}-\boldsymbol{x}_{ori}\|_{\infty}\leq\xi$ hold for every step $t\in[1:T]$.
For $t=0$ and $\alpha_{0}=1$, we have $\boldsymbol{\hat{x}}_{0}^{0}=\boldsymbol{\hat{x}}_{0}$ and $\boldsymbol{\bar{x}}_{0}=\boldsymbol{x}_{ori}$, and thus $\|\boldsymbol{\hat{x}}_{0}-\boldsymbol{x}_{ori}\|_{\infty}\leq\xi$. This concludes the proof of Theorem 1.

B.3 Proof of Proposition 1

Next, we prove Proposition 1 about $\lambda_{t}$ and $\boldsymbol{\delta}_{t}$ by expanding and rearranging the recursive formulas in our diffusion process for attacks.

Proposition 1

Under the conditions of Theorem 1, by denoting the constrained $\boldsymbol{\hat{\epsilon}}_{t}=\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{t}$, we have

\begin{equation}
\boldsymbol{x}_{adv}=\boldsymbol{x}_{ori}+\sum_{t=1}^{T}\lambda_{t}\boldsymbol{\delta}_{t}, \tag{39}
\end{equation}

where $\lambda_{t}=\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}-\frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{\alpha_{t-1}}}$, and $\|\boldsymbol{\delta}_{t}\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi$.
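Before the formal proof, the statement of Eq. (39) can be checked numerically. The following is a minimal sketch under stated assumptions: the linearly decreasing schedule with $\alpha_{0}=1$, the uniformly sampled $\boldsymbol{\delta}_{t}$, the shared starting point $\boldsymbol{\hat{x}}_{T}=\boldsymbol{\bar{x}}_{T}$, and the function name `check_proposition` are all hypothetical choices for illustration. It runs the recursive update and compares the final state with $\boldsymbol{x}_{ori}+\sum_{t}\lambda_{t}\boldsymbol{\delta}_{t}$.

```python
import math
import random

def check_proposition(T=40, dim=6, xi=8 / 255, seed=1):
    """Compare the recursively computed x_adv with x_ori + sum_t lambda_t * delta_t."""
    rng = random.Random(seed)
    # Assumed schedule: linearly decreasing with alpha_0 = 1 (illustrative only).
    alpha = [1.0 - 0.9 * t / T for t in range(T + 1)]

    def ratio(t):
        return math.sqrt(1.0 - alpha[t]) / math.sqrt(alpha[t])

    r = 1.0 / ratio(T)  # sqrt(alpha_T) / sqrt(1 - alpha_T)
    lam = {t: ratio(t) - ratio(t - 1) for t in range(1, T + 1)}

    x_ori = [rng.random() for _ in range(dim)]
    eps0 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    x_hat = [math.sqrt(alpha[T]) * x + math.sqrt(1.0 - alpha[T]) * e
             for x, e in zip(x_ori, eps0)]
    weighted_sum = [0.0] * dim

    for t in range(T, 0, -1):
        # delta_t bounded by r * xi in every coordinate; eps_hat_t = eps0 - delta_t.
        delta = [rng.uniform(-1.0, 1.0) * r * xi for _ in range(dim)]
        eps_hat = [e - d for e, d in zip(eps0, delta)]
        x0_hat = [(xh - math.sqrt(1.0 - alpha[t]) * eh) / math.sqrt(alpha[t])
                  for xh, eh in zip(x_hat, eps_hat)]
        x_hat = [math.sqrt(alpha[t - 1]) * x0 + math.sqrt(1.0 - alpha[t - 1]) * eh
                 for x0, eh in zip(x0_hat, eps_hat)]
        weighted_sum = [w + lam[t] * d for w, d in zip(weighted_sum, delta)]

    closed_form = [x + w for x, w in zip(x_ori, weighted_sum)]
    recursion_error = max(abs(a - b) for a, b in zip(x_hat, closed_form))
    linf = max(abs(a - b) for a, b in zip(x_hat, x_ori))
    return recursion_error, sum(lam.values()), ratio(T), linf
```

In this sketch the weights $\lambda_{t}$ telescope to $\sqrt{1-\alpha_{T}}/\sqrt{\alpha_{T}}$ because $\alpha_{0}=1$, so combining them with the bound on $\|\boldsymbol{\delta}_{t}\|_{\infty}$ keeps the total perturbation within $\xi$, in line with Theorem 1.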

Proof 2

With Eq. (19) in Theorem 1, by denoting $\boldsymbol{\hat{\epsilon}}_{t}=\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{t}$, we have $\|\boldsymbol{\delta}_{t}\|_{\infty}=\|\boldsymbol{\hat{\epsilon}}_{t}-\boldsymbol{\epsilon}_{0}\|_{\infty}\leq\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\,\xi$, and $\boldsymbol{\hat{x}}_{T-1}$ in the initial step can be written as:

𝒙^T1subscriptbold-^𝒙𝑇1\displaystyle\boldsymbol{\hat{x}}_{T-1}overbold_^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT =\displaystyle=\;= αT1αT𝒙^T(αT11αTαT1αT1)(ϵ^T)subscript𝛼𝑇1subscript𝛼𝑇subscriptbold-^𝒙𝑇subscript𝛼𝑇11subscript𝛼𝑇subscript𝛼𝑇1subscript𝛼𝑇1subscriptbold-^bold-italic-ϵ𝑇\displaystyle\frac{\sqrt{\alpha_{T-1}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}% _{T}-{\left(\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-% \sqrt{1-\alpha_{T-1}}\right)\left(\boldsymbol{\hat{\epsilon}}_{T}\right)}divide start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG overbold_^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT - ( divide start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG square-root start_ARG 1 - italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG - square-root start_ARG 1 - italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG ) ( overbold_^ start_ARG bold_italic_ϵ end_ARG start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT )
=\displaystyle=\;= αT1αT𝒙^TαT1(1αTαT1αT1αT1)(ϵ0𝜹T)subscript𝛼𝑇1subscript𝛼𝑇subscriptbold-^𝒙𝑇subscript𝛼𝑇11subscript𝛼𝑇subscript𝛼𝑇1subscript𝛼𝑇1subscript𝛼𝑇1subscriptbold-italic-ϵ0subscript𝜹𝑇\displaystyle\frac{\sqrt{\alpha_{T-1}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}% _{T}-{\sqrt{\alpha_{T-1}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-% \frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)\left(\boldsymbol{% \epsilon}_{0}-\boldsymbol{\delta}_{T}\right)}divide start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG overbold_^ start_ARG bold_italic_x end_ARG start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT - square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG ( divide start_ARG square-root start_ARG 1 - italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT end_ARG end_ARG - divide start_ARG square-root start_ARG 1 - italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG italic_α start_POSTSUBSCRIPT italic_T - 1 end_POSTSUBSCRIPT end_ARG end_ARG ) ( bold_italic_ϵ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT - bold_italic_δ start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ) (40)
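The step from the first to the second line of Eq. (40) only factors $\sqrt{\alpha_{T-1}}$ out of the coefficient of $\boldsymbol{\hat{\epsilon}}_{T}$. As a worked sketch (assuming $\alpha_{T-1}>0$ so that the division is valid, and using the $\lambda_{t}$ defined in Proposition 1):

```latex
\begin{aligned}
\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\sqrt{1-\alpha_{T-1}}
&= \sqrt{\alpha_{T-1}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}
   -\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)
 = \sqrt{\alpha_{T-1}}\,\lambda_{T}, \\
\boldsymbol{\hat{x}}_{T-1}
&= \frac{\sqrt{\alpha_{T-1}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}
   -\sqrt{\alpha_{T-1}}\,\lambda_{T}\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T}\right).
\end{aligned}
```

The same factoring applies at every later step with $T$ replaced by the current step index, which is why each unrolled term carries a coefficient of the form $\sqrt{\alpha}\,\lambda$.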

Applying the recursion formula twice, we have:

$$
\begin{aligned}
\boldsymbol{\hat{x}}_{T-2} &= \frac{\sqrt{\alpha_{T-2}}}{\sqrt{\alpha_{T-1}}}\Biggl(\frac{\sqrt{\alpha_{T-1}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\left(\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\sqrt{1-\alpha_{T-1}}\right)\boldsymbol{\hat{\epsilon}}_{T}\Biggr) \\
&\quad -\left(\frac{\sqrt{\alpha_{T-2}}\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\sqrt{1-\alpha_{T-2}}\right)\boldsymbol{\hat{\epsilon}}_{T-1} \\
&= \frac{\sqrt{\alpha_{T-2}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sqrt{\alpha_{T-2}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T}\right) \\
&\quad -\sqrt{\alpha_{T-2}}\left(\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T-1}\right)
\end{aligned}
\tag{41}
$$

Similarly, for $\boldsymbol{\hat{x}}_{T-3}$, we have:

$$
\begin{aligned}
\boldsymbol{\hat{x}}_{T-3} &= \frac{\sqrt{\alpha_{T-3}}}{\sqrt{\alpha_{T-2}}}\Biggl(\frac{\sqrt{\alpha_{T-2}}}{\sqrt{\alpha_{T-1}}}\Biggl(\frac{\sqrt{\alpha_{T-1}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\left(\frac{\sqrt{\alpha_{T-1}}\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\sqrt{1-\alpha_{T-1}}\right)\boldsymbol{\hat{\epsilon}}_{T}\Biggr) \\
&\qquad -\left(\frac{\sqrt{\alpha_{T-2}}\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\sqrt{1-\alpha_{T-2}}\right)\boldsymbol{\hat{\epsilon}}_{T-1}\Biggr) \\
&\quad -\left(\frac{\sqrt{\alpha_{T-3}}\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}-\sqrt{1-\alpha_{T-3}}\right)\boldsymbol{\hat{\epsilon}}_{T-2} \\
&= \frac{\sqrt{\alpha_{T-3}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sqrt{\alpha_{T-3}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T}\right) \\
&\quad -\sqrt{\alpha_{T-3}}\left(\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T-1}\right) \\
&\quad -\sqrt{\alpha_{T-3}}\left(\frac{\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}-\frac{\sqrt{1-\alpha_{T-3}}}{\sqrt{\alpha_{T-3}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T-2}\right)
\end{aligned}
\tag{42}
$$

It is obvious that the coefficients of each term exhibit a clear regular pattern related to the step $t$. Following this pattern, we can accordingly get the expression of $\boldsymbol{\hat{x}}_{t}$ as:

$$
\begin{aligned}
\boldsymbol{\hat{x}}_{t} &= \frac{\sqrt{\alpha_{t}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sqrt{\alpha_{t}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T}\right) \\
&\quad -\sqrt{\alpha_{t}}\left(\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T-1}\right) \\
&\quad\;\;\vdots \\
&\quad -\sqrt{\alpha_{t}}\left(\frac{\sqrt{1-\alpha_{t+2}}}{\sqrt{\alpha_{t+2}}}-\frac{\sqrt{1-\alpha_{t+1}}}{\sqrt{\alpha_{t+1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{t+2}\right) \\
&\quad -\sqrt{\alpha_{t}}\left(\frac{\sqrt{1-\alpha_{t+1}}}{\sqrt{\alpha_{t+1}}}-\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{t+1}\right)
\end{aligned}
\tag{43}
$$

And the final $\boldsymbol{\hat{x}}_{0}$ can be expressed as:

$$
\begin{aligned}
\boldsymbol{\hat{x}}_{0} &= \frac{\sqrt{\alpha_{0}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sqrt{\alpha_{0}}\left(\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}-\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T}\right) \\
&\quad -\sqrt{\alpha_{0}}\left(\frac{\sqrt{1-\alpha_{T-1}}}{\sqrt{\alpha_{T-1}}}-\frac{\sqrt{1-\alpha_{T-2}}}{\sqrt{\alpha_{T-2}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{T-1}\right) \\
&\quad\;\;\vdots \\
&\quad -\sqrt{\alpha_{0}}\left(\frac{\sqrt{1-\alpha_{2}}}{\sqrt{\alpha_{2}}}-\frac{\sqrt{1-\alpha_{1}}}{\sqrt{\alpha_{1}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{2}\right) \\
&\quad -\sqrt{\alpha_{0}}\left(\frac{\sqrt{1-\alpha_{1}}}{\sqrt{\alpha_{1}}}-\frac{\sqrt{1-\alpha_{0}}}{\sqrt{\alpha_{0}}}\right)\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{1}\right)
\end{aligned}
\tag{44}
$$

Note that $\alpha_{0}=1$ in Eq. (44), so the coefficients of $\boldsymbol{\epsilon}_{0}$ telescope and mostly cancel, leaving only $\frac{\sqrt{1-\alpha_{T}}}{\sqrt{\alpha_{T}}}\boldsymbol{\epsilon}_{0}$. Thus, by defining $\lambda_{t}$ as:

$$
\lambda_{t}=\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}-\frac{\sqrt{1-\alpha_{t-1}}}{\sqrt{\alpha_{t-1}}},
\tag{45}
$$

we can rearrange Eq. (44) into:

$$
\begin{aligned}
\boldsymbol{\hat{x}}_{0} &= \frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}+\sum_{t=1}^{T}\lambda_{t}\boldsymbol{\delta}_{t} \\
&= \frac{\boldsymbol{\bar{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}+\sum_{t=1}^{T}\lambda_{t}\boldsymbol{\delta}_{t} \\
&= \boldsymbol{x}_{ori}+\sum_{t=1}^{T}\lambda_{t}\boldsymbol{\delta}_{t}
\end{aligned}
\tag{46}
$$

where $\boldsymbol{\hat{x}}_{0}$ is the final output of the diffusion process and $\boldsymbol{x}_{adv}=\boldsymbol{\hat{x}}_{0}$. This concludes the proof of Proposition 1.
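The closed form in Eq. (46) can be sanity-checked numerically. The following is a minimal sketch, not the authors' implementation: it iterates the one-step update of Eq. (40) from $t=T$ down to $t=1$ for a single scalar pixel (all operations are element-wise, so one scalar suffices) and compares the result with $\boldsymbol{x}_{ori}+\sum_{t}\lambda_{t}\boldsymbol{\delta}_{t}$. The linear schedule in `alpha_schedule`, the random `deltas`, and all function names are assumptions made for illustration; the starting point $\boldsymbol{\hat{x}}_{T}=\sqrt{\alpha_{T}}\boldsymbol{x}_{ori}+\sqrt{1-\alpha_{T}}\boldsymbol{\epsilon}_{0}$ is read off from the last two lines of Eq. (46).

```python
import math
import random


def alpha_schedule(T, alpha_T=0.5):
    """Hypothetical linear schedule: alpha_0 = 1, decreasing to alpha_T."""
    return [1.0 - (1.0 - alpha_T) * t / T for t in range(T + 1)]


def ratio(a):
    """sqrt(1 - a) / sqrt(a), the building block of lambda_t."""
    return math.sqrt(1.0 - a) / math.sqrt(a)


def lambdas(alpha):
    """Return [lambda_1, ..., lambda_T] following Eq. (45)."""
    return [ratio(alpha[t]) - ratio(alpha[t - 1]) for t in range(1, len(alpha))]


def run_recursion(x_ori, eps0, deltas, alpha):
    """Iterate the one-step update from t = T down to t = 1.

    deltas[t - 1] plays the role of delta_t.
    """
    T = len(alpha) - 1
    # x_hat_T, as implied by the last two lines of Eq. (46)
    x = math.sqrt(alpha[T]) * x_ori + math.sqrt(1.0 - alpha[T]) * eps0
    for t in range(T, 0, -1):
        eps_hat = eps0 - deltas[t - 1]
        coef = (math.sqrt(alpha[t - 1]) * math.sqrt(1.0 - alpha[t])
                / math.sqrt(alpha[t]) - math.sqrt(1.0 - alpha[t - 1]))
        x = math.sqrt(alpha[t - 1]) / math.sqrt(alpha[t]) * x - coef * eps_hat
    return x


def closed_form(x_ori, deltas, alpha):
    """x_ori + sum_t lambda_t * delta_t, the last line of Eq. (46)."""
    return x_ori + sum(l * d for l, d in zip(lambdas(alpha), deltas))


random.seed(0)
T = 50
alpha = alpha_schedule(T)
x_ori, eps0 = 0.3, random.gauss(0.0, 1.0)
deltas = [random.uniform(-1.0, 1.0) for _ in range(T)]

x_rec = run_recursion(x_ori, eps0, deltas, alpha)
x_cf = closed_form(x_ori, deltas, alpha)
print("recursion vs closed form gap:", abs(x_rec - x_cf))
print("sum of lambdas vs sqrt(1-a_T)/sqrt(a_T):",
      sum(lambdas(alpha)), ratio(alpha[-1]))
```

The second printed line illustrates the telescoping used in the rearrangement: with $\alpha_{0}=1$, the $\lambda_{t}$ sum to $\sqrt{1-\alpha_{T}}/\sqrt{\alpha_{T}}$, which is why the initial noise $\boldsymbol{\epsilon}_{0}$ drops out of the final result.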

B.4 Proof of Proposition 2

Finally, we prove Proposition 2 about the validity and convergence of the approximation $\boldsymbol{x}_{adv}\approx\boldsymbol{\hat{x}}_{t}^{0}$.

Proposition 2

Under the conditions of Theorem 1 and Proposition 1, the upper bound on the error of the approximation in Eq. (22) can be expressed as

$$
\left\|\boldsymbol{x}_{adv}-\boldsymbol{\hat{x}}_{t}^{0}\right\|_{\infty}\leq\;2\cdot\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\cdot\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}.
\tag{47}
$$

Proof 3

With Eq. (43) and the definitions of Proposition 1, $\boldsymbol{\hat{x}}_{t}$ and $\boldsymbol{\hat{x}}_{t}^{0}$ can be written as:

$$
\boldsymbol{\hat{x}}_{t}=\frac{\sqrt{\alpha_{t}}}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sqrt{\alpha_{t}}\sum_{k=t+1}^{T}\lambda_{k}\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{k}\right),
\tag{48}
$$

and

$$
\boldsymbol{\hat{x}}_{t}^{0}=\frac{\boldsymbol{\hat{x}}_{t}-\sqrt{1-\alpha_{t}}\,\boldsymbol{\hat{\epsilon}}_{t+1}}{\sqrt{\alpha_{t}}}=\frac{1}{\sqrt{\alpha_{T}}}\boldsymbol{\hat{x}}_{T}-\sum_{k=t+1}^{T}\lambda_{k}\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{k}\right)-\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\left(\boldsymbol{\epsilon}_{0}-\boldsymbol{\delta}_{t+1}\right).
\tag{49}
$$

For $\boldsymbol{x}_{adv}$, we have:

$$\boldsymbol{x}_{adv}=\boldsymbol{\hat{x}}_{0}=\frac{\boldsymbol{\hat{x}}_{T}-\sqrt{1-\alpha_{T}}\,\boldsymbol{\epsilon}_{0}}{\sqrt{\alpha_{T}}}+\sum_{t=1}^{T}\lambda_{t}\boldsymbol{\delta}_{t} \tag{50}$$

With Eq. (49) and Eq. (50), we can obtain:

$$\boldsymbol{x}_{adv}-\boldsymbol{\hat{x}}_{t}^{0}=\sum_{k=1}^{t}\lambda_{k}\boldsymbol{\delta}_{k}-\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\boldsymbol{\delta}_{t+1} \tag{51}$$

Thus, we have:

$$\begin{aligned}
\left\|\boldsymbol{x}_{adv}-\boldsymbol{\hat{x}}_{t}^{0}\right\|_{\infty}
&=\left\|\sum_{k=1}^{t}\lambda_{k}\boldsymbol{\delta}_{k}-\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\boldsymbol{\delta}_{t+1}\right\|_{\infty}\\
&\leq\left\|\sum_{k=1}^{t}\lambda_{k}\boldsymbol{\delta}_{k}\right\|_{\infty}+\left\|\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\boldsymbol{\delta}_{t+1}\right\|_{\infty}\\
&=\sum_{k=1}^{t}\lambda_{k}\left\|\boldsymbol{\delta}_{k}\right\|_{\infty}+\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\left\|\boldsymbol{\delta}_{t+1}\right\|_{\infty}\\
&\leq\sum_{k=1}^{t}\left(\lambda_{k}\cdot\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi\right)+\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi\\
&=\left(\sum_{k=1}^{t}\lambda_{k}\right)\cdot\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi+\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi\\
&=\left(\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}-\frac{\sqrt{1-\alpha_{0}}}{\sqrt{\alpha_{0}}}\right)\cdot\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi+\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi\\
&=2\cdot\frac{\sqrt{1-\alpha_{t}}}{\sqrt{\alpha_{t}}}\frac{\sqrt{\alpha_{T}}}{\sqrt{1-\alpha_{T}}}\xi
\end{aligned} \tag{52}$$

This concludes the proof of Proposition 2.
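
As a numerical sanity check of the bound in Eq. (52), the following Python sketch draws random $\boldsymbol{\delta}_{k}$ within the per-step limit and compares both sides for every $t$. It assumes $\lambda_{k}=\frac{\sqrt{1-\alpha_{k}}}{\sqrt{\alpha_{k}}}-\frac{\sqrt{1-\alpha_{k-1}}}{\sqrt{\alpha_{k-1}}}$ (inferred from the telescoping step in the derivation) and $\alpha_{0}=1$ (which the last equality of Eq. (52) relies on). The linear schedule and all function names are illustrative only.

```python
import numpy as np

def ratio(alphas):
    """r_t = sqrt(1 - alpha_t) / sqrt(alpha_t)."""
    return np.sqrt(1.0 - alphas) / np.sqrt(alphas)

def prop2_bound(alphas, t, xi):
    """Right-hand side of Eq. (52): 2 * r_t / r_T * xi."""
    r = ratio(alphas)
    return 2.0 * r[t] / r[-1] * xi

def max_violation(T=50, xi=8 / 255, dim=16, seed=0):
    """Largest (lhs - rhs) over t = 1..T; non-positive if the bound holds."""
    rng = np.random.default_rng(seed)
    alphas = np.linspace(1.0, 0.05, T + 1)    # hypothetical schedule, alpha_0 = 1
    r = ratio(alphas)
    lam = np.diff(r)                           # lam[k-1] = lambda_k (assumed form)
    delta_max = xi / r[-1]                     # per-step limit on ||delta_k||_inf
    # delta[i] plays the role of delta_{i+1}, for i = 0..T
    delta = rng.uniform(-delta_max, delta_max, size=(T + 1, dim))
    worst = -np.inf
    for t in range(1, T + 1):
        # Right-hand side of Eq. (51) for this t
        diff = (lam[:t, None] * delta[:t]).sum(axis=0) - r[t] * delta[t]
        worst = max(worst, np.abs(diff).max() - prop2_bound(alphas, t, xi))
    return worst

alphas_demo = np.linspace(1.0, 0.05, 51)
worst_gap = max_violation()
```

Under these assumptions the bound equals $2\xi$ at $t=T$ and shrinks as $t$ decreases, since $\sqrt{1-\alpha_{t}}/\sqrt{\alpha_{t}}$ goes to 0 when $\alpha_{t}$ approaches 1.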

Appendix C Algorithm of AdvAD-X

Algorithm 2 AdvAD-X

Input: Attacked model $f(\cdot)$, image $\boldsymbol{x}_{ori}$ with label $y_{gt}$, budget $\xi$, step $T$;
Output: Adversarial example $\boldsymbol{x}_{adv}$

1:  Initialize pre-defined diffusion coefficients $\alpha_{0:T}\in(0,1]^{T+1}$;
2:  Initialize $\boldsymbol{\epsilon}_{0}\sim\mathcal{N}(\boldsymbol{0},\boldsymbol{\mathit{I}})$; ▷ Initialize and fix diffusion noise $\boldsymbol{\epsilon}_{0}$.
3:  Transform the range of $\boldsymbol{x}_{ori}$ to [-1, 1]; ▷ Align with data range of diffusion process.
4:  Calculate $\boldsymbol{\bar{x}}_{T}$ via Eq. (4); ▷ Forward process of adding noise $\boldsymbol{\epsilon}_{0}$ to $\boldsymbol{x}_{ori}$.
5:  Set $\boldsymbol{\hat{x}}_{T}:=\boldsymbol{\bar{x}}_{T}$, $\boldsymbol{\hat{\epsilon}}_{T+1}:=\boldsymbol{\epsilon}_{0}$; ▷ Non-parametric diffusion process.
6:  Calculate mask $\boldsymbol{m}$ of $\boldsymbol{x}_{ori}$ using GradCAM; ▷ Mask $\boldsymbol{m}$ for the CA strategy.
7:  for $t=T$ to $1$ do
8:     Calculate $\boldsymbol{\hat{x}}_{t}^{0}$ via Eq. (7); ▷ Approximation of $\boldsymbol{\hat{x}}_{t}^{0}\approx\boldsymbol{x}_{adv}$.
9:     Transform the range of $\boldsymbol{\hat{x}}_{t}^{0}$ to [0, 255]; ▷ Align with data range of image.
10:     // DGI strategy for performing AMG and PC dynamically.
11:     if $f(\boldsymbol{\hat{x}}_{t}^{0})==y_{gt}$ then
12:        Calculate $\boldsymbol{\hat{\epsilon}}_{t}^{\prime}$ using $\boldsymbol{m}$ with AMG via Eq. (14); ▷ AMG module and CA strategy.
13:        Calculate $\boldsymbol{\hat{\epsilon}}_{t}$ with PC via Eq. (10); ▷ Same PC module as AdvAD.
14:     else
15:        Set $\boldsymbol{\hat{\epsilon}}_{t}=\boldsymbol{\epsilon}_{0}$; ▷ Skip the operations of current step.
16:     Calculate $\boldsymbol{\hat{x}}_{t-1}$ via Eq. (11); ▷ One step backward from $t$ to $t-1$.
17:  Transform the range of $\boldsymbol{\hat{x}}_{0}$ to [0, 255]; ▷ Endpoint of the process.
18:  return $\boldsymbol{x}_{adv}=\boldsymbol{\hat{x}}_{0}$; ▷ Directly return $\boldsymbol{x}_{adv}$ in raw floating-point data for the ideal scenario.
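
The control flow of Algorithm 2, and in particular the DGI branch in steps 11 to 15, can be sketched in Python as follows. This is a toy illustration rather than the authors' implementation: `predict` and `guidance` are hypothetical stand-ins for the attacked model and for the combined AMG, CA, and PC modules; the backward update is written in a deterministic DDIM-style form that is consistent with Eqs. (49) and (50) but is an assumption about Eq. (11); and the range conversions of steps 3, 9, and 17 are omitted.

```python
import numpy as np

def advadx_control_flow(x_ori, y_gt, predict, guidance, alphas, eps0):
    """Toy control flow of Algorithm 2; requires alphas[0] == 1, decreasing in t."""
    T = len(alphas) - 1
    r = np.sqrt(1.0 - alphas) / np.sqrt(alphas)
    # Step 4: forward process with the fixed noise eps0.
    x_t = np.sqrt(alphas[T]) * x_ori + np.sqrt(1.0 - alphas[T]) * eps0
    eps_prev = eps0                                   # Step 5: eps_hat_{T+1} := eps0
    guided_steps = 0
    for t in range(T, 0, -1):
        # Step 8: estimate of the endpoint from x_t and eps_hat_{t+1}.
        x0_est = (x_t - np.sqrt(1.0 - alphas[t]) * eps_prev) / np.sqrt(alphas[t])
        if predict(x0_est) == y_gt:                   # Steps 11-13: inject guidance
            eps_t = eps0 - guidance(x0_est)
            guided_steps += 1
        else:                                         # Step 15: skip this step
            eps_t = eps0
        # Step 16 (assumed form): deterministic DDIM-style backward update.
        x_t = np.sqrt(alphas[t - 1]) * (x_t / np.sqrt(alphas[t]) - (r[t] - r[t - 1]) * eps_t)
        eps_prev = eps_t
    return x_t, guided_steps

# Toy setup: a linear "classifier" and constant-magnitude guidance (all hypothetical).
rng = np.random.default_rng(0)
d, T, xi = 8, 100, 0.2
w = rng.normal(size=d)
x_ori = 0.1 * np.sign(w) * rng.uniform(0.1, 1.0, size=d)    # ensures w @ x_ori > 0
alphas = np.linspace(1.0, 0.05, T + 1)
eps0 = rng.normal(size=d)
delta_max = xi * np.sqrt(alphas[-1]) / np.sqrt(1.0 - alphas[-1])

predict = lambda x: int(w @ x > 0)
guidance = lambda x: -delta_max * np.sign(w)                # pushes w @ x downward

x_adv, n_guided = advadx_control_flow(x_ori, 1, predict, guidance, alphas, eps0)
x_same, _ = advadx_control_flow(x_ori, 1, predict, lambda x: np.zeros(d), alphas, eps0)
```

In this toy run, guidance is skipped whenever the estimate $\boldsymbol{\hat{x}}_{t}^{0}$ is no longer classified as $y_{gt}$, which mirrors how DGI reduces the number of guided iterations. With zero guidance the process returns the original input, and the total perturbation stays within $\xi$ in the $l_{\infty}$ norm. The sketch makes no claim about the final prediction of the toy classifier.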

Appendix D Additional Experiments

D.1 Additional Quantitative Comparisons

Table 4 reports the untargeted attack performance and imperceptibility of ten methods on Vgg-19, MobileNet-V2, and WideResNet-50 models. The results indicate that the proposed AdvAD and AdvAD-X, leveraging a novel modeling framework, consistently achieve superior performance. These findings further underscore the effectiveness of the proposed approach.

Table 4: Additional results of untargeted white-box attack success rate (ASR) and imperceptibility metrics for different attacks and attacked models. The reported running times are obtained using an RTX 3090 GPU on the same machine. † and blue indicate that the results of AdvAD-X are obtained with the floating-point data type in the ideal scenario described in Sec. 3.4.

| Model | Attack Method | Time (s) ↓ | ASR (%) ↑ | $l_{\infty}$ ↓ | $l_{2}$ ↓ | PSNR ↑ | SSIM ↑ | FID ↓ | LPIPS ↓ | MUSIQ ↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| Vgg-19 | PGD | 47 | 100.0 | 0.031 | 8.47 | 33.23 | 0.8771 | 43.15 | 0.0508 | 53.51 |
| | NCF | 3288 | 92.8 | 0.794 | 75.21 | 14.77 | 0.6391 | 57.45 | 0.3077 | 49.27 |
| | ACA | 83123 | 93.4 | 0.832 | 51.22 | 18.22 | 0.5767 | 66.78 | 0.3277 | 55.61 |
| | DiffAttack | 34163 | 97.0 | 0.769 | 31.92 | 22.23 | 0.6632 | 59.08 | 0.1235 | 57.22 |
| | DiffPGD | 5770 | 93.9 | 0.246 | 11.46 | 30.93 | 0.8888 | 20.72 | 0.0317 | 55.23 |
| | AdvDrop | 268 | 97.5 | 0.062 | 3.23 | 41.79 | 0.9867 | 5.90 | 0.0061 | 54.93 |
| | PerC-AL | 8671 | 100.0 | 0.142 | 2.12 | 45.92 | 0.9885 | 10.78 | 0.0028 | 55.91 |
| | SSAH | 948 | 85.5 | 0.027 | 2.35 | 44.62 | 0.9920 | 4.25 | 0.0017 | 55.45 |
| | AdvAD (ours) | 4370 | 99.5 | 0.009 | 1.05 | 52.13 | 0.9979 | 2.62 | 0.0005 | 56.31 |
| | AdvAD-X† (ours) | 1967 | 99.9 | 0.001 | 0.32 | 64.76 | 0.9997 | 0.27 | 0.0001 | 56.56 |
| MobileNet-V2 | PGD | 10 | 99.9 | 0.031 | 8.29 | 33.41 | 0.8803 | 34.57 | 0.0500 | 52.00 |
| | NCF | 2503 | 92.5 | 0.784 | 76.02 | 14.69 | 0.6373 | 56.23 | 0.3090 | 49.37 |
| | ACA | 83118 | 92.8 | 0.835 | 50.70 | 18.30 | 0.5786 | 64.90 | 0.3254 | 56.17 |
| | DiffAttack | 34723 | 98.2 | 0.739 | 30.51 | 22.61 | 0.6733 | 55.77 | 0.1143 | 56.01 |
| | DiffPGD | 5941 | 92.6 | 0.246 | 11.43 | 30.95 | 0.8887 | 19.22 | 0.0309 | 54.87 |
| | AdvDrop | 116 | 97.7 | 0.063 | 3.16 | 41.94 | 0.9873 | 4.88 | 0.0064 | 54.91 |
| | PerC-AL | 3187 | 99.8 | 0.118 | 2.16 | 45.67 | 0.9879 | 8.77 | 0.0032 | 55.59 |
| | SSAH | 265 | 97.8 | 0.026 | 2.18 | 45.24 | 0.9930 | 2.94 | 0.0016 | 55.78 |
| | AdvAD (ours) | 992 | 99.6 | 0.008 | 0.94 | 53.07 | 0.9982 | 1.46 | 0.0004 | 56.37 |
| | AdvAD-X† (ours) | 388 | 100.0 | 0.001 | 0.24 | 66.8 | 0.9998 | 0.11 | 0.0001 | 56.59 |
| WideResNet-50 | PGD | 42 | 96.0 | 0.031 | 8.2 | 33.5 | 0.8830 | 35.594 | 0.0521 | 52.43 |
| | NCF | 2971 | 89.7 | 0.777 | 74.05 | 14.98 | 0.6473 | 56.01 | 0.2965 | 49.45 |
| | ACA | 84163 | 88.0 | 0.838 | 53.17 | 17.89 | 0.5619 | 68.27 | 0.3442 | 55.47 |
| | DiffAttack | 34072 | 95.1 | 0.747 | 30.61 | 22.60 | 0.6737 | 54.71 | 0.1137 | 55.68 |
| | DiffPGD | 5965 | 91.4 | 0.245 | 11.44 | 30.95 | 0.8905 | 21.24 | 0.0317 | 55.16 |
| | AdvDrop | 353 | 96.5 | 0.062 | 3.28 | 41.64 | 0.9863 | 6.21 | 0.0060 | 54.917 |
| | PerC-AL | 6655 | 97.8 | 0.133 | 1.91 | 46.80 | 0.9906 | 9.28 | 0.0025 | 56.07 |
| | SSAH | 738 | 95.7 | 0.028 | 2.21 | 45.21 | 0.9933 | 3.95 | 0.0015 | 55.88 |
| | AdvAD (ours) | 3845 | 99.9 | 0.010 | 1.10 | 51.54 | 0.9979 | 2.84 | 0.0006 | 56.33 |
| | AdvAD-X† (ours) | 1477 | 100.0 | 0.002 | 0.38 | 62.54 | 0.9996 | 0.33 | 0.0001 | 56.58 |

D.2 Additional Visualizations

More visualizations of adversarial examples and perturbations with ResNet-50 as the attacked model are displayed in Figure 8. The visualizations give a clear view of how different methods accomplish imperceptible attacks. Our AdvAD and AdvAD-X execute imperceptible attacks with lower overall perturbation intensity. Notably, for images with salient objects (e.g., the bridge in the second example), the perturbation intensity of AdvAD naturally increases in the object regions during the gradient-based adversarial guidance calculation, while AdvAD-X, equipped with the dynamic strategy, still shows uniform and lower overall perturbation intensity.

Figure 8: Additional imperceptible adversarial examples and corresponding perturbations generated by various methods. Perturbations are amplified and shown for the convenience of observation. Please zoom in to observe the details of the images.

D.3 Ablation Study of AdvAD-X

From AdvAD to AdvAD-X, Table 5 shows the effect of the two strategies, CA and DGI. Adding CA at each step of AdvAD slightly improves imperceptibility while maintaining a 100% attack success rate. The DGI strategy, in contrast, sharply reduces the number of iterations in which AMG and PC are performed from 1000 to 3.97. This indicates that our framework theoretically requires only very little injected adversarial guidance to attack successfully, which supports both our modeling method and the effectiveness of the adversarial guidance. In AdvAD-X, which uses both CA and DGI, the guidance strength at each step is further suppressed. As a result, the number of adaptively required iterations increases slightly, but the final perturbation strength decreases further to a more extreme level.

Table 5: Ablation study of the proposed CAM Assistance (CA) and Dynamic Gradient Injection (DGI) strategies in AdvAD-X. As marked with †, all results in this experiment are obtained by attacking a normal ResNet-50 using the floating-point raw data, to align with the setting of AdvAD-X. Iter. denotes the number of iterations in which AMG and PC are performed.

| Attack | Iter. | ASR ↑ | $l_{2}$ ↓ | PSNR ↑ | SSIM ↑ | FID ↓ |
|---|---|---|---|---|---|---|
| AdvAD† | 1000 | 100.0 | 0.97 | 52.60 | 0.9984 | 2.3894 |
| AdvAD+CA† | 1000 | 100.0 | 0.89 | 53.27 | 0.9987 | 2.2033 |
| AdvAD+DGI† | 3.97 | 100.0 | 0.34 | 63.60 | 0.9997 | 0.2317 |
| AdvAD-X† | 4.05 | 100.0 | 0.34 | 63.62 | 0.9997 | 0.2301 |

D.4 Additional Discussions

Discussion on Proposition 1. In the previous sections, we obtained Proposition 1 through extensive derivations, which reformulates the AdvAD attack process using $\lambda_{t}$ and $\boldsymbol{\delta}_{t}$. While this formulation does not represent the actual attack procedure, it enables post-analysis after the attacks are completed. In Proposition 1, although the upper bound of $\boldsymbol{\delta}_{t}$ is theoretically independent of the step $t$, both $\boldsymbol{\delta}_{t}$ and the coefficient $\lambda_{t}$ gradually decrease with $t$ in the quantitative results of Figure 7 due to the unique properties of AdvAD. This raises the question of whether modifying the coefficient of the gradient term in traditional attacks such as PGD so that it decays gradually could also achieve imperceptibility. To test this hypothesis in isolation, we conduct experiments with a modified version of PGD with step size decay:

$$\boldsymbol{x}_{t-1}=\Pi_{\xi}\{\boldsymbol{x}_{t}+\lambda_{t}\cdot\eta\cdot\text{sign}(\nabla_{\boldsymbol{x}_{t}}\mathcal{L}_{\text{CE}}(f(\boldsymbol{x}_{t}),y_{gt}))\}, \tag{53}$$

where $\lambda_{t}$ is the same coefficient as in Eq. (39) of Proposition 1 for alignment, and $\eta$ is a fixed small factor for the initial step size. We searched over many values of $\eta$ to determine the optimal range, and the results of attacking three models with different architectures under three typical values of $\eta$ are presented in Table 6.
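
A minimal NumPy sketch of the update in Eq. (53) is given below. It is illustrative only: the attacked model is replaced by a toy logistic-regression loss with an analytic gradient, `lambdas` is assumed to follow the telescoping form $\lambda_{t}=r_{t}-r_{t-1}$ with $r_{t}=\sqrt{1-\alpha_{t}}/\sqrt{\alpha_{t}}$, the iteration is assumed to start from the original image, and the schedule and the values of $\eta$ and $\xi$ are hypothetical.

```python
import numpy as np

def pgd_step_size_decay(x_ori, grad_fn, lambdas, eta, xi):
    """Iterate Eq. (53) for t = T, ..., 1, starting from x_T = x_ori (assumed)."""
    x = x_ori.copy()
    for lam_t in lambdas[::-1]:                      # lambdas[t-1] holds lambda_t
        x = x + lam_t * eta * np.sign(grad_fn(x))    # signed-gradient ascent on the loss
        x = np.clip(x, x_ori - xi, x_ori + xi)       # projection onto the l_inf ball
    return x

# Toy loss: cross-entropy of a logistic model sigmoid(w @ x) with true label 1.
rng = np.random.default_rng(0)
w = rng.normal(size=8)
x_ori = rng.uniform(-1.0, 1.0, size=8)

def ce_grad(x):
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    return -(1.0 - p) * w                            # gradient of -log(p) w.r.t. x

alphas = np.linspace(1.0, 0.05, 101)                 # hypothetical schedule, alpha_0 = 1
r = np.sqrt(1.0 - alphas) / np.sqrt(alphas)
lambdas = np.diff(r)                                 # assumed lambda_t = r_t - r_{t-1}
eta, xi = 1e-3, 8 / 255

x_pgd = pgd_step_size_decay(x_ori, ce_grad, lambdas, eta, xi)
```

Because the per-step sizes sum to $\eta\sum_{t}\lambda_{t}$, the total $l_{\infty}$ displacement in this sketch is capped by the smaller of $\xi$ and $\eta\sum_{t}\lambda_{t}$, which is consistent with $\eta$ trading attack strength against perturbation size.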

Table 6: Results of PGD + step size decay strategy and the proposed AdvAD.

| Model | Attack Method | Param. | Time | ASR | $l_{\infty}$ | $l_{2}$ | PSNR ↑ | SSIM ↑ |
|---|---|---|---|---|---|---|---|---|
| ResNet-50 | PGD + Step size decay in Eq. (53), $\eta$ = 5e-5 | $T$=1000, $\xi$=8/255 | 2272 | 99.9 | 0.016 | 1.80 | 46.75 | 0.9947 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 3e-5 | | 2228 | 99.0 | 0.008 | 1.17 | 50.41 | 0.9974 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 1e-5 | | 2306 | 7.1 | - | - | - | - |
| | AdvAD (ours) | | 2201 | 99.7 | 0.010 | 1.06 | 51.84 | 0.998 |
| Swin-Base | PGD + Step size decay in Eq. (53), $\eta$ = 5e-5 | $T$=1000, $\xi$=8/255 | 8725 | 98.0 | 0.008 | 1.28 | 49.88 | 0.9975 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 3e-5 | | 8728 | 89.1 | 0.004 | 0.94 | 52.47 | 0.9985 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 1e-5 | | 8715 | 3.9 | - | - | - | - |
| | AdvAD (ours) | | 9729 | 100 | 0.013 | 1.19 | 50.57 | 0.9978 |
| VisionMamba-Small | PGD + Step size decay in Eq. (53), $\eta$ = 5e-5 | $T$=1000, $\xi$=8/255 | 6350 | 89.2 | 0.008 | 1.63 | 47.76 | 0.9959 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 3e-5 | | 6393 | 78.3 | 0.004 | 1.10 | 51.05 | 0.9979 |
| | PGD + Step size decay in Eq. (53), $\eta$ = 1e-5 | | 6348 | 2.5 | - | - | - | - |
| | AdvAD (ours) | | 6154 | 99.7 | 0.016 | 1.62 | 47.94 | 0.9960 |

For PGD with this strategy, the ASR clearly increases with $\eta$, while the imperceptibility decreases. However, regardless of how $\eta$ is adjusted, this strategy cannot simultaneously match AdvAD in both ASR and imperceptibility. First, for $\eta$ = 5e-5, the ASR of this strategy on VisionMamba is 10.5 percentage points lower than that of AdvAD at a similar PSNR, and on ResNet-50 the strategy has a 0.2-point higher ASR but a 5.09 dB lower PSNR. For $\eta$ = 3e-5, the ASR against Swin and VisionMamba degrades further, being 10.9 and 21.4 percentage points lower than AdvAD, respectively (see Table 6). Finally, for $\eta$ = 1e-5, the modified PGD with step size decay fails to attack any of the models (ASR below 8%). Nevertheless, although this step size decay strategy performs worse than our AdvAD, it does enhance the imperceptibility of attacks compared to the original PGD in some cases, which further validates our motivation and modeling approach. This is because, while this strategy follows a completely different technical route from AdvAD, it similarly uses subtler perturbations to progressively push adversarial examples closer to the model's decision boundary. We leave further research on the potential of this strategy to future work.
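
The gaps quoted in the paragraph above can be recomputed directly from the ASR and PSNR columns of Table 6. The short Python check below copies those values from the table; the dictionary layout and function name are illustrative.

```python
# ASR (%) values copied from Table 6; keys are the eta settings and AdvAD.
asr = {
    "ResNet-50":         {"5e-5": 99.9, "3e-5": 99.0, "1e-5": 7.1, "AdvAD": 99.7},
    "Swin-Base":         {"5e-5": 98.0, "3e-5": 89.1, "1e-5": 3.9, "AdvAD": 100.0},
    "VisionMamba-Small": {"5e-5": 89.2, "3e-5": 78.3, "1e-5": 2.5, "AdvAD": 99.7},
}

def asr_gap(model, eta):
    """AdvAD ASR minus PGD-with-decay ASR, in percentage points."""
    return round(asr[model]["AdvAD"] - asr[model][eta], 1)

gaps = {(m, e): asr_gap(m, e) for m in asr for e in ("5e-5", "3e-5", "1e-5")}

# PSNR gap (dB) on ResNet-50 between AdvAD and eta = 5e-5, also from Table 6.
psnr_gap_resnet = round(51.84 - 46.75, 2)
```

A negative gap, as for ResNet-50 at $\eta$ = 5e-5, means the modified PGD has the higher ASR in that setting.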

Limitation. The primary focus of AdvAD is imperceptibility. Although it achieves better transferability at lower perturbation strength than other restricted imperceptible attacks, its transferability is inevitably weaker than that of black-box attack methods that operate in larger perturbation spaces and are specifically designed for transferability (such as the unrestricted ones). However, the proposed AdvAD is essentially a general attack paradigm with a novel modeling approach and a solid theoretical foundation. By relaxing the constraint on perturbation strength and incorporating designs that enhance transferability into the proposed non-parametric diffusion framework, AdvAD also has significant potential to be adapted into a dedicated black-box attack, and we leave this aspect for future research.

NeurIPS Paper Checklist

  1. Claims

     Question: Do the main claims made in the abstract and introduction accurately reflect the paper’s contributions and scope?

     Answer: [Yes]

     Justification: We present the motivation, innovation, overview of methods, experimental performance and the main contributions of our paper in the abstract and introduction. These claims are further explained and verified in Sec. 3 and Sec. 4, and detailed proofs are given in Appendix.

     Guidelines:

    • The answer NA means that the abstract and introduction do not include the claims made in the paper.

    • The abstract and/or introduction should clearly state the claims made, including the contributions made in the paper and important assumptions and limitations. A No or NA answer to this question will not be perceived well by the reviewers.

    • The claims made should match theoretical and experimental results, and reflect how much the results can be expected to generalize to other settings.

    • It is fine to include aspirational goals as motivation as long as it is clear that these goals are not attained by the paper.

  2. Limitations

     Question: Does the paper discuss the limitations of the work performed by the authors?

     Answer: [Yes]

     Justification: For the proposed AdvAD, which mainly focuses on imperceptibility, we point out that there is a trade-off between imperceptibility and transferability for restricted attacks. The relevant discussions and experimental results are given in Sec. 3.4, Sec. 4.3, and Appendix D.4. Additionally, although the proposed AdvAD-X shows amazing performance under an ideal scenario using raw floating-point data for attacking and brings theoretical value, its very small floating-point perturbations are easily erased by the quantization process when the image is actually stored.

     Guidelines:

    • The answer NA means that the paper has no limitation while the answer No means that the paper has limitations, but those are not discussed in the paper.

    • The authors are encouraged to create a separate "Limitations" section in their paper.

    • The paper should point out any strong assumptions and how robust the results are to violations of these assumptions (e.g., independence assumptions, noiseless settings, model well-specification, asymptotic approximations only holding locally). The authors should reflect on how these assumptions might be violated in practice and what the implications would be.

    • The authors should reflect on the scope of the claims made, e.g., if the approach was only tested on a few datasets or with a few runs. In general, empirical results often depend on implicit assumptions, which should be articulated.

    • The authors should reflect on the factors that influence the performance of the approach. For example, a facial recognition algorithm may perform poorly when image resolution is low or images are taken in low lighting. Or a speech-to-text system might not be used reliably to provide closed captions for online lectures because it fails to handle technical jargon.

    • The authors should discuss the computational efficiency of the proposed algorithms and how they scale with dataset size.

    • If applicable, the authors should discuss possible limitations of their approach to address problems of privacy and fairness.

    • While the authors might fear that complete honesty about limitations might be used by reviewers as grounds for rejection, a worse outcome might be that reviewers discover limitations that aren’t acknowledged in the paper. The authors should use their best judgment and recognize that individual actions in favor of transparency play an important role in developing norms that preserve the integrity of the community. Reviewers will be specifically instructed to not penalize honesty concerning limitations.
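The quantization limitation noted in the justification above can be illustrated with a small sketch. It assumes pixel values in [0, 1] that are stored as 8-bit integers by rounding to the nearest of 256 levels. The function names and sample values are hypothetical and are not taken from the AdvAD code.

```python
def quantize_8bit(pixels):
    """Round floating-point pixels in [0, 1] to 8-bit integers in [0, 255]."""
    return [min(255, max(0, round(p * 255))) for p in pixels]


def perturbation_survives(original, adversarial):
    """Return True if any pixel still differs after 8-bit quantization."""
    return quantize_8bit(original) != quantize_8bit(adversarial)


# Hypothetical pixel values chosen for illustration only.
original = [0.2, 0.4, 0.8]

# A perturbation far below half a quantization step (0.5 / 255 ≈ 0.00196).
tiny = [p + 1e-4 for p in original]

# A perturbation of one full quantization step (1 / 255 ≈ 0.00392).
larger = [p + 1 / 255 for p in original]

print(perturbation_survives(original, tiny))    # stored images are identical
print(perturbation_survives(original, larger))  # stored images differ
```

Under these assumptions, any perturbation that never moves a pixel across a rounding boundary disappears once the image is saved in an 8-bit format.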

  11. 3. Theory Assumptions and Proofs

  12. Question: For each theoretical result, does the paper provide the full set of assumptions and a complete (and correct) proof?

  13. Answer: [Yes]

  14. Justification: The proposed attack method is built on a solid theoretical foundation derived from the formulation of diffusion models. The specific methods and theoretical properties are introduced in detail in Sec. 3, and the detailed proofs of the proposed theorem and two propositions are given in the Appendix.

  15. Guidelines:

    • The answer NA means that the paper does not include theoretical results.

    • All the theorems, formulas, and proofs in the paper should be numbered and cross-referenced.

    • All assumptions should be clearly stated or referenced in the statement of any theorems.

    • The proofs can either appear in the main paper or the supplemental material, but if they appear in the supplemental material, the authors are encouraged to provide a short proof sketch to provide intuition.

    • Inversely, any informal proof provided in the core of the paper should be complemented by formal proofs provided in appendix or supplemental material.

    • Theorems and Lemmas that the proof relies upon should be properly referenced.

  16. 4. Experimental Result Reproducibility

  17. Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)?

  18. Answer: [Yes]

  19. Justification: We present all the formulas, calculation processes, algorithms, and hyperparameters of the proposed methods in detail. We have also verified that, with a fixed random seed, our method produces consistent results for the same input on our machine, providing good reproducibility.

  20. Guidelines:

    • The answer NA means that the paper does not include experiments.

    • If the paper includes experiments, a No answer to this question will not be perceived well by the reviewers: Making the paper reproducible is important, regardless of whether the code and data are provided or not.

    • If the contribution is a dataset and/or model, the authors should describe the steps taken to make their results reproducible or verifiable.

    • Depending on the contribution, reproducibility can be accomplished in various ways. For example, if the contribution is a novel architecture, describing the architecture fully might suffice, or if the contribution is a specific model and empirical evaluation, it may be necessary to either make it possible for others to replicate the model with the same dataset, or provide access to the model. In general, releasing code and data is often one good way to accomplish this, but reproducibility can also be provided via detailed instructions for how to replicate the results, access to a hosted model (e.g., in the case of a large language model), releasing of a model checkpoint, or other means that are appropriate to the research performed.

    • While NeurIPS does not require releasing code, the conference does require all submissions to provide some reasonable avenue for reproducibility, which may depend on the nature of the contribution. For example:

      (a) If the contribution is primarily a new algorithm, the paper should make it clear how to reproduce that algorithm.

      (b) If the contribution is primarily a new model architecture, the paper should describe the architecture clearly and fully.

      (c) If the contribution is a new model (e.g., a large language model), then there should either be a way to access this model for reproducing the results or a way to reproduce the model (e.g., with an open-source dataset or instructions for how to construct the dataset).

      (d) We recognize that reproducibility may be tricky in some cases, in which case authors are welcome to describe the particular way they provide for reproducibility. In the case of closed-source models, it may be that access to the model is limited in some way (e.g., to registered users), but it should be possible for other researchers to have some path to reproducing or verifying the results.
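The fixed-seed consistency check mentioned in the justification for this item can be sketched as follows. This is a minimal illustration using only Python's standard `random` module. `noisy_procedure` is a hypothetical stand-in for any randomized step and is not part of the authors' implementation, which may need to seed other libraries as well.

```python
import random


def noisy_procedure(seed, n=5):
    """Hypothetical randomized routine: draws n Gaussian samples from a seeded generator."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


run_a = noisy_procedure(seed=0)
run_b = noisy_procedure(seed=0)
run_c = noisy_procedure(seed=1)

print(run_a == run_b)  # same seed and same input: identical outputs
print(run_a == run_c)  # different seed: outputs differ
```

Comparing two runs with the same seed, as above, is a simple way to confirm determinism before reporting single-run results.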

  21. 5. Open access to data and code

  22. Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?

  23. Answer: [Yes]

  24. Justification: We open-source the code on GitHub and give the Python environment requirements, dataset preparation, running commands, etc. Following our instructions, the experimental results given in the paper can be easily reproduced.

  25. Guidelines:

    • The answer NA means that the paper does not include experiments requiring code.

    • Please see the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details.

    • While we encourage the release of code and data, we understand that this might not be possible, so “No” is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark).

    • The instructions should contain the exact command and environment needed to run to reproduce the results. See the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details.

    • The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc.

    • The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why.

    • At submission time, to preserve anonymity, the authors should release anonymized versions (if applicable).

    • Providing as much information as possible in supplemental material (appended to the paper) is recommended, but including URLs to data and code is permitted.

  26. 6. Experimental Setting/Details

  27. Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results?

  28. Answer: [Yes]

  29. Justification: One of the main contributions of our paper is a novel adversarial attack modeling framework based on a non-parametric diffusion process. Thanks to this modeling approach, our attack method needs only two simple hyperparameters to accurately control the entire attack process, without the complex training or optimization processes required by other paradigms.

  30. Guidelines:

    • The answer NA means that the paper does not include experiments.

    • The experimental setting should be presented in the core of the paper to a level of detail that is necessary to appreciate the results and make sense of them.

    • The full details can be provided either with the code, in appendix, or as supplemental material.

  31. 7. Experiment Statistical Significance

  32. Question: Does the paper report error bars suitably and correctly defined or other appropriate information about the statistical significance of the experiments?

  33. Answer: [No]

  34. Justification: Only Figure 5 reports a statistical curve: the values of a theoretical upper bound measured during actual execution, calculated from actual data at a confidence level of 0.85 and shown with the confidence interval. For the other experiments, error bars over multiple runs are not included due to limited computing resources. However, as mentioned above, the methods proposed in this paper obtain consistent results across multiple runs when the random seed is fixed.

  35. Guidelines:

    • The answer NA means that the paper does not include experiments.

    • The authors should answer "Yes" if the results are accompanied by error bars, confidence intervals, or statistical significance tests, at least for the experiments that support the main claims of the paper.

    • The factors of variability that the error bars are capturing should be clearly stated (for example, train/test split, initialization, random drawing of some parameter, or overall run with given experimental conditions).

    • The method for calculating the error bars should be explained (closed form formula, call to a library function, bootstrap, etc.)

    • The assumptions made should be given (e.g., Normally distributed errors).

    • It should be clear whether the error bar is the standard deviation or the standard error of the mean.

    • It is OK to report 1-sigma error bars, but one should state it. The authors should preferably report a 2-sigma error bar than state that they have a 96% CI, if the hypothesis of Normality of errors is not verified.

    • For asymmetric distributions, the authors should be careful not to show in tables or figures symmetric error bars that would yield results that are out of range (e.g. negative error rates).

    • If error bars are reported in tables or plots, the authors should explain in the text how they were calculated and reference the corresponding figures or tables in the text.
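The guidelines above on error bars can be made concrete with a short sketch. It computes the mean, the standard error of the mean, and a two-sided confidence interval under a Normal approximation, using Python's standard `statistics` module. The sample values are invented for illustration, and this is not the procedure used to produce Figure 5.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.85):
    """Two-sided confidence interval for the mean, assuming approximately Normal errors."""
    m = mean(samples)
    sem = stdev(samples) / len(samples) ** 0.5  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    return m - z * sem, m + z * sem


# Hypothetical measurements from repeated runs (illustrative values only).
runs = [10.2, 9.8, 10.5, 10.1, 9.9]

low, high = confidence_interval(runs, level=0.85)
print(f"mean = {mean(runs):.3f}, 85% CI = [{low:.3f}, {high:.3f}]")
```

With only a handful of runs, a Student's t critical value would give a wider and more appropriate interval than the Normal approximation assumed here.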

  36. 8. Experiments Compute Resources

  37. Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments?

  38. Answer: [Yes]

  39. Justification: We state that all experiments are conducted on a single NVIDIA RTX 3090 GPU, and the running time required by all methods to attack different models is reported in Table 1, Table 5, and Table 8 to compare computational complexity and provide a reference.

  40. Guidelines:

    • The answer NA means that the paper does not include experiments.

    • The paper should indicate the type of compute workers CPU or GPU, internal cluster, or cloud provider, including relevant memory and storage.

    • The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute.

    • The paper should disclose whether the full research project required more compute than the experiments reported in the paper (e.g., preliminary or failed experiments that didn’t make it into the paper).

  41. 9. Code Of Ethics

  42. Question: Does the research conducted in the paper conform, in every respect, with the NeurIPS Code of Ethics https://neurips.cc/public/EthicsGuidelines?

  43. Answer: [Yes]

  44. Justification: We ensure that the research conducted in the paper complies with the NeurIPS Code of Ethics in all respects.

  45. Guidelines:

    • The answer NA means that the authors have not reviewed the NeurIPS Code of Ethics.

    • If the authors answer No, they should explain the special circumstances that require a deviation from the Code of Ethics.

    • The authors should make sure to preserve anonymity (e.g., if there is a special consideration due to laws or regulations in their jurisdiction).

  46. 10. Broader Impacts

  47. Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed?

  48. Answer: [Yes]

  49. Justification: As research on adversarial attacks against deep neural networks, the significance of this work lies in revealing possible attack algorithms and model vulnerabilities in advance, in order to help promote corresponding defense methods and model robustness, and to improve the safety of deep neural networks in real-world applications.

  50. Guidelines:

    • The answer NA means that there is no societal impact of the work performed.

    • If the authors answer NA or No, they should explain why their work has no societal impact or why the paper does not address societal impact.

    • Examples of negative societal impacts include potential malicious or unintended uses (e.g., disinformation, generating fake profiles, surveillance), fairness considerations (e.g., deployment of technologies that could make decisions that unfairly impact specific groups), privacy considerations, and security considerations.

    • The conference expects that many papers will be foundational research and not tied to particular applications, let alone deployments. However, if there is a direct path to any negative applications, the authors should point it out. For example, it is legitimate to point out that an improvement in the quality of generative models could be used to generate deepfakes for disinformation. On the other hand, it is not needed to point out that a generic algorithm for optimizing neural networks could enable people to train models that generate Deepfakes faster.

    • The authors should consider possible harms that could arise when the technology is being used as intended and functioning correctly, harms that could arise when the technology is being used as intended but gives incorrect results, and harms following from (intentional or unintentional) misuse of the technology.

    • If there are negative societal impacts, the authors could also discuss possible mitigation strategies (e.g., gated release of models, providing defenses in addition to attacks, mechanisms for monitoring misuse, mechanisms to monitor how a system learns from feedback over time, improving the efficiency and accessibility of ML).

  51. 11. Safeguards

  52. Question: Does the paper describe safeguards that have been put in place for responsible release of data or models that have a high risk for misuse (e.g., pretrained language models, image generators, or scraped datasets)?

  53. Answer: [N/A]

  54. Justification: The paper poses no such risks.

  55. Guidelines:

    • The answer NA means that the paper poses no such risks.

    • Released models that have a high risk for misuse or dual-use should be released with necessary safeguards to allow for controlled use of the model, for example by requiring that users adhere to usage guidelines or restrictions to access the model or implementing safety filters.

    • Datasets that have been scraped from the Internet could pose safety risks. The authors should describe how they avoided releasing unsafe images.

    • We recognize that providing effective safeguards is challenging, and many papers do not require this, but we encourage authors to take this into account and make a best faith effort.

  56. 12. Licenses for existing assets

  57. Question: Are the creators or original owners of assets (e.g., code, data, models), used in the paper, properly credited and are the license and terms of use explicitly mentioned and properly respected?

  58. Answer: [Yes]

  59. Justification: The dataset we used is under the MIT License, and it has been properly cited in the paper with its URL.

  60. Guidelines:

    • The answer NA means that the paper does not use existing assets.

    • The authors should cite the original paper that produced the code package or dataset.

    • The authors should state which version of the asset is used and, if possible, include a URL.

    • The name of the license (e.g., CC-BY 4.0) should be included for each asset.

    • For scraped data from a particular source (e.g., website), the copyright and terms of service of that source should be provided.

    • If assets are released, the license, copyright information, and terms of use in the package should be provided. For popular datasets, paperswithcode.com/datasets has curated licenses for some datasets. Their licensing guide can help determine the license of a dataset.

    • For existing datasets that are re-packaged, both the original license and the license of the derived asset (if it has changed) should be provided.

    • If this information is not available online, the authors are encouraged to reach out to the asset’s creators.

  61. 13. New Assets

  62. Question: Are new assets introduced in the paper well documented and is the documentation provided alongside the assets?

  63. Answer: [N/A]

  64. Justification: The paper does not release new assets.

  65. Guidelines:

    • The answer NA means that the paper does not release new assets.

    • Researchers should communicate the details of the dataset/code/model as part of their submissions via structured templates. This includes details about training, license, limitations, etc.

    • The paper should discuss whether and how consent was obtained from people whose asset is used.

    • At submission time, remember to anonymize your assets (if applicable). You can either create an anonymized URL or include an anonymized zip file.

  66. 14. Crowdsourcing and Research with Human Subjects

  67. Question: For crowdsourcing experiments and research with human subjects, does the paper include the full text of instructions given to participants and screenshots, if applicable, as well as details about compensation (if any)?

  68. Answer: [N/A]

  69. Justification: The paper does not involve crowdsourcing nor research with human subjects.

  70. Guidelines:

    • The answer NA means that the paper does not involve crowdsourcing nor research with human subjects.

    • Including this information in the supplemental material is fine, but if the main contribution of the paper involves human subjects, then as much detail as possible should be included in the main paper.

    • According to the NeurIPS Code of Ethics, workers involved in data collection, curation, or other labor should be paid at least the minimum wage in the country of the data collector.

  71. 15. Institutional Review Board (IRB) Approvals or Equivalent for Research with Human Subjects

  72. Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or institution) were obtained?

  73. Answer: [N/A]

  74. Justification: The paper does not involve crowdsourcing nor research with human subjects.

  75. Guidelines:

    • The answer NA means that the paper does not involve crowdsourcing nor research with human subjects.

    • Depending on the country in which research is conducted, IRB approval (or equivalent) may be required for any human subjects research. If you obtained IRB approval, you should clearly state this in the paper.

    • We recognize that the procedures for this may vary significantly between institutions and locations, and we expect authors to adhere to the NeurIPS Code of Ethics and the guidelines for their institution.

    • For initial submissions, do not include any information that would break anonymity (if applicable), such as the institution conducting the review.