I Introduction
Channel coding is a fundamental problem focused on the reliable transmission of information over a noisy channel. Information transmission with arbitrarily small error probability is possible at all rates below the capacity $C$ of the channel, if the number $n$ of channel uses (also called the blocklength) is permitted to grow without bound [1]. At finite blocklengths, there is an unavoidable backoff from capacity due to the random nature of the channel. The second-order coding rate (SOCR) [2, 3, 4, 5, 6] quantifies the $O(n^{-1/2})$ convergence to the capacity.
In many practical scenarios, the channel input is subject to some cost constraints which limit the amount of resources that can be used for transmission. With a cost constraint present, the role of capacity is replaced by the capacity-cost function [1, Theorem 6.11]. One common form of the cost constraint is the almost-sure (a.s.) cost constraint [3, 7], which bounds the time-average cost of the channel input $X^n$ over all messages, realizations of any side randomness, channel noise (if there is feedback), etc.:
$$\frac{1}{n}\sum_{i=1}^{n} c(X_i) \leq \Gamma \quad \text{almost surely,} \tag{1}$$
where $c(\cdot)$ is the cost function. Under the almost-sure (a.s.) cost constraint, the optimal first-order coding rate is the capacity-cost function, the strong converse holds [1, Theorem 6.11], and the optimal SOCR is also known [3, Theorem 3].
The a.s. cost constraint is quite unforgiving, never allowing the
cost to exceed the threshold under any circumstances.
Our first result (Theorem 1) shows that the SOCR can be strictly improved by merely allowing the cost to fluctuate above the threshold in a manner consistent with a noise process, i.e., the fluctuations have a variance of $O(1/n)$.
Our second result (Theorem 2) shows that the a.s. cost framework does not allow feedback improvement to SOCR for simple-dispersion DMCs. This again contrasts with the scenario where random fluctuations with variance $O(1/n)$ above the threshold are allowed, as shown in [8, Theorem 3]. This highlights the important role cost variability plays in enabling feedback mechanisms to improve coding performance.
These findings raise the question of whether it is necessary to
impose a constraint as stringent as (1).
Cost constraints in communication systems are typically imposed to achieve goals such as operating circuitry in the linear regime, minimizing power consumption, and reducing interference with other terminals. It is worth noting that these goals do not always necessitate the use of the strict a.s. cost constraint. For example, the expected cost constraint is often used in wireless communication literature (see, e.g., [9, 10, 11, 12]) because it allows for a dynamic allocation of power based on the current channel state. The expected cost constraint bounds the cost averaged over time and the ensemble:
$$\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} c(X_i)\right] \leq \Gamma. \tag{2}$$
Yet, if the a.s. constraint is too strict, the expectation constraint
is arguably too weak.
The expectation constraint allows highly non-ergodic use of power,
as shown in Section II-A,
which is problematic both from the vantage points of operating circuitry
in the linear regime and interference management.
The $O(1/n)$ variance allowance is a feature of a new cost formulation, referred to as mean and variance cost constraints in [8]. This formulation replaces (1) with the following conditions:
$$\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n} c(X_i)\right] \leq \Gamma, \tag{3}$$
$$\text{Var}\left(\frac{1}{n}\sum_{i=1}^{n} c(X_i)\right) \leq \frac{V}{n}. \tag{4}$$
The mean and variance cost constraints were introduced as a relaxed version of the a.s. cost constraint that permits a small amount of stochastic
fluctuation above the threshold $\Gamma$ while providing an
ergodicity guarantee.
Consider a random channel codebook whose codewords satisfy (3) with equality. For a given input $x^n$, define an ergodicity metric $\mathcal{E}_m$ as
$$\mathcal{E}_m(x^n) := \frac{\max\left(\frac{1}{n}\sum_{i=1}^{n} c(x_i) - \Gamma,\, 0\right)}{\Gamma}. \tag{5}$$
The definition in (5) only penalizes cost variation above the threshold and normalizes by the mean cost $\Gamma$. Let $\alpha > 0$ be the desired ergodicity parameter. We say that a transmission $x^n$ is $\alpha$-ergodic if $\mathcal{E}_m(x^n) \leq \alpha$. Let $\beta$ be the desired uncertainty parameter. We say that a random codebook is $(\alpha, \beta)$-ergodic if $\mathbb{P}\left(\mathcal{E}_m(X^n) \leq \alpha\right) \geq 1 - \beta$, where $X^n$ is a random transmission from the codebook.
Under the mean and variance cost formulation, we have $\mathbb{P}\left(\mathcal{E}_m(X^n) \leq \alpha\right) \geq 1 - \beta$ if
$$n \geq n_c := \frac{V}{\beta \alpha^2 \Gamma^2}, \tag{6}$$
where we call $n_c$ the critical blocklength. Thus, the critical blocklength specifies the minimum blocklength of a channel code for which transmission behaves ergodically with high probability.
For fixed $\alpha$, $\beta$, and $\Gamma$, the parameters $n_c$ and $V$ are in one-to-one correspondence, so one can view the choice of $V$ in (4) as specifying the critical blocklength.
Note that with an expectation-only constraint, we effectively have $V = \infty$, so the transmission is not guaranteed to be ergodic at any blocklength.
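The sufficiency of the critical blocklength condition is consistent with Chebyshev's inequality: when the mean cost equals $\Gamma$, the probability that the time-average cost exceeds $(1+\alpha)\Gamma$ is at most $V/(n\alpha^2\Gamma^2)$, which drops to $\beta$ or below once $n \geq n_c$. The following Python sketch evaluates the ergodicity metric, the critical blocklength, and this Chebyshev bound. The function names and numerical values are illustrative assumptions and do not come from the paper.

```python
import math


def ergodicity_metric(costs, gamma):
    """Relative excess of the time-average cost over the threshold gamma."""
    avg_cost = sum(costs) / len(costs)
    return max(avg_cost - gamma, 0.0) / gamma


def critical_blocklength(V, alpha, beta, gamma):
    """Critical blocklength n_c = V / (beta * alpha^2 * gamma^2), rounded up."""
    return math.ceil(V / (beta * alpha**2 * gamma**2))


def chebyshev_violation_bound(n, V, alpha, gamma):
    """Chebyshev upper bound on P(metric > alpha) when the mean cost is gamma."""
    return V / (n * alpha**2 * gamma**2)


# Hypothetical parameters, chosen only so the arithmetic is exact in floating point.
V, alpha, beta, gamma = 1.0, 0.25, 0.125, 2.0
n_c = critical_blocklength(V, alpha, beta, gamma)
bound_at_n_c = chebyshev_violation_bound(n_c, V, alpha, gamma)
print(n_c, bound_at_n_c)

# A single hypothetical transmission with per-symbol costs 1, 3, 2, 4.
print(ergodicity_metric([1.0, 3.0, 2.0, 4.0], gamma))
```

With these values the bound at $n = n_c$ coincides with $\beta$, which illustrates why a smaller $V$ translates directly into a shorter critical blocklength.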
Furthermore, unlike the expected cost constraint, the mean and variance cost formulation:
• allows for a strong converse [13, Theorem 77], [8],
• allows for a finite second-order coding rate [8],
• does not allow blasting power on errors in the feedback case [14].
The results of this paper also have significance in the context of previous works. Our result in Theorem 2 extends the previously known result that feedback does not improve the second-order performance for simple-dispersion DMCs without cost constraints [4]. It is also similar to the result in [15] that feedback does not improve the second-order performance for AWGN channels.
Random channel coding schemes often use independent and identically distributed (i.i.d.) codewords. It was noted in [16] that the a.s. cost constraint, which is the most commonly considered cost constraint in the context of discrete memoryless channels (DMCs), prohibits the use of i.i.d. codewords. It was shown in [16] that a feedback scheme that uses both i.i.d. and constant-composition codewords leads to an improved SOCR compared to the best non-feedback SOCR achievable under the a.s. cost constraint. Our result in Theorem 2 strengthens the result in [16] by showing that the aforementioned improvement also holds compared to the best feedback SOCR achievable under the a.s. cost constraint.
I-A Related Work
The second- and third-order asymptotics for DMCs with the a.s. cost constraint in the non-feedback setting have been characterized in [3] and [17], respectively. The second-order asymptotics in the feedback setting of DMCs that are not simple-dispersion are studied in [4] without cost constraints. There are more feedback results available for AWGN channels compared to DMCs under the a.s. cost constraint. For example, the result in [15] also addresses the third-order performance with feedback while [18] gives the result that feedback does not improve the second-order performance for parallel Gaussian channels. The second-order performance for the AWGN channel with an expected cost constraint is characterized in [19]. Table I summarizes these results across different settings in channel coding.
| Paper | Channel | Performance | Cost Constraint | Feedback | Non-feedback |
|---|---|---|---|---|---|
| Hayashi [3] | DMC, AWGN | 2nd order | a.s. | No | Yes |
| Tan and Tomamichel [20] | AWGN | 3rd order | a.s. | No | Yes |
| Kostina and Verdú [17] | DMC | 3rd order | a.s. | No | Yes |
| Fong and Tan [15] | AWGN | 2nd and 3rd order | a.s. | Yes | No |
| Wagner, Shende and Altuğ [4] | DMC | 2nd order | none | Yes | No |
| Mahmood and Wagner [8] | DMC | 2nd order | mean and variance | Yes | Yes |
| This paper | DMC | 2nd order | mean and variance, a.s. | Yes | Yes |
| Polyanskiy [13, Th. 78] | Parallel AWGN | 2nd order | a.s. | No | Yes |
| Fong and Tan [18] | Parallel AWGN | 2nd order | a.s. | Yes | No |
| Polyanskiy [13, Th. 77] | AWGN | 1st order | expected cost | No | Yes |
| Yang et al. [19] | AWGN | 2nd order | expected cost | No | Yes |
TABLE I: Relevant results across different settings in channel coding.
Our proof technique for Theorem 2 is more closely aligned with that used in [4] for DMCs than in [15] for AWGN channels. Both proofs show converse bounds with feedback that match the previously known non-feedback achievability results for DMCs and AWGN channels, respectively. A common technique used in both converse proofs is a result from binary hypothesis testing, which is used in the derivation of Lemma 1 in our paper and a similar result in [15, (17)]. We then proceed with the proof by using a Berry-Esseen-type result for bounded martingale difference sequences, whereas [15] uses the usual Berry-Esseen theorem by first showing equality in distribution of the information density with a sum of i.i.d. random variables.
II Preliminaries
Let $\mathcal{A}$ and $\mathcal{B}$ be finite input and output alphabets, respectively, of the DMC $W$, where $W$ is a stochastic matrix from $\mathcal{A}$ to $\mathcal{B}$. For a given sequence $x^n \in \mathcal{A}^n$, the $n$-type $t = t(x^n)$ of $x^n$ is defined as
$$t(a) = \frac{1}{n}\sum_{i=1}^{n} \mathds{1}(x_i = a)$$
for all $a \in \mathcal{A}$, where $\mathds{1}(\cdot)$ is the standard indicator function. For a given sequence $x^n \in \mathcal{A}^n$, we will use $t(x^n)$ or $P_{x^n}$ to denote its type. Let $\mathcal{P}_n(\mathcal{A})$ be the set of $n$-types on $\mathcal{A}$. For a given $t \in \mathcal{P}_n(\mathcal{A})$, $T^n_{\mathcal{A}}(t)$ denotes the type class, i.e., the set of sequences $x^n \in \mathcal{A}^n$ with empirical distribution equal to $t$. For a random variable $Z$, $\|Z\|_{\infty}$ denotes its essential supremum (that is, the infimum of those numbers $z$ such that $\mathbb{P}(Z \leq z) = 1$). We will write $\log$ to denote logarithm to the base $e$ and $\exp(x)$ to denote $e$ to the power of $x$. The cost function is denoted by $c(\cdot)$ where $c : \mathcal{A} \to [0, c_{\max}]$ and $c_{\max} > 0$ is a constant. Let $\Gamma_0 = \min_{a \in \mathcal{A}} c(a)$. Let $\Gamma^*$ denote the smallest $\Gamma$ such that the capacity-cost function $C(\Gamma)$ is equal to the unconstrained capacity. We assume $\Gamma^* > \Gamma_0$ and $\Gamma \in (\Gamma_0, \Gamma^*)$ throughout the paper. For $\Gamma \in (\Gamma_0, \Gamma^*)$, the capacity-cost function is defined as
$$C(\Gamma) = \max_{\substack{P \in \mathcal{P}(\mathcal{A}) \\ c(P) \leq \Gamma}} I(P, W), \tag{7}$$
where $c(P) := \sum_{a \in \mathcal{A}} P(a) c(a)$. The function $C(\Gamma)$ is strictly increasing and differentiable [1, Problem 8.4] in the interval $(\Gamma_0, \Gamma^*)$. For a given $x^n \in \mathcal{A}^n$, we define
$$c(x^n) := \frac{1}{n}\sum_{i=1}^{n} c(x_i).$$
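The type and cost definitions above are linked by the identity $c(x^n) = c(P_{x^n})$: the average cost of a sequence depends only on its type. The short Python sketch below checks this on a toy example. The ternary alphabet, the cost values, and the function names are hypothetical and serve only as an illustration.

```python
from collections import Counter
from fractions import Fraction


def n_type(seq, alphabet):
    """Return the n-type t(a) = (1/n) * #{i : x_i = a} as exact fractions."""
    n = len(seq)
    counts = Counter(seq)  # symbols that never occur get a count of zero
    return {a: Fraction(counts[a], n) for a in alphabet}


def cost_of_distribution(P, cost):
    """c(P) = sum over a of P(a) * c(a)."""
    return sum(P[a] * cost[a] for a in P)


def cost_of_sequence(seq, cost):
    """c(x^n) = (1/n) * sum over i of c(x_i)."""
    return sum(cost[x] for x in seq) / len(seq)


# Hypothetical ternary alphabet and cost function, for illustration only.
alphabet = ["a", "b", "c"]
cost = {"a": Fraction(0), "b": Fraction(1), "c": Fraction(3)}
x = ["a", "b", "b", "c"]

t = n_type(x, alphabet)
print(t)
print(cost_of_sequence(x, cost), cost_of_distribution(t, cost))
```

Exact rational arithmetic is used so that the two cost computations can be compared for equality without floating-point tolerance.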
Let $\Pi^*_{W,\Gamma}$ be the set of all capacity-cost-achieving distributions, i.e., the set of maximizing distributions in (7). For any $P^* \in \Pi^*_{W,\Gamma}$, let $Q^* = P^* W$ be the marginal distribution on $\mathcal{B}$. Note that the output distribution $Q^*$ is always unique, and without loss of generality, $Q^*$ can be assumed to satisfy $Q^*(b) > 0$ for all $b \in \mathcal{B}$ [21, Corollaries 1 and 2 to Theorem 4.5.1].
The following definitions will remain in effect throughout the paper:
$$\nu_a := \text{Var}\left(\log \frac{W(Y|a)}{Q^*(Y)}\right), \quad \text{where } Y \sim W(\cdot|a),$$
$$\nu_{\max} := \max_{a \in \mathcal{A}} \nu_a,$$
$$i(a, b) := \log \frac{W(b|a)}{Q^*(b)}.$$
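As a concrete illustration of these quantities, the Python sketch below computes the information density, the per-input variances $\nu_a$, and $\nu_{\max}$ for a channel matrix and an output distribution supplied by the caller. It does not solve for $Q^*$ itself. The binary symmetric channel and the uniform output distribution used as a stand-in for $Q^*$ are assumptions made only for this example, as are the function names.

```python
import math


def information_density(W, Q, a, b):
    """i(a, b) = log(W(b|a) / Q(b)), using the natural logarithm."""
    return math.log(W[a][b] / Q[b])


def nu(W, Q, a):
    """Variance of i(a, Y) for Y ~ W(.|a); zero-probability outputs are skipped."""
    support = [b for b in range(len(Q)) if W[a][b] > 0]
    mean = sum(W[a][b] * information_density(W, Q, a, b) for b in support)
    return sum(
        W[a][b] * (information_density(W, Q, a, b) - mean) ** 2 for b in support
    )


def nu_max(W, Q):
    """Largest per-input variance over the input alphabet."""
    return max(nu(W, Q, a) for a in range(len(W)))


# Hypothetical example: binary symmetric channel with crossover probability p,
# with a uniform output distribution standing in for Q^*.
p = 0.1
W = [[1 - p, p], [p, 1 - p]]
Q = [0.5, 0.5]

print(nu(W, Q, 0), nu(W, Q, 1), nu_max(W, Q))
```

For this symmetric example both inputs have the same variance, $p(1-p)\log^2\frac{1-p}{p}$, so $\nu_{\max}$ equals either of them.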
Definition 1 (cf. [4])
A DMC $W$ is called simple-dispersion at the cost $\Gamma \in (\Gamma_0, \Gamma^*)$ if
$$\min_{P^* \in \Pi^*_{W,\Gamma}} \sum_{a \in \mathcal{A}} P^*(a) \nu_a = \max_{P^* \in \Pi^*_{W,\Gamma}} \sum_{a \in \mathcal{A}} P^*(a) \nu_a.$$
We will only focus on simple-dispersion channels for a fixed cost $\Gamma \in (\Gamma_0, \Gamma^*)$ and thus define
$$V(\Gamma) := \sum_{a \in \mathcal{A}} P^*(a) \nu_a$$
for any $P^* \in \Pi^*_{W,\Gamma}$.
With a blocklength $n$ and a fixed rate $R > 0$, let $\mathcal{M}_R = \{1, \ldots, \lceil \exp(nR) \rceil\}$ denote the message set. Let $M \in \mathcal{M}_R$ denote the random message drawn uniformly from the message set.
Definition 2
An $(n, R)$ code for a DMC consists of an encoder $f$ which, for each message $m \in \mathcal{M}_R$, chooses an input $X^n = f(m) \in \mathcal{A}^n$, and a decoder $g$ which maps the output $Y^n$ to $\hat{m} \in \mathcal{M}_R$. The code $(f, g)$ is random if $f$ or $g$ is random.
Definition 3
An $(n, R)$ code with ideal feedback for a DMC consists of an encoder $f$ which, at each time instant $k$ ($1 \leq k \leq n$) and for each message $m \in \mathcal{M}_R$, chooses an input $x_k = f(m, x^{k-1}, y^{k-1}) \in \mathcal{A}$, and a decoder $g$ which maps the output $y^n$ to $\hat{m} \in \mathcal{M}_R$. The code $(f, g)$ is random if $f$ or $g$ is random.
Definition 4
An $(n, R, \Gamma)$ code for a DMC is an $(n, R)$ code such that $c(X^n) \leq \Gamma$ almost surely, where the message $M \sim \text{Unif}(\mathcal{M}_R)$ has a uniform distribution over the message set $\mathcal{M}_R$.
Definition 5
An $(n, R, \Gamma)$ code with ideal feedback for a DMC is an $(n, R)$ code with ideal feedback such that $c(X^n) \leq \Gamma$ almost surely, where the message $M \sim \text{Unif}(\mathcal{M}_R)$ has a uniform distribution over the message set $\mathcal{M}_R$.
Definition 6
An $(n, R, \Gamma, V)$ code for a DMC is an $(n, R)$ code such that $\mathbb{E}\left[\sum_{i=1}^{n} c(X_i)\right] \leq n\Gamma$ and $\text{Var}\left(\sum_{i=1}^{n} c(X_i)\right) \leq nV$, where the message $M \sim \text{Unif}(\mathcal{M}_R)$ has a uniform distribution over the message set $\mathcal{M}_R$.
Definition 7
An $(n, R, \Gamma, V)$ code with ideal feedback for a DMC is an $(n, R)$ code with ideal feedback such that $\mathbb{E}\left[\sum_{i=1}^{n} c(X_i)\right] \leq n\Gamma$ and $\text{Var}\left(\sum_{i=1}^{n} c(X_i)\right) \leq nV$, where the message $M \sim \text{Unif}(\mathcal{M}_R)$ has a uniform distribution over the message set $\mathcal{M}_R$.
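To make the difference between the almost-sure and the mean and variance constraints concrete, the Python sketch below checks both for a small deterministic non-feedback codebook, where the only randomness is the uniformly distributed message. The toy codebook, the binary cost function, and the function names are hypothetical. Random encoders and feedback codes would require averaging over additional randomness that is not modeled here.

```python
def total_costs(codebook, cost):
    """Total cost sum_i c(x_i) of every codeword."""
    return [sum(cost[x] for x in codeword) for codeword in codebook]


def satisfies_almost_sure(codebook, cost, gamma):
    """Almost-sure constraint: every codeword has total cost at most n * gamma."""
    n = len(codebook[0])
    return all(s <= n * gamma for s in total_costs(codebook, cost))


def satisfies_mean_variance(codebook, cost, gamma, V):
    """Mean and variance constraints with the message uniform over the codebook."""
    n = len(codebook[0])
    sums = total_costs(codebook, cost)
    mean = sum(sums) / len(sums)
    var = sum((s - mean) ** 2 for s in sums) / len(sums)
    return mean <= n * gamma and var <= n * V


# Hypothetical toy codebook over {0, 1} with c(0) = 0 and c(1) = 1.
cost = {0: 0.0, 1: 1.0}
codebook = [(0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 0, 1), (0, 0, 1, 1)]

print(satisfies_almost_sure(codebook, cost, 0.5))
print(satisfies_mean_variance(codebook, cost, 0.5, 0.25))
```

In this example one codeword exceeds the per-codeword budget, so the almost-sure constraint fails, while the codebook still meets the mean constraint with equality and has a total-cost variance within $nV$.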
Given $\epsilon \in (0, 1)$, define
$$M^*_{\text{fb}}(n, \epsilon, \Gamma) := \max\{\lceil \exp(nR) \rceil : \bar{P}_{\text{e,fb}}(n, R, \Gamma) \leq \epsilon\},$$
where $\bar{P}_{\text{e,fb}}(n, R, \Gamma)$ denotes the minimum average error probability attainable by any random $(n, R, \Gamma)$ code with feedback. Similarly, define
$$M^*(n, \epsilon, \Gamma) := \max\{\lceil \exp(nR) \rceil : \bar{P}_{\text{e}}(n, R, \Gamma) \leq \epsilon\},$$
where $\bar{P}_{\text{e}}(n, R, \Gamma)$ denotes the minimum average error probability attainable by any random $(n, R, \Gamma)$ code without feedback. Define $M^*_{\text{fb}}(n, \epsilon, \Gamma, V)$ and $M^*(n, \epsilon, \Gamma, V)$ similarly for codes with mean and variance cost constraints.
II-A Expectation-only cost constraint
Under this cost formulation, the average cost of the codewords is constrained in expectation only:
$$\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n}c(X_{i})\right]\leq\Gamma. \tag{8}$$
We now illustrate a codebook construction (adapted from [17]) with average error probability at most $\epsilon\in(0,1)$ that meets the cost threshold $\Gamma$ according to (8) but whose codeword cost is non-ergodic, i.e., $\frac{1}{n}\sum_{i=1}^{n}c(X_{i})$ does not converge to $\Gamma$. Consider a codebook $\mathcal{C}_{n}$ with rate $C(\Gamma)<R<C\left(\frac{\Gamma}{1-\epsilon}\right)$ whose average error probability $\epsilon_{n}\to 0$ and each of whose codewords has average cost equal to $\frac{\Gamma}{1-\epsilon}$. Such a codebook exists because $R<C\left(\frac{\Gamma}{1-\epsilon}\right)$.
Assuming $\Gamma_{0}=\min_{a\in\mathcal{A}}c(a)=0$ without loss of generality, one could modify the codebook $\mathcal{C}_{n}$ by replacing an $\epsilon$-fraction of its codewords with the all-zero codeword. The modified codebook $\mathcal{C}_{n}^{\prime}$ has average error probability at most $\epsilon_{n}^{\prime}\to\epsilon$ and meets the cost threshold $\Gamma$ according to (8). But $\frac{1}{n}\sum_{i=1}^{n}c(X_{i})$ is either $0$ or $\frac{\Gamma}{1-\epsilon}$. This construction also shows that the strong converse does not hold under the expected cost constraint.
The mean and variance cost constraints ensure that the average cost of the codewords concentrates around the cost threshold $\Gamma$, thereby disallowing codebook constructions with irregular or non-ergodic power consumption.
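The effect of the modified codebook on the cost statistics can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original construction: the function name and the numerical values of $\Gamma$, $\epsilon$, $V$ and $n$ are illustrative assumptions. It treats the total cost $\sum_{i=1}^{n}c(X_{i})$ as a two-valued random variable, equal to $0$ with probability $\epsilon$ and to $n\Gamma/(1-\epsilon)$ otherwise.

```python
def mixed_codebook_cost_stats(gamma, eps, n):
    """Mean and variance of the total cost sum_i c(X_i) for the modified codebook.

    Assumption: exactly an eps-fraction of the equiprobable codewords is the
    all-zero codeword (total cost 0); every other codeword has total cost
    n * gamma / (1 - eps).
    """
    high = n * gamma / (1.0 - eps)           # total cost of an unmodified codeword
    mean = (1.0 - eps) * high                # eps * 0 + (1 - eps) * high = n * gamma
    second_moment = (1.0 - eps) * high ** 2
    var = second_moment - mean ** 2          # = n^2 * gamma^2 * eps / (1 - eps)
    return mean, var


if __name__ == "__main__":
    gamma, eps, V = 1.0, 0.1, 5.0            # illustrative values only
    for n in (10, 100, 1000):
        mean, var = mixed_codebook_cost_stats(gamma, eps, n)
        # Columns: n, per-letter mean cost, variance, whether Var <= n * V holds
        print(n, mean / n, var, var <= n * V)
```

Under these assumptions the per-letter expected cost equals $\Gamma$ for every $n$, while the variance of the total cost grows as $n^{2}\Gamma^{2}\epsilon/(1-\epsilon)$, so a constraint of the form $\text{Var}\left(\sum_{i=1}^{n}c(X_{i})\right)\leq nV$ is violated once $n>V(1-\epsilon)/(\Gamma^{2}\epsilon)$.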
IV Proof of Theorem 1
Since $\mathcal{K}(r,V)$ is a continuous function [8, Lemma 3], it suffices to show that for all $r\in\mathbb{R}$ and $V>0$,
$$\min_{\begin{subarray}{c}\Pi:\\ \mathbb{E}[\Pi]=r\\ \text{Var}(\Pi)\leq V\\ |\text{supp}(\Pi)|\leq 2\end{subarray}}\mathbb{E}\left[\Phi(\Pi)\right]<\Phi(r). \tag{17}$$
The LHS of (17) can be written as
$$\min_{\begin{subarray}{c}p,\pi:\\ 0\leq p\leq 1\\ \frac{p}{1-p}(\pi-r)^{2}\leq V\end{subarray}}\left[p\,\Phi(\pi)+(1-p)\Phi\left(\frac{r-p\pi}{1-p}\right)\right],$$
where we used the constraint $\mathbb{E}[\Pi]=r$ to eliminate one of the decision variables.
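For completeness, the elimination can be spelled out as follows. The symbol $\pi'$ for the second mass point is introduced here only for illustration, and $p<1$ is assumed. Let $\Pi$ equal $\pi$ with probability $p$ and $\pi'$ with probability $1-p$. The mean constraint gives
$$p\pi+(1-p)\pi'=r\quad\Longrightarrow\quad\pi'=\frac{r-p\pi}{1-p},\qquad \pi'-r=-\frac{p}{1-p}(\pi-r),$$
and therefore
$$\text{Var}(\Pi)=p(\pi-r)^{2}+(1-p)(\pi'-r)^{2}=p(\pi-r)^{2}+\frac{p^{2}}{1-p}(\pi-r)^{2}=\frac{p}{1-p}(\pi-r)^{2},$$
which is the quantity bounded by $V$ in the constraint set.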
Assume by contradiction that
$$p\,\Phi(\pi)+(1-p)\Phi\left(\frac{r-p\pi}{1-p}\right)\geq\Phi(r) \tag{18}$$
for all $\pi\geq r$, $p\in[0,1]$ and $\frac{p}{1-p}(\pi-r)^{2}\leq V$. The assumption $\pi\geq r$ is without loss of generality since one of the two point masses must be greater than or equal to $r$.
If (18) holds, then
$$p\,\Phi(\pi)+(1-p)\Phi\left(\frac{r-p\pi}{1-p}\right)\geq\Phi(r)$$
for all $\pi\geq r$, $p\in[0,1]$ and $\frac{p}{1-p}(\pi-r)^{2}=V$. Since $\pi=r+\sqrt{\frac{V(1-p)}{p}}$ in this case, we must have
\begin{align}
p\,\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+(1-p)\Phi\left(\frac{r}{1-p}-\frac{p}{1-p}\left(r+\sqrt{\frac{V(1-p)}{p}}\right)\right)&\geq\Phi(r) \notag\\
p\,\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+(1-p)\Phi\left(r-\frac{p}{1-p}\sqrt{\frac{V(1-p)}{p}}\right)&\geq\Phi(r) \notag\\
p\,\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+(1-p)\Phi\left(r-\sqrt{\frac{Vp}{1-p}}\right)&\geq\Phi(r) \tag{19}
\end{align}
for all $p\in[0,1]$.
Consider the function
$$f(p)=p\,\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+(1-p)\Phi\left(r-\sqrt{\frac{Vp}{1-p}}\right)-\Phi(r)$$
with domain $p\in[0,1]$. For any $p\in(0,1)$,
\begin{align}
\frac{f(p)-f(0)}{p}&=\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+\frac{1-p}{p}\Phi\left(r-\sqrt{\frac{Vp}{1-p}}\right)-\frac{\Phi(r)}{p} \notag\\
&\leq\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+\frac{1}{p}\Phi\left(r-\sqrt{\frac{Vp}{1-p}}\right)-\frac{\Phi(r)}{p} \notag\\
&\stackrel{(a)}{=}\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)+\frac{1}{p}\left[\Phi(r)-\phi(\tilde{r})\sqrt{\frac{Vp}{1-p}}-\Phi(r)\right] \notag\\
&=\Phi\left(r+\sqrt{\frac{V(1-p)}{p}}\right)-\phi(\tilde{r})\sqrt{\frac{V}{p(1-p)}}. \tag{20}
\end{align}
In equality $(a)$, we have $r-\sqrt{\frac{Vp}{1-p}}<\tilde{r}<r$ by the mean value theorem. For sufficiently small $p>0$, the expression in (20) is negative: the first term is at most $1$, while $\tilde{r}\to r$ as $p\to 0$, so $\phi(\tilde{r})\to\phi(r)>0$ and $\phi(\tilde{r})\sqrt{\frac{V}{p(1-p)}}\to\infty$. Since $f(0)=0$, we have $f(p)<0$ for some $p>0$, which contradicts (19).
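The conclusion that $f(p)<0$ for some small $p>0$ can be sanity-checked numerically. The sketch below is only an illustration of the argument, not a substitute for it. The helper names `Phi`, `f` and `first_negative_p`, the grid of $p$ values, and the test points $(r,V)$ are all assumptions made here.

```python
import math


def Phi(x):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def f(p, r, V):
    """The function f(p) from the proof, evaluated for 0 < p < 1."""
    upper = r + math.sqrt(V * (1.0 - p) / p)   # mass point with probability p
    lower = r - math.sqrt(V * p / (1.0 - p))   # mass point with probability 1 - p
    return p * Phi(upper) + (1.0 - p) * Phi(lower) - Phi(r)


def first_negative_p(r, V, grid=None):
    """Return the first p on a decreasing grid with f(p) < 0, or None."""
    if grid is None:
        grid = [10.0 ** (-k) for k in range(1, 9)]
    for p in grid:
        if f(p, r, V) < 0.0:
            return p
    return None


if __name__ == "__main__":
    for r in (-2.0, 0.0, 2.0):
        p = first_negative_p(r, V=1.0)
        print(r, p, None if p is None else f(p, r, 1.0))
```

For strongly negative $r$ the two terms nearly cancel in floating point, so this check is only meaningful for moderate values of $r$.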
V Proof of Theorem 2
For any $t\in\mathcal{P}_{n}(\mathcal{A})$, define
$$d_{W}(t):=\inf_{P\in\Pi_{W,\Gamma}^{*}}||t-P||_{1}.$$
For any $0<\gamma\leq\frac{V(\Gamma)}{4|\mathcal{A}|\nu_{\max}}$, define
$$\begin{split}\mathcal{P}_{n}^{\gamma}&=\left\{x^{n}\in\mathcal{A}^{n}:c(x^{n})\leq\Gamma\text{ and }d_{W}(t(x^{n}))>\gamma\right\}\\
\mathcal{P}_{n}^{\gamma,c}&=\left\{x^{n}\in\mathcal{A}^{n}:c(x^{n})\leq\Gamma\text{ and }d_{W}(t(x^{n}))\leq\gamma\right\}.\end{split} \tag{21}$$
Definition 9
For any distribution $P\in\mathcal{P}(\mathcal{A})$ and $S\subset\mathcal{A}$ such that $P(S)>0$, define the probability measure
$$P|_{S}(x)=\begin{cases}\frac{P(x)}{P(S)}&x\in S\\
0&\text{otherwise}.\end{cases}$$
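As a concrete reading of Definition 9, the restriction $P|_{S}$ simply renormalises $P$ on $S$. A minimal sketch follows, assuming distributions are stored as Python dictionaries mapping symbols to probabilities; the function name `restrict` is hypothetical.

```python
def restrict(P, S):
    """Return P|_S as a dict, for a distribution P given as {symbol: probability}.

    Raises ValueError when P(S) = 0, since the restriction is then undefined.
    """
    mass = sum(P.get(x, 0.0) for x in S)     # P(S)
    if mass <= 0.0:
        raise ValueError("P(S) must be positive")
    # Renormalise inside S, assign zero probability outside S.
    return {x: (P[x] / mass if x in S else 0.0) for x in P}


if __name__ == "__main__":
    P = {"a": 0.5, "b": 0.25, "c": 0.25}
    print(restrict(P, {"a", "b"}))
```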
Definition 10
For any $k\geq 0$ and any $x^{k}\in\mathcal{A}^{k}$, let
$$\mathcal{A}_{x^{k}}=\left\{x\in\mathcal{A}:(x^{k},x)\text{ is a prefix of some }x^{n}\in\mathcal{P}_{n}^{\gamma,c}\right\}.$$
Fix $(a_{0},a_{1},\ldots,a_{n-1})\in\mathcal{P}_{n}^{\gamma,c}$ arbitrarily and let $\mathds{1}_{a_{i}}\in\mathcal{P}(\mathcal{A})$ denote a single point-mass distribution at $a_{i}$. Then for any $0\leq k\leq n-1$, $x^{k}\in\mathcal{A}^{k}$, $y^{k}\in\mathcal{B}^{k}$ and a controller $F$ satisfying (15), we define the controller $F_{\gamma}$ as
$$F_{\gamma}(x^{k},y^{k}):=\begin{cases}F(x^{k},y^{k})|_{\mathcal{A}_{x^{k}}}&\text{ if }F(\mathcal{A}_{x^{k}}|x^{k},y^{k})>0\\
\text{Unif}(\mathcal{A}_{x^{k}})&\text{ if }F(\mathcal{A}_{x^{k}}|x^{k},y^{k})=0\text{ and }|\mathcal{A}_{x^{k}}|\neq 0\\
\mathds{1}_{a_{k}}&\text{ otherwise}.\end{cases}$$
Definition 11
For any type $t\in\mathcal{P}_{n}(\mathcal{A})$ such that $T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}$, $k\geq 0$ and any $x^{k}\in\mathcal{A}^{k}$, let
$$\mathcal{A}_{x^{k}}^{t}=\left\{x\in\mathcal{A}:(x^{k},x)\text{ is a prefix of some }x^{n}\in T^{n}_{\mathcal{A}}(t)\right\}.$$
Fix $(a_{0},a_{1},\ldots,a_{n-1})\in T^{n}_{\mathcal{A}}(t)$ arbitrarily and let $\mathds{1}_{a_{i}}\in\mathcal{P}(\mathcal{A})$ denote a single point-mass distribution at $a_{i}$. Then for any $0\leq k\leq n-1$, $x^{k}\in\mathcal{A}^{k}$, $y^{k}\in\mathcal{B}^{k}$ and a controller $F$ satisfying (15), we define the controller $F_{t}$ as
$$F_{t}(x^{k},y^{k}):=\begin{cases}F(x^{k},y^{k})|_{\mathcal{A}_{x^{k}}^{t}}&\text{ if }F(\mathcal{A}^{t}_{x^{k}}|x^{k},y^{k})>0\\
\text{Unif}(\mathcal{A}^{t}_{x^{k}})&\text{ if }F(\mathcal{A}^{t}_{x^{k}}|x^{k},y^{k})=0\text{ and }|\mathcal{A}^{t}_{x^{k}}|\neq 0\\
\mathds{1}_{a_{k}}&\text{ otherwise}.\end{cases}$$
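For a type class, membership in $\mathcal{A}^{t}_{x^{k}}$ reduces to a counting check: $(x^{k},x)$ extends to a sequence of type $t$ exactly when no symbol is used more often than $nt(a)$ times. The sketch below illustrates one step of a controller of the form $F_{t}$ under that observation. The function names, the dictionary representation of distributions, and the `fallback` argument (playing the role of $a_{k}$) are assumptions made for illustration, and the second case is read as requiring the feasible set to be nonempty.

```python
from collections import Counter


def feasible_next_symbols(prefix, type_counts):
    """Symbols x such that prefix + (x,) is a prefix of some sequence of the type.

    type_counts maps each symbol a to n * t(a), assumed a nonnegative integer.
    """
    used = Counter(prefix)
    # A prefix that already overuses some symbol cannot be extended at all.
    if any(used[a] > type_counts.get(a, 0) for a in used):
        return set()
    return {a for a, c in type_counts.items() if used[a] < c}


def restricted_controller_step(F_dist, prefix, type_counts, fallback):
    """One step of the restricted controller.

    F_dist is the original controller's distribution F(.|x^k, y^k) as a dict.
    """
    allowed = feasible_next_symbols(prefix, type_counts)
    mass = sum(F_dist.get(a, 0.0) for a in allowed)
    if mass > 0.0:            # first case: renormalise F on the feasible set
        return {a: F_dist.get(a, 0.0) / mass for a in allowed}
    if allowed:               # second case: uniform on the feasible set
        return {a: 1.0 / len(allowed) for a in allowed}
    return {fallback: 1.0}    # third case: point mass at the fixed symbol


if __name__ == "__main__":
    counts = {0: 2, 1: 1}     # type of a length-3 binary sequence with two 0s
    print(restricted_controller_step({0: 1.0}, (0, 0), counts, fallback=0))
```

By construction, every sequence generated this way from the empty prefix stays inside the type class, which is the property the restricted controller is meant to enforce.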
Now let
$$q(y^{n})=\frac{1}{2}\prod_{i=1}^{n}Q^{*}(y_{i})+\frac{1}{2}\frac{1}{|\mathcal{P}_{n}(\mathcal{A})|}\sum_{t\in\mathcal{P}_{n}(\mathcal{A})}\prod_{i=1}^{n}q_{t}(y_{i}), \tag{23}$$
where
$$q_{t}(b):=\sum_{a\in\mathcal{A}}t(a)W(b|a).$$
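To make the mixture in (23) concrete, the following sketch enumerates all types for a small blocklength and evaluates $q(y^{n})$ directly. It is an illustration under stated assumptions rather than anything used in the proof: symbols are indexed by integers, the channel $W$ is a list of rows `W[a][b]`, `Q_star` stands in for $Q^{*}$ and is supplied by the caller, and all function names are hypothetical.

```python
def types(n, alphabet_size):
    """Yield every type as a tuple of counts (n * t(a)) that sums to n."""
    if alphabet_size == 1:
        yield (n,)
        return
    for c in range(n + 1):
        for rest in types(n - c, alphabet_size - 1):
            yield (c,) + rest


def output_dist(counts, W):
    """q_t(b) = sum_a t(a) W(b|a), with t(a) = counts[a] / n."""
    n = sum(counts)
    return [sum(counts[a] / n * W[a][b] for a in range(len(W)))
            for b in range(len(W[0]))]


def mixture_q(y, Q_star, W):
    """Evaluate q(y^n): half i.i.d. Q*, half a uniform mixture over all types."""
    n = len(y)
    iid = 1.0
    for b in y:
        iid *= Q_star[b]
    all_types = list(types(n, len(W)))
    mix = 0.0
    for counts in all_types:
        qt = output_dist(counts, W)
        prod = 1.0
        for b in y:
            prod *= qt[b]
        mix += prod
    return 0.5 * iid + 0.5 * mix / len(all_types)


if __name__ == "__main__":
    W = [[0.9, 0.1], [0.1, 0.9]]            # illustrative binary symmetric channel
    print(mixture_q((0, 0), [0.5, 0.5], W))
```

Because each component of the mixture is a product distribution on $\mathcal{B}^{n}$, the values of `mixture_q` sum to one over all output sequences.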
Let $P$ denote the distribution $F\circ W$. Let $P_{\gamma}$ denote the distribution $F_{\gamma}\circ W$. Let $P_{t}$ denote the distribution $F_{t}\circ W$ for each $t\in\mathcal{P}_{n}(\mathcal{A})$ such that $T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}$. Note that all controllers $F$, $F_{\gamma}$ and $F_{t}$ satisfy (15). We have
\begin{align}
P\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)
&=P\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\cap d_{W}(t(X^{n}))\leq\gamma\right)+P\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\cap d_{W}(t(X^{n}))>\gamma\right)\notag\\
&=P\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\cap d_{W}(t(X^{n}))\leq\gamma\right)+\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\cap t(X^{n})=t\right)\notag\\
&\stackrel{(a)}{\leq}P_{\gamma}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)+\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right).\tag{24}
\end{align}
Inequality $(a)$ follows from the following argument. For any $(x^{n},y^{n})$ such that $x^{n}\in\mathcal{P}_{n}^{\gamma,c}$, note that for all $1\leq k\leq n$, $\mathcal{A}_{x^{k-1}}\neq\emptyset$ and $x_{k}\in\mathcal{A}_{x^{k-1}}$ so that
\begin{equation}
F(x_{k}|x^{k-1},y^{k-1})\leq F(\mathcal{A}_{x^{k-1}}|x^{k-1},y^{k-1}).\tag{25}
\end{equation}
Then
\begin{align*}
P_{\gamma}\left((x^{n},y^{n})\right)
&=\prod_{k=1}^{n}F_{\gamma}(x_{k}|x^{k-1},y^{k-1})W(y_{k}|x_{k})\\
&\stackrel{(b)}{\geq}\prod_{k=1}^{n}\frac{F(x_{k}|x^{k-1},y^{k-1})}{F(\mathcal{A}_{x^{k-1}}|x^{k-1},y^{k-1})}W(y_{k}|x_{k})\\
&\geq\prod_{k=1}^{n}F(x_{k}|x^{k-1},y^{k-1})W(y_{k}|x_{k})\\
&=P((x^{n},y^{n})).
\end{align*}
With some abuse of notation, we assume in inequality $(b)$ above that if $F(\mathcal{A}_{x^{k-1}}|x^{k-1},y^{k-1})=0$, then $\frac{F(x_{k}|x^{k-1},y^{k-1})}{F(\mathcal{A}_{x^{k-1}}|x^{k-1},y^{k-1})}=0$, which is justified by (25).
A similar derivation gives
\[
P_{t}((x^{n},y^{n}))\geq P((x^{n},y^{n}))
\]
for all $(x^{n},y^{n})$ such that $c(x^{n})\leq\Gamma$ and $t(x^{n})=t$, where $T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}$.
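The mechanism behind step $(b)$ is that dividing a conditional input distribution by the mass it assigns to the feasible set can only increase the probability of each feasible symbol. The sketch below illustrates this for a single history; the helper `restrict_controller` and the numbers are hypothetical, and the construction of $F_{\gamma}$ itself is not restated in this part of the proof, so this only mirrors the ratio that appears in $(b)$.

```python
import numpy as np

def restrict_controller(F_row, allowed):
    """Renormalize one conditional input distribution onto an allowed set.

    F_row   : shape (|A|,) vector, F(. | x^{k-1}, y^{k-1}) for a fixed history
    allowed : boolean mask marking the feasible symbols A_{x^{k-1}}
    Returns F(a | history) / F(A_{x^{k-1}} | history) on the allowed set and 0
    elsewhere. If the allowed set has zero mass, the ratio is set to 0,
    matching the convention stated in the text.
    """
    F_row = np.asarray(F_row, dtype=float)
    allowed = np.asarray(allowed, dtype=bool)
    mass = F_row[allowed].sum()
    ratio = np.zeros_like(F_row)
    if mass > 0:
        ratio[allowed] = F_row[allowed] / mass
    return ratio

# Hypothetical three-symbol alphabet in which the middle symbol is infeasible.
F_row = np.array([0.5, 0.3, 0.2])
allowed = np.array([True, False, True])
ratio = restrict_controller(F_row, allowed)
```

On the allowed symbols the renormalized values dominate the original ones, which is the per-step inequality that the product over $k$ propagates to the joint distribution.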
Let $\rho=\exp\left(nC(\Gamma)+\sqrt{n}r\right)$, where $r$ will be specified later. Define
\begin{align*}
\mathcal{G}_{i}&:=\sigma(X_{1},\ldots,X_{i+1},Y_{1},\ldots,Y_{i})\\
Z_{i}&:=i(X_{i},Y_{i})-\mathbb{E}\left[i(X_{i},Y_{i})|\mathcal{G}_{i-1}\right]\\
\mathcal{F}_{i}&:=\sigma(Z_{1},Z_{2},\ldots,Z_{i}).
\end{align*}
Two things are important to note here. First, by the Markov property $(X^{i-1},Y^{i-1})-X_{i}-Y_{i}$, we have $\mathbb{E}\left[i(X_{i},Y_{i})|\mathcal{G}_{i-1}\right]=\mathbb{E}\left[i(X_{i},Y_{i})|X_{i}\right]$ a.s. Second, $\mathcal{F}_{i}\subset\mathcal{G}_{i}$.
For the first term in (24), we can upper bound it as follows:
\begin{align}
P_{\gamma}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)
&\leq P_{\gamma}\left(\prod_{i=1}^{n}\frac{W(Y_{i}|X_{i})}{Q^{*}(Y_{i})}>\frac{\rho}{2}\right)\notag\\
&=P_{\gamma}\left(\sum_{i=1}^{n}\left[\log\left(\frac{W(Y_{i}|X_{i})}{Q^{*}(Y_{i})}\right)-C(\Gamma)\right]>\sqrt{n}r-\log(2)\right)\notag\\
&\stackrel{(a)}{\leq}P_{\gamma}\left(\sum_{i=1}^{n}\left[i(X_{i},Y_{i})-\mathbb{E}\left[i(X_{i},Y_{i})|\mathcal{G}_{i-1}\right]\right]>\sqrt{n}r-\log(2)\right)\notag\\
&=P_{\gamma}\left(\sum_{i=1}^{n}Z_{i}>\sqrt{n}r-\log(2)\right).\tag{26}
\end{align}
In inequality $(a)$, we used the following lemma and the fact that $c(X^{n})\leq\Gamma$ almost surely.
Lemma 2
For $\Gamma\in(\Gamma_{0},\Gamma^{*})$,
\begin{equation}
\mathbb{E}\left[i(X,Y)|X\right]\leq C(\Gamma)-C^{\prime}(\Gamma)\left(\Gamma-c(X)\right)\tag{27}
\end{equation}
almost surely, where $X$ has an arbitrary distribution and $Y$ is the output of the channel $W$ when $X$ is the input.
Proof: See [24 , Proposition 1] and its references.
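For readers who want the intermediate steps of (26) spelled out, the following is a sketch under two assumptions that are not restated in this part of the proof: that $i(a,b)=\log\frac{W(b|a)}{Q^{*}(b)}$, and that $c(X^{n})$ denotes the time-average cost $\frac{1}{n}\sum_{i=1}^{n}c(X_{i})$ as in (1). The first line uses only the mixture form (23) of $q$ and the memorylessness of $W$; the second uses Lemma 2, the Markov property, and the fact that $C^{\prime}(\Gamma)\geq 0$ because the capacity-cost function is nondecreasing.
\begin{align*}
q(y^{n})\geq\frac{1}{2}\prod_{i=1}^{n}Q^{*}(y_{i})
&\;\Longrightarrow\;\frac{W(y^{n}|x^{n})}{q(y^{n})}\leq 2\prod_{i=1}^{n}\frac{W(y_{i}|x_{i})}{Q^{*}(y_{i})},\\
\sum_{i=1}^{n}\mathbb{E}\left[i(X_{i},Y_{i})|\mathcal{G}_{i-1}\right]
=\sum_{i=1}^{n}\mathbb{E}\left[i(X_{i},Y_{i})|X_{i}\right]
&\leq nC(\Gamma)-C^{\prime}(\Gamma)\left(n\Gamma-\sum_{i=1}^{n}c(X_{i})\right)
\leq nC(\Gamma).
\end{align*}
Consequently, whenever $\sum_{i=1}^{n}\left[i(X_{i},Y_{i})-C(\Gamma)\right]$ exceeds $\sqrt{n}r-\log(2)$, so does $\sum_{i=1}^{n}\left[i(X_{i},Y_{i})-\mathbb{E}\left[i(X_{i},Y_{i})|\mathcal{G}_{i-1}\right]\right]$, which is the event inclusion behind inequality $(a)$ in (26).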
We will now apply a martingale central limit theorem [25, Corollary to Theorem 2] to the expression in (26). We first verify that the hypotheses of [25, Corollary to Theorem 2] are satisfied:
1. First, we require that
\[
\max_{1\leq k\leq n}|Z_{k}|<\infty.
\]
Since $Q^{*}(b)>0$ for all $b\in\mathcal{B}$ by assumption and $W(Y_{k}|X_{k})>0$ almost surely for each channel input and output pair $(X_{k},Y_{k})$, we have
\begin{align*}
|Z_{k}|&\leq\max_{a\in\mathcal{A},b\in\mathcal{B}:W(b|a)>0}2\,|i(a,b)|\\
&:=2i_{\max}<\infty
\end{align*}
for all $1\leq k\leq n$.
2. Second, we require that
\[
\mathbb{E}\left[Z_{k}|\mathcal{F}_{k-1}\right]=0
\]
almost surely for all $1\leq k\leq n$ [25, p. 672]. This is true because $\mathbb{E}\left[Z_{k}|\mathcal{G}_{k-1}\right]=0$ and $\mathcal{F}_{k-1}\subset\mathcal{G}_{k-1}$, so the tower property gives
\begin{align*}
\mathbb{E}\left[Z_{k}|\mathcal{F}_{k-1}\right]
&=\mathbb{E}\left[\mathbb{E}\left[Z_{k}|\mathcal{G}_{k-1}\right]|\mathcal{F}_{k-1}\right]\\
&=0.
\end{align*}
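The two hypotheses above can be sanity-checked numerically on a toy channel. The sketch below assumes the information density has the form $i(a,b)=\log\frac{W(b|a)}{Q(b)}$ for a reference output distribution $Q$ (a stand-in for $Q^{*}$, which is not computed here); the channel, the distribution `Q_toy` and the function names are hypothetical. It tabulates $Z$ as a function of $(a,b)$, checks that it has zero mean under $W(\cdot|a)$ for every input $a$, and checks the bound $|Z|\leq 2i_{\max}$.

```python
import numpy as np

def information_density(W, Q):
    """i(a, b) = log(W(b|a) / Q(b)); entries with W(b|a) = 0 become -inf."""
    W = np.asarray(W, dtype=float)
    Q = np.asarray(Q, dtype=float)
    with np.errstate(divide="ignore"):
        return np.log(W) - np.log(Q)[None, :]

def centered_density(W, Q):
    """Return (Z, i_max) with Z[a, b] = i(a, b) - E[i(X, Y) | X = a].

    Pairs with W(b|a) = 0 have probability zero, so Z is set to 0 there.
    """
    W = np.asarray(W, dtype=float)
    support = W > 0
    dens = np.where(support, information_density(W, Q), 0.0)
    cond_mean = (W * dens).sum(axis=1)      # E[i(X, Y) | X = a]
    Z = np.where(support, dens - cond_mean[:, None], 0.0)
    i_max = np.abs(dens[support]).max()
    return Z, i_max

# Hypothetical binary-input, binary-output channel and reference distribution.
W_toy = np.array([[0.8, 0.2],
                  [0.3, 0.7]])
Q_toy = np.array([0.5, 0.5])
Z, i_max = centered_density(W_toy, Q_toy)
cond_mean_of_Z = (W_toy * Z).sum(axis=1)    # should vanish for every input a
```

Because the centering is done per input symbol, the zero-mean property holds for any choice of `Q_toy`; only the value of `i_max` depends on it.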
Under the above two conditions, it follows from [25, Corollary to Theorem 2] that there exists a constant $\kappa>0$ depending only on $i_{\max}$ such that for any $s\in\mathbb{R}$,
\begin{equation}
P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}\leq s\right)\geq\Phi\left(s\right)-\kappa\left[\frac{n\log n}{\left(\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]\right)^{\frac{3}{2}}}+\left\|\frac{\sum_{i=1}^{n}\mathbb{E}_{\gamma}[Z_{i}^{2}|\mathcal{F}_{i-1}]}{\sum_{i=1}^{n}\mathbb{E}_{\gamma}[Z_{i}^{2}]}-1\right\|_{\infty}^{1/2}\right].\tag{28}
\end{equation}
Using Lemma 3 in (28), we obtain
\begin{align}
P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}\leq s\right)
&\geq\Phi\left(s\right)-\kappa\left[\frac{n\log n}{\left(nV(\Gamma)-n|\mathcal{A}|\gamma\nu_{\max}\right)^{\frac{3}{2}}}+\left(\frac{4|\mathcal{A}|\gamma\nu_{\max}}{V(\Gamma)}\right)^{1/2}\right]\notag\\
&\geq\Phi(s)-\beta_{\gamma},\tag{29}
\end{align}
where the last inequality holds for sufficiently large $n$ for some constant $\beta_{\gamma}>0$ which can be chosen such that $\beta_{\gamma}\to 0$ as $\gamma\to 0$.
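To see why a constant $\beta_{\gamma}$ with $\beta_{\gamma}\to 0$ as $\gamma\to 0$ exists, it may help to evaluate the bracketed penalty in (29) numerically. The sketch below does so for placeholder values; the paper does not give numbers for $\kappa$, $V(\Gamma)$ or $\nu_{\max}$, so every numeric input here is an assumption chosen only for illustration.

```python
import math

def clt_error_bound(n, gamma, V, nu_max, alphabet_size, kappa=1.0):
    """Evaluate kappa times the bracketed penalty on the right side of (29).

    n             : blocklength
    gamma         : the slack parameter gamma
    V             : placeholder for V(Gamma)
    nu_max        : placeholder for nu_max
    alphabet_size : |A|
    kappa         : placeholder for the constant kappa
    """
    slack = alphabet_size * gamma * nu_max
    if V - slack <= 0:
        raise ValueError("need V > |A| * gamma * nu_max for the bound to be defined")
    first = n * math.log(n) / (n * (V - slack)) ** 1.5   # vanishes as n grows
    second = math.sqrt(4.0 * slack / V)                  # depends on gamma only
    return kappa * (first + second)

# Placeholder parameters: V = 1, nu_max = 1, |A| = 2, gamma = 1e-3.
bounds = [clt_error_bound(n, 1e-3, 1.0, 1.0, 2) for n in (10**2, 10**4, 10**6)]
```

The first term behaves like $\log n/\sqrt{n}$ and disappears for large $n$, so what remains is the second term, which scales like $\sqrt{\gamma}$ and can therefore be made as small as desired by shrinking $\gamma$.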
Lemma 3
We have
\[
V(\Gamma)-2\gamma\nu_{\max}\leq\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]\leq V(\Gamma)+2\gamma\nu_{\max}.
\]
Furthermore, for $\gamma\leq\frac{V(\Gamma)}{4\nu_{\max}}$,
\[
\left\|\frac{\sum_{i=1}^{n}\mathbb{E}_{\gamma}[Z_{i}^{2}|\mathcal{F}_{i-1}]}{\sum_{i=1}^{n}\mathbb{E}_{\gamma}[Z_{i}^{2}]}-1\right\|_{\infty}\leq\frac{8\gamma\nu_{\max}}{V(\Gamma)}
\]
almost surely according to the probability measure $P_{\gamma}$.
Proof: The proof of Lemma 3 is given in Appendix A.
Using the result in (29) and Lemma 3 in the expression in (26), we obtain
$$
\begin{aligned}
&P_{\gamma}\left(\sum_{i=1}^{n}Z_{i}>\sqrt{n}r-\log(2)\right)\\
&=P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}>\frac{\sqrt{n}r-\log(2)}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\right)\\
&=1-P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}\leq\frac{\sqrt{n}r-\log(2)}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\right)\\
&\leq\begin{cases}
1-P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}\leq\frac{r-\frac{\log(2)}{\sqrt{n}}}{\sqrt{V(\Gamma)+|\mathcal{A}|\gamma\nu_{\max}}}\right)&\text{ if }r\geq\frac{\log 2}{\sqrt{n}}\\
1-P_{\gamma}\left(\frac{1}{\sqrt{\sum_{i=1}^{n}\mathbb{E}_{\gamma}\left[Z_{i}^{2}\right]}}\sum_{i=1}^{n}Z_{i}\leq\frac{r-\frac{\log(2)}{\sqrt{n}}}{\sqrt{V(\Gamma)-|\mathcal{A}|\gamma\nu_{\max}}}\right)&\text{ if }r<\frac{\log 2}{\sqrt{n}}
\end{cases}\\
&\leq\begin{cases}
1-\Phi\left(\frac{r-\frac{\log(2)}{\sqrt{n}}}{\sqrt{V(\Gamma)+|\mathcal{A}|\gamma\nu_{\max}}}\right)+\beta_{\gamma}&\text{ if }r\geq\frac{\log 2}{\sqrt{n}}\\
1-\Phi\left(\frac{r-\frac{\log(2)}{\sqrt{n}}}{\sqrt{V(\Gamma)-|\mathcal{A}|\gamma\nu_{\max}}}\right)+\beta_{\gamma}&\text{ if }r<\frac{\log 2}{\sqrt{n}}.
\end{cases}
\end{aligned}
\tag{30}
$$
Let
$$
r=\begin{cases}
\sqrt{V(\Gamma)+|\mathcal{A}|\gamma\nu_{\max}}\,\Phi^{-1}\left(\epsilon+3\beta_{\gamma}\right)+\frac{\log 2}{\sqrt{n}}&\text{ if }\epsilon\in\left[\frac{1}{2}-3\beta_{\gamma},1\right)\\
\sqrt{V(\Gamma)-|\mathcal{A}|\gamma\nu_{\max}}\,\Phi^{-1}\left(\epsilon+3\beta_{\gamma}\right)+\frac{\log 2}{\sqrt{n}}&\text{ if }\epsilon\in\left(0,\frac{1}{2}-3\beta_{\gamma}\right).
\end{cases}
\tag{31}
$$
Note that the upper bound in (30) holds for any $r$. For a given error probability $\epsilon\in(0,1)$, we choose $r$ according to (31). Then using (31) in (30), we obtain for any given $\epsilon\in(0,1)$ that
$$P_{\gamma}\left(\sum_{i=1}^{n}Z_{i}>\sqrt{n}r-\log(2)\right)\leq 1-\epsilon-2\beta_{\gamma}. \tag{32}$$
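The substitution of (31) into (30) can be checked numerically. The following is a minimal sketch, not part of the original argument: the values of $V(\Gamma)$, $\beta_{\gamma}$, $n$, and the quantity `slack` (standing in for $|\mathcal{A}|\gamma\nu_{\max}$) are hypothetical, and the function names are illustrative only.

```python
from math import log, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}

def r_from_31(eps, beta, V, slack, n):
    """r as chosen in (31); `slack` stands for |A| * gamma * nu_max."""
    sign = 1.0 if eps >= 0.5 - 3 * beta else -1.0
    return sqrt(V + sign * slack) * _STD.inv_cdf(eps + 3 * beta) + log(2) / sqrt(n)

def bound_30(r, beta, V, slack, n):
    """Right-hand side of (30) evaluated at a given r."""
    shifted = r - log(2) / sqrt(n)
    sign = 1.0 if shifted >= 0 else -1.0
    return 1.0 - _STD.cdf(shifted / sqrt(V + sign * slack)) + beta

# Hypothetical parameter values, chosen only for illustration.
V, slack, beta, n = 0.8, 0.05, 0.01, 10_000
for eps in (0.1, 0.3, 0.49, 0.6, 0.9):
    r = r_from_31(eps, beta, V, slack, n)
    # The two printed numbers should agree up to floating-point error.
    print(eps, bound_30(r, beta, V, slack, n), 1 - eps - 2 * beta)
```

The case split in (31) mirrors the one in (30): $\Phi^{-1}(\epsilon+3\beta_{\gamma})\geq 0$ exactly when $\epsilon\geq\frac{1}{2}-3\beta_{\gamma}$, which is the condition $r\geq\frac{\log 2}{\sqrt{n}}$.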
Inequality (32) provides an upper bound on the first term in (24).
We now upper bound the second term in (24).
Using again the choice of $q$ in (23), we have
$$
\begin{aligned}
&\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)\\
&\leq\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\frac{W(Y^{n}|X^{n})}{\prod_{i=1}^{n}q_{t}(y_{i})}>\frac{\rho}{2|\mathcal{P}_{n}(\mathcal{A})|}\right)\\
&\leq\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\sum_{i=1}^{n}\log\frac{W(Y_{i}|X_{i})}{q_{t}(Y_{i})}>nC(\Gamma)+\sqrt{n}r-\log\left(2(n+1)^{|\mathcal{A}|}\right)\right)\\
&\stackrel{(a)}{=}\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}W\left(\sum_{i=1}^{n}\log\frac{W(Y_{i}|x_{t,i})}{q_{t}(Y_{i})}>nC(\Gamma)+\sqrt{n}r-\log\left(2(n+1)^{|\mathcal{A}|}\right)\right),
\end{aligned}
\tag{33}
$$
where in equality $(a)$, $(x_{t,1},\ldots,x_{t,n})$ is an arbitrary sequence from the type class $T^{n}_{\mathcal{A}}(t)$. Equality $(a)$ holds because under the probability measure $P_{t}$, $t(X^{n})=t$ a.s. (see Remark 2) and the distribution of $\sum_{i=1}^{n}\log\frac{W(Y_{i}|X_{i})}{q_{t}(Y_{i})}$ depends on $X^{n}$ only through its type.
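The type-only dependence behind equality $(a)$ can be illustrated by brute force on a toy example. This is a sketch under stated assumptions: the channel matrix `W`, the two input sequences, and the threshold are hypothetical, and `tail_prob` is an illustrative helper rather than anything defined in the paper. It enumerates all output sequences and compares the tail probability of the log-likelihood-ratio sum for two inputs with the same type, using $q_{t}(b)=\sum_{a}t(a)W(b|a)$.

```python
from itertools import product
from math import log

def tail_prob(x, W, q, threshold):
    """P(sum_i log W(Y_i|x_i)/q(Y_i) > threshold) with Y_i ~ W(.|x_i) independent."""
    total_prob = 0.0
    for y in product(range(len(q)), repeat=len(x)):
        prob, llr = 1.0, 0.0
        for xi, yi in zip(x, y):
            prob *= W[xi][yi]
            if W[xi][yi] > 0:
                llr += log(W[xi][yi] / q[yi])
        if prob > 0 and llr > threshold:
            total_prob += prob
    return total_prob

# Hypothetical binary channel: rows are inputs a, columns are outputs b.
W = [[0.9, 0.1],
     [0.2, 0.8]]
x1 = (0, 0, 1, 1)  # two sequences with the same type (1/2, 1/2)
x2 = (1, 0, 1, 0)
t = [x1.count(a) / len(x1) for a in range(len(W))]
q = [sum(t[a] * W[a][b] for a in range(len(W))) for b in range(len(W[0]))]

# Both tail probabilities should coincide up to floating-point error.
print(tail_prob(x1, W, q, 0.3), tail_prob(x2, W, q, 0.3))
```

Permuting the input sequence only permutes the independent summands, so the law of the sum is unchanged; the enumeration simply makes that visible for one small case.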
Continuing from (33), we have
$$
\begin{aligned}
&\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)\\
&\leq\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}W\Bigg(\sum_{i=1}^{n}\left[\log\frac{W(Y_{i}|x_{t,i})}{q_{t}(Y_{i})}-\mathbb{E}_{W}\left[\log\frac{W(Y|x_{t,i})}{q_{t}(Y)}\right]\right]>\\
&\qquad\qquad\qquad n\left[C(\Gamma)-I(t,W)\right]+\sqrt{n}r-\log\left(2(n+1)^{|\mathcal{A}|}\right)\Bigg)\\
&\leq\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}W\left(\sum_{i=1}^{n}\left[\log\frac{W(Y_{i}|x_{t,i})}{q_{t}(Y_{i})}-\mathbb{E}_{W}\left[\log\frac{W(Y|x_{t,i})}{q_{t}(Y)}\right]\right]>n\frac{K}{2}\right)
\end{aligned}
\tag{34}
$$
where the last inequality holds for sufficiently large $n$ because $r$, as defined in (31), is an $O(1)$ term, and from the construction of the set $\mathcal{P}_{n}^{\gamma}$, we have
$$\inf_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}\,d_{W}(t)\geq\gamma>0$$
which implies
$$\inf_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}\left[C(\Gamma)-I(t,W)\right]>K$$
for some constant $K>0$.
Let $i_{\max,t}:=\max_{a,b:q_{t}(b)W(b|a)>0}\big|\log\frac{W(b|a)}{q_{t}(b)}\big|$. We now show that $i_{\max,t}\leq 2\log n$ for all $t$. Let $W_{\min}:=\min_{a,b:W(b|a)>0}W(b|a)$ and $q_{\min,t}:=\min_{b:q_{t}(b)>0}q_{t}(b)$. Then
$$
\begin{aligned}
q_{\min,t}&=\min_{b:q_{t}(b)>0}\sum_{a\in\mathcal{A}}t(a)W(b|a)\\
&\geq\min_{a,b:W(b|a)>0}W(b|a)\min_{a:t(a)>0}t(a)\\
&\geq\frac{W_{\min}}{n}.
\end{aligned}
$$
Thus,
$$
\begin{aligned}
i_{\max,t}&=\max_{a,b:q_{t}(b)W(b|a)>0}\Big|\log\frac{W(b|a)}{q_{t}(b)}\Big|\\
&\leq\max_{a,b:q_{t}(b)W(b|a)>0}\big|\log W(b|a)\big|+\max_{b:q_{t}(b)>0}\big|\log q_{t}(b)\big|\\
&\leq\log\frac{n}{W_{\min}^{2}}\\
&\leq 2\log n
\end{aligned}
$$
for all sufficiently large $n$.
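The chain $i_{\max,t}\leq\log\frac{n}{W_{\min}^{2}}\leq 2\log n$ can be spot-checked by enumerating every type of a given denominator. The sketch below assumes a small hypothetical channel matrix `W` and blocklength `n`; the helper names `i_max` and `types` are illustrative and do not come from the paper.

```python
from itertools import product
from math import log

def i_max(t, W):
    """max |log W(b|a)/q_t(b)| over pairs with q_t(b) W(b|a) > 0, where q_t = t W."""
    n_in, n_out = len(W), len(W[0])
    q = [sum(t[a] * W[a][b] for a in range(n_in)) for b in range(n_out)]
    return max(abs(log(W[a][b] / q[b]))
               for a in range(n_in) for b in range(n_out)
               if q[b] > 0 and W[a][b] > 0)

def types(n, k):
    """All types (empirical distributions) with denominator n on k symbols."""
    for counts in product(range(n + 1), repeat=k):
        if sum(counts) == n:
            yield tuple(c / n for c in counts)

# Hypothetical channel: rows are inputs a, columns are outputs b.
W = [[0.9, 0.1, 0.0],
     [0.0, 0.2, 0.8]]
W_min = min(w for row in W for w in row if w > 0)
n = 200  # the last step of the chain needs n >= 1 / W_min**2 (here 100)

worst = max(i_max(t, W) for t in types(n, len(W)))
print(worst, log(n / W_min**2), 2 * log(n))
```

The requirement $n\geq 1/W_{\min}^{2}$ is what "sufficiently large $n$" amounts to for the final inequality in this example.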
Hence, we can use Azuma's inequality [26, (33), p. 61] to upper bound (34), giving us
$$\sum_{t:T^{n}_{\mathcal{A}}(t)\subset\mathcal{P}_{n}^{\gamma}}P_{t}\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\rho\right)\leq(n+1)^{|\mathcal{A}|}\exp\left(-\frac{nK^{2}}{128\log^{2}n}\right) \tag{35}$$
which goes to zero as n → ∞ → 𝑛 n\to\infty italic_n → ∞ .
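To get a feel for how the right-hand side of (35) vanishes, the following sketch evaluates its natural logarithm for a few blocklengths. The values `ALPHABET_SIZE = 2` and `K = 1.0` are hypothetical placeholders chosen only for illustration (the constant $K$ is fixed earlier in the proof), and natural logarithms are assumed throughout.

```python
import math


def azuma_bound_log(n: int, alphabet_size: int, K: float) -> float:
    """Natural log of (n+1)^{|A|} * exp(-n K^2 / (128 log^2 n)).

    Working in the log domain avoids overflow/underflow for large n.
    """
    polynomial_part = alphabet_size * math.log(n + 1)
    exponential_part = n * K**2 / (128 * math.log(n) ** 2)
    return polynomial_part - exponential_part


# Hypothetical illustration values; they are NOT taken from the paper.
ALPHABET_SIZE = 2
K = 1.0

for n in (10**3, 10**5, 10**7, 10**9):
    print(f"n = {n:>10d}   log(bound) = {azuma_bound_log(n, ALPHABET_SIZE, K):.2f}")
```

With these placeholder values the bound exceeds one at moderate $n$ and becomes small only once $nK^{2}/(128\log^{2}n)$ outweighs $|\mathcal{A}|\log(n+1)$, which is consistent with (35) being used purely as an asymptotic statement.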
Substituting the upper bounds (32) and (35) in (24), we obtain
$$(F\circ W)\left(\frac{W(Y^{n}|X^{n})}{q(Y^{n})}>\exp\left(nC(\Gamma)+\sqrt{n}\,r\right)\right) \leq 1-\epsilon-\beta_{\gamma}$$
for sufficiently large $n$. Since the controller $F$ was arbitrary, we can apply Lemma 1 to obtain
\begin{align*}
\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma) &\leq nC(\Gamma)+\sqrt{n}\,r-\log\beta_{\gamma} \\
\frac{\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma)-nC(\Gamma)}{\sqrt{n}} &\leq r-\frac{\log\beta_{\gamma}}{\sqrt{n}} \\
\limsup_{n\to\infty}\frac{\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma)-nC(\Gamma)}{\sqrt{n}} &\leq r',
\end{align*}
where $r'$ is obtained from the expression of $r$ in (31) after taking the limit as $n\to\infty$, i.e.,
$$r'=\begin{cases}\sqrt{V(\Gamma)+|\mathcal{A}|\gamma\nu_{\max}}\,\Phi^{-1}\left(\epsilon+3\beta_{\gamma}\right) & \text{if } \epsilon\in\left[\frac{1}{2}-3\beta_{\gamma},1\right) \\ \sqrt{V(\Gamma)-|\mathcal{A}|\gamma\nu_{\max}}\,\Phi^{-1}\left(\epsilon+3\beta_{\gamma}\right) & \text{if } \epsilon\in\left(0,\frac{1}{2}-3\beta_{\gamma}\right).\end{cases}$$
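The last inequality in the chain above can be checked term by term. A short sketch, assuming (as the argument indicates) that $\beta_{\gamma}>0$ is a constant that does not depend on $n$ and that $r\to r'$ as $n\to\infty$:
$$\limsup_{n\to\infty}\left(r-\frac{\log\beta_{\gamma}}{\sqrt{n}}\right)=\lim_{n\to\infty}r-\log\beta_{\gamma}\,\lim_{n\to\infty}\frac{1}{\sqrt{n}}=r'.$$
Under these assumptions the $-\log\beta_{\gamma}$ penalty from Lemma 1 is $O(1)$ in $n$ and therefore does not contribute at the $\sqrt{n}$ scale.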
Finally, since $\frac{V(\Gamma)}{4|\mathcal{A}|\nu_{\max}}>\gamma>0$ was arbitrary, we can take $\gamma$ and $\beta_{\gamma}$ arbitrarily small, giving us the converse result
$$\limsup_{n\to\infty}\frac{\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma)-nC(\Gamma)}{\sqrt{n}}\leq\sqrt{V(\Gamma)}\,\Phi^{-1}(\epsilon).$$
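The limiting step $\gamma\to 0$ can be illustrated numerically. The sketch below evaluates the two-case expression for $r'$ and compares it with $\sqrt{V(\Gamma)}\,\Phi^{-1}(\epsilon)$. The numbers `V = 1.0`, `ALPHABET_SIZE = 2`, `NU_MAX = 1.0` are hypothetical, and the choice `beta = gamma` is only a stand-in for $\beta_{\gamma}$, whose actual definition appears earlier in the proof; all that is used here is that it can be made small together with $\gamma$.

```python
import math
from statistics import NormalDist


def r_prime(eps, gamma, beta, V, alphabet_size, nu_max):
    """Two-case expression for r'; requires 0 < eps + 3*beta < 1."""
    slack = alphabet_size * gamma * nu_max
    quantile = NormalDist().inv_cdf(eps + 3 * beta)
    if eps >= 0.5 - 3 * beta:
        # Non-negative quantile: the larger variance gives the upper bound.
        return math.sqrt(V + slack) * quantile
    # Negative quantile: the smaller variance gives the upper bound.
    return math.sqrt(V - slack) * quantile


def socr_limit(eps, V):
    """sqrt(V) * Phi^{-1}(eps), the claimed limit of r' as gamma -> 0."""
    return math.sqrt(V) * NormalDist().inv_cdf(eps)


# Hypothetical illustration values; they are NOT taken from the paper.
V, ALPHABET_SIZE, NU_MAX = 1.0, 2, 1.0

for eps in (0.1, 0.9):
    for gamma in (1e-2, 1e-4, 1e-6):  # all below V / (4 * |A| * nu_max)
        beta = gamma  # ASSUMPTION: placeholder for the paper's beta_gamma
        value = r_prime(eps, gamma, beta, V, ALPHABET_SIZE, NU_MAX)
        print(f"eps={eps}  gamma={gamma:.0e}  r'={value:.6f}  "
              f"limit={socr_limit(eps, V):.6f}")
```

In both regimes of $\epsilon$ the value of $r'$ sits above $\sqrt{V(\Gamma)}\,\Phi^{-1}(\epsilon)$ and approaches it as $\gamma$ shrinks, which is the behaviour the converse argument relies on.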
Since this matches the optimal non-feedback SOCR of simple-dispersion DMCs with a peak-power cost constraint, we have
$$\lim_{n\to\infty}\frac{\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma)-nC(\Gamma)}{\sqrt{n}}=\sqrt{V(\Gamma)}\,\Phi^{-1}(\epsilon) \tag{36}$$
for simple-dispersion DMCs with a peak-power cost constraint.
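Read as an approximation, (36) suggests $\log M^{*}_{\text{fb}}(n,\epsilon,\Gamma)\approx nC(\Gamma)+\sqrt{nV(\Gamma)}\,\Phi^{-1}(\epsilon)$, with an unspecified $o(\sqrt{n})$ error. The sketch below evaluates this expression; the function name `normal_approx_log_m` and the values of `C_GAMMA`, `V_GAMMA` and `EPS` are hypothetical and serve only to show the $\sqrt{n}$ backoff from $C(\Gamma)$, not to describe any particular channel.

```python
import math
from statistics import NormalDist


def normal_approx_log_m(n, eps, capacity_cost, dispersion):
    """n*C(Gamma) + sqrt(n*V(Gamma)) * Phi^{-1}(eps).

    This drops the o(sqrt(n)) term that (36) leaves unspecified, so it is
    a heuristic at finite n rather than a bound.
    """
    return n * capacity_cost + math.sqrt(n * dispersion) * NormalDist().inv_cdf(eps)


# Hypothetical illustration values; they are NOT taken from the paper.
C_GAMMA = 0.5   # capacity-cost function, nats per channel use
V_GAMMA = 0.25  # dispersion, nats^2 per channel use
EPS = 1e-3      # target average error probability

for n in (100, 1000, 10000):
    rate = normal_approx_log_m(n, EPS, C_GAMMA, V_GAMMA) / n
    print(f"n = {n:>6d}   approximate rate = {rate:.4f} nats/use")
```

For $\epsilon<1/2$ the quantile $\Phi^{-1}(\epsilon)$ is negative, so the approximate rate lies below $C(\Gamma)$ and the gap closes at speed $n^{-1/2}$.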