License: CC BY-NC-ND 4.0
arXiv:2403.00237v2 [stat.ME] 25 Mar 2024

Stable Reduced-Rank VAR Identification

(This work has been submitted to IFAC for possible publication.)

Xinhui Rong and Victor Solo

(The authors are with the School of Electrical Eng. & Telecommunications, UNSW, Sydney, Australia.)

Abstract

The vector autoregression (VAR) has been widely used in system identification, econometrics, natural science, and many other areas. However, when the state dimension becomes large, the parameter dimension explodes, so rank-reduced modelling is attractive and is well developed. A fundamental requirement in almost all applications is stability of the fitted model, but this has not been addressed in the rank-reduced case. Here, we develop, for the first time, a closed-form formula for an estimator of a rank-reduced transition matrix which is guaranteed to be stable. We show that our estimator is consistent and asymptotically statistically efficient, and illustrate it in comparative simulations.

1 Introduction

The vector autoregression (VAR) is widely applied in control, signal processing, time series and econometrics. The lag-one VAR (VAR(1)) is more general than it seems since any higher order VAR can be written as a state space model whose state equation is a VAR(1) [12],[27]. More recently, the VAR(1) has been used in reinforcement learning [23, 1, 17, 19].

High dimensional time series have now become very common across many disciplines, e.g. epidemiology [28], financial economics [20]. So rank reduced modelling has gained increasing attention [20],[4],[5].

In physical systems, stability is crucial. This has led to a continuing development of stability-guaranteed estimators, e.g. for state space modelling. A common approach employs a perturbation minimization procedure, where a preliminary unstable least-squares estimator is stabilized by the smallest additive perturbation. For example, Mari et al. [15] consider a semidefinite programming (SP) problem with a Lyapunov stability constraint. The problem can be solved by standard linear matrix inequality (LMI) methods. However, the computation becomes prohibitive with higher-dimensional states. Miller and de Callafon [16] extend the SP method to include more eigenvalue constraints, besides stability. Boots et al. [2] and the first algorithm in [14] use singular value constraints instead of eigenvalue constraints, which can be too conservative. The second algorithm in [14] uses a line search method based on gradient sampling, and its reliability is sensitive to the user-defined step size parameter. Tanaka and Katayama [24] and Jongeneel et al. [10] project the unstable estimator onto a stable region by solving linear quadratic regulator (LQR) problems. Jongeneel et al. [10] show that the computational efficiency is greatly improved in higher-dimensional applications, and provide error analysis and statistical guarantees.

Other approaches include that of Chui and Maciejowski [6], who iteratively augment the unstable estimator until the modulus of its largest eigenvalues equals a user-defined parameter. However, this method distorts the estimator and introduces additional bias. Van Gestel et al. [26] introduce a regularization term to the least squares problem so that the variance of the estimators is reduced and, at the same time, stability is guaranteed. However, the regularization is too conservative. Umenberger et al. [25] consider state space modelling and take a maximum likelihood approach to guarantee stability. However, their method is computationally expensive.

In recent work, we [21] solved a state space estimation problem by using a Burg-type forwards-backwards (FB) optimization [3, 22, 18] on the error residuals and, remarkably, found a closed-form stable solution that is computationally cheap and involves no tuning parameters.

Among the above works, [10] deal with pure VAR(1) and the others consider the state space setup where subspace identification methods are needed before fitting a VAR(1) model [27]. However, none of the above stability-enforced methods deals with the reduced rank modeling.

In this paper, we extend the work in [21], using the FB approach to obtain a reduced rank (RR)-VAR(1) with guaranteed stability. The FB approach generates an estimator that is no more computationally expensive than the least squares RR method of [20]. Further, no tuning parameters are required.

The rest of the paper is organized as follows. We first introduce the forwards and backwards VAR(1) models in Section 2, and then develop for the first time:

  (i) in Section 3, a closed-form stable estimator for the full-rank VAR(1),

  (ii) in Section 4, based on (i), a closed-form stable estimator for the RR-VAR(1),

  (iii) in Section 5, statistical consistency and asymptotic efficiency for the new estimator.

The results are illustrated in comparative simulations in Section 6. Section 7 contains conclusions.

We use the following notation.
$\rho(A)$ is the spectral radius of $A$.
$\|A\| = \sqrt{\operatorname{tr}(AA')}$ is the Frobenius norm of $A$.
$A > 0$ means $A$ is positive definite.
$A \geq 0$ denotes positive semi-definiteness.
w.p.1 means with probability 1.

2 VAR(1) and the Backwards Model

In this section, we review the (forwards) VAR(1) and its associated backwards model. The VAR(1) is a Markov process generated by

$$y_t = F y_{t-1} + w_t, \quad t = 1, \dotsc, T, \tag{2.1}$$

where $y_t$ is the observed $n$-vector time series, $w_t$ is a zero-mean driving white noise with a non-singular covariance matrix $Q$, and $F$ is the transition matrix, assumed to be stable. The stability of $F$ ensures the existence of a steady state variance matrix $\Pi$ which obeys a discrete-time (DT) Lyapunov equation

$$\Pi = F \Pi F' + Q.$$

A sufficient condition for $\Pi$ to be positive definite is that $Q$ is positive definite. We introduce the steady state lag-one cross-covariance $\Pi_{10} = \operatorname{E}[y_t y_{t-1}'] = F\Pi = \Pi_{01}'$, so that $F = \Pi_{10}\Pi^{-1}$. We further introduce the associated correlation matrix $R_F = \Pi^{-\frac{1}{2}} \Pi_{10} \Pi^{-\frac{1}{2}} = \Pi^{-\frac{1}{2}} F \Pi^{\frac{1}{2}}$, which has the same eigenvalues as $F$.
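As a numerical illustration of the Lyapunov equation and of the similarity between $R_F$ and $F$, the following is a minimal NumPy sketch. The matrices `F` and `Q` are arbitrary example values, and the helper names `solve_dt_lyapunov` and `sym_sqrt` are hypothetical, not taken from the paper.

```python
import numpy as np

def solve_dt_lyapunov(F, Q):
    """Solve Pi = F Pi F' + Q via vec(F Pi F') = (F kron F) vec(Pi)."""
    n = F.shape[0]
    # Column-major vec identity: vec(A X B) = (B' kron A) vec(X), with A = F, B = F'.
    lhs = np.eye(n * n) - np.kron(F, F)
    vec_pi = np.linalg.solve(lhs, Q.reshape(-1, order="F"))
    Pi = vec_pi.reshape(n, n, order="F")
    return 0.5 * (Pi + Pi.T)  # symmetrize against round-off

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

# Arbitrary example values (assumptions): a stable F and a positive definite Q.
F = np.array([[0.5, 0.2], [-0.1, 0.3]])
Q = np.array([[1.0, 0.3], [0.3, 2.0]])

Pi = solve_dt_lyapunov(F, Q)
Pi_half = sym_sqrt(Pi)
R_F = np.linalg.solve(Pi_half, F @ Pi_half)  # Pi^{-1/2} F Pi^{1/2}
```

Because $R_F$ is a similarity transform of $F$, the two matrices share a characteristic polynomial, which gives a simple way to check the construction numerically.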

Associated with the VAR(1) is a stationary backwards model [11]

$$y_{t-1} = F_b y_t + w_{b,t-1}, \quad t = T, \dotsc, 1,$$

where $F_b$ is the backwards transition matrix and $w_{b,t-1}$ is a white noise with covariance matrix $Q_b$ which is statistically independent of $y_t$. We note that [11]

$$F_b = \operatorname{E}[y_{t-1} y_t'] \operatorname{E}^{-1}[y_t y_t'] = \Pi_{10}' \Pi^{-1} = \Pi F' \Pi^{-1},$$

and

$$\begin{aligned}
Q_b &= \operatorname{E}[(y_{t-1} - F_b y_t)(y_{t-1} - F_b y_t)'] \\
&= \Pi - F_b \Pi_{10} - \Pi_{10}' F_b' + F_b \Pi F_b' \\
&= \Pi - \Pi_{01} \Pi^{-1} \Pi_{10} \\
&= \Pi - F_b \Pi F_b'.
\end{aligned}$$

An elementary argument shows that $F_b$ has the same eigenvalues as $F$. Thus $F_b$ is stable iff $F$ is stable. The following results are used below.
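The backwards quantities can be checked numerically. The sketch below is illustrative only: `F` and `Q` are arbitrary example values, `backwards_model` is a hypothetical helper name, and $\Pi$ is obtained by summing the series $\sum_k F^k Q (F')^k$, which converges when $F$ is stable.

```python
import numpy as np

def backwards_model(F, Pi):
    """Return (F_b, Q_b) with F_b = Pi F' Pi^{-1} and Q_b = Pi - F_b Pi F_b'."""
    F_b = Pi @ F.T @ np.linalg.inv(Pi)
    Q_b = Pi - F_b @ Pi @ F_b.T
    return F_b, Q_b

# Arbitrary example values (assumptions): a stable F and a positive definite Q.
F = np.array([[0.5, 0.2], [-0.1, 0.3]])
Q = np.array([[1.0, 0.3], [0.3, 2.0]])

# Steady state variance as the truncated series sum_k F^k Q (F')^k.
Pi = np.zeros_like(Q)
term = Q.copy()
for _ in range(200):
    Pi += term
    term = F @ term @ F.T

F_b, Q_b = backwards_model(F, Pi)

# Alternative expression Q_b = Pi - Pi_01 Pi^{-1} Pi_10 with Pi_10 = F Pi.
Pi_10 = F @ Pi
Q_b_alt = Pi - Pi_10.T @ np.linalg.inv(Pi) @ Pi_10
```

Comparing `np.poly(F)` with `np.poly(F_b)` is one way to confirm that the forwards and backwards transition matrices share their eigenvalues.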

Theorem 1. Converse Lyapunov result. If $\Pi$ is positive definite and $Q$ is positive semi-definite then any eigenvalue $\lambda$ of $F$ has $|\lambda| \leq 1$. If $Q$ is positive definite then $|\lambda| < 1$.

Proof. Let $v$ be a left eigenvector of $F$ with eigenvalue $\lambda$, i.e. $v^H F = \lambda v^H$, where $v^H$ denotes $(v^{\ast})'$. Pre- and post-multiplying the Lyapunov equation by $v^H$ and $v$ we find

$$\begin{array}{rrcl}
& v^H \Pi v & = & |\lambda|^2 v^H \Pi v + v^H Q v \\
\Rightarrow & (1 - |\lambda|^2) v^H \Pi v & = & v^H Q v \geq 0 \Rightarrow |\lambda| \leq 1.
\end{array}$$

Clearly if $Q$ is positive definite we get $|\lambda| < 1$. $\square$

Theorem 2. $Q_b$ has full rank iff $Q$ has full rank.

Proof. First note that $Q_b = \Pi Q_c \Pi$ where $Q_c = \Pi^{-1} - F' \Pi^{-1} F$. Then clearly $Q_b$ has full rank iff $Q_c$ has full rank. Suppose for some vector $v \neq 0$, we have $Q_b v = 0$. Then $Q_c \Pi v = 0$. Thus, $\Pi^{-1} \Pi v = F' \Pi^{-1} F \Pi v$, i.e. $v = F' \Pi^{-1} F \Pi v$. Note that we cannot have $F \Pi v = 0$ since then $v = 0$, which is a contradiction.

Now consider that

$$\begin{array}{rrcl}
& Q \Pi^{-1} & = & I - F \Pi F' \Pi^{-1} \\
\Rightarrow & Q \Pi^{-1} F \Pi & = & F \Pi - F \Pi F' \Pi^{-1} F \Pi \\
\Rightarrow & Q \Pi^{-1} F \Pi v & = & F \Pi v - F \Pi F' \Pi^{-1} F \Pi v \\
& & = & F \Pi v - F \Pi v = 0.
\end{array}$$

Since $\Pi^{-1} F \Pi v \neq 0$, this would make $Q$ singular, which is a contradiction and proves ‘if’; then ‘only if’ follows by running the argument in reverse. $\square$

3 Stable Estimators for full-rank VAR(1)

Here we review earlier work of [18, 22, 21] and develop a new result on the full-rank case. Given data $y_t, t = 0, \dotsc, T$, set $Y_0 = [\begin{matrix} y_0 & \dotsm & y_{T-1} \end{matrix}]$ and $Y_1 = [\begin{matrix} y_1 & \dotsm & y_T \end{matrix}]$, and define the sample covariances $S_{ij} = \frac{1}{T} Y_i Y_j', \quad i,j \in \{0,1\}$. We also introduce the forwards and backwards residual mean squared errors

$$\begin{aligned}
S_{w,f}(F) &= \frac{1}{T} (Y_1 - F Y_0)(Y_1 - F Y_0)' \\
&= S_{11} - F S_{01} - S_{10} F' + F S_{00} F' \\
S_{w,b}(F) &= \frac{1}{T} (Y_0 - F_b Y_1)(Y_0 - F_b Y_1)' \\
&= S_{00} - F_b S_{10} - S_{01} F_b' + F_b S_{11} F_b',
\end{aligned}$$

where $F_b = P F' P^{-1}$, and $P$ is a consistent estimator of $\Pi$ to be chosen. Here $F$ is no longer the true value.
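The sample quantities above translate directly into array operations. The following NumPy sketch is an illustration under stated assumptions: the data are simulated from an arbitrary example VAR(1) started at $y_0 = 0$, the candidate matrix `F_cand` is arbitrary, the weighting is taken as $P = S_{11}$, and the function names are hypothetical.

```python
import numpy as np

def sample_covariances(y):
    """y is an n x (T+1) array whose columns are y_0, ..., y_T."""
    T = y.shape[1] - 1
    Y0, Y1 = y[:, :-1], y[:, 1:]
    S00 = Y0 @ Y0.T / T
    S10 = Y1 @ Y0.T / T
    S11 = Y1 @ Y1.T / T
    return Y0, Y1, S00, S10, S11

def residual_mse(y, F, P):
    """Forwards and backwards residual mean squared errors for a candidate F."""
    Y0, Y1, _, _, _ = sample_covariances(y)
    T = Y0.shape[1]
    F_b = P @ F.T @ np.linalg.inv(P)   # backwards matrix induced by F and P
    E_f = Y1 - F @ Y0                  # forwards residuals
    E_b = Y0 - F_b @ Y1                # backwards residuals
    return E_f @ E_f.T / T, E_b @ E_b.T / T, F_b

# Illustrative data (assumption): a short simulated VAR(1) with y_0 = 0.
rng = np.random.default_rng(0)
F_true = np.array([[0.5, 0.2], [-0.1, 0.3]])
y = np.zeros((2, 201))
for t in range(1, 201):
    y[:, t] = F_true @ y[:, t - 1] + rng.standard_normal(2)

Y0, Y1, S00, S10, S11 = sample_covariances(y)
F_cand = np.array([[0.4, 0.1], [0.0, 0.2]])  # arbitrary candidate value
S_wf, S_wb, F_b = residual_mse(y, F_cand, P=S11)
```

The residual-based matrices `S_wf` and `S_wb` should agree with the expanded expressions in terms of $S_{00}$, $S_{10}$ and $S_{11}$, with $S_{01} = S_{10}'$.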

The least squares estimator $\hat{F}_{LS} = S_{10} S_{00}^{-1}$ minimises $\operatorname{tr}(S_{w,f})$, but it is NOT guaranteed to be stable. Continuing, [18, 22] considered the minimizer of the sum of weighted forwards and backwards sample mean squared errors

$$J(F;P) = \operatorname{tr}\{P^{-1}(S_{w,f} + S_{w,b})\}.$$

This yields the following remarkable result.

Theorem 3A. [22, 21, 18].
The minimizer $\hat{F}$ of $J(F;P)$ is stable and is the solution of the Sylvester equation

$$\hat{F} S_{00} P^{-1} + S_{11} P^{-1} \hat{F} = 2 S_{10} P^{-1}.$$

Remarks.

  (a) This result is not explicitly stated in [22] but is rather a special case of the results there. Further, the argument in [22] is very hard to follow. Partly for that reason, [21] gave a simple direct proof. Note that Theorem 4B below has nearly the same proof.

  (b) The result only holds with the weighting matrix $P$ used to define $F_b$, not for any other weighting matrix.

  (c) Using vec algebra we can get a closed-form solution.
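The vec-algebra route of remark (c) can be sketched as follows. This is a hedged illustration, not the authors' code: it uses the column-major identity $\operatorname{vec}(AXB) = (B' \otimes A)\operatorname{vec}(X)$ to turn the Sylvester equation into a linear system, the function name `fb_estimator` is hypothetical, the data are simulated from an arbitrary example VAR(1), and $P = S_{00}$ is chosen only as one example of a weighting matrix.

```python
import numpy as np

def fb_estimator(S00, S10, S11, P):
    """Solve F S00 P^{-1} + S11 P^{-1} F = 2 S10 P^{-1} for F via vec algebra."""
    n = S00.shape[0]
    P_inv = np.linalg.inv(P)
    I = np.eye(n)
    # vec(F B) = (B' kron I) vec(F) with B = S00 P^{-1};
    # vec(A F) = (I kron A) vec(F) with A = S11 P^{-1}.
    K = np.kron((S00 @ P_inv).T, I) + np.kron(I, S11 @ P_inv)
    rhs = (2.0 * S10 @ P_inv).reshape(-1, order="F")
    return np.linalg.solve(K, rhs).reshape(n, n, order="F")

# Illustrative data (assumption): simulated VAR(1) with y_0 = 0.
rng = np.random.default_rng(3)
n, T = 3, 300
F_true = np.array([[0.6, 0.1, 0.0], [0.0, 0.5, 0.2], [-0.1, 0.0, 0.4]])
y = np.zeros((n, T + 1))
for t in range(1, T + 1):
    y[:, t] = F_true @ y[:, t - 1] + rng.standard_normal(n)

Y0, Y1 = y[:, :-1], y[:, 1:]
S00, S10, S11 = Y0 @ Y0.T / T, Y1 @ Y0.T / T, Y1 @ Y1.T / T

P = S00  # example weighting matrix (assumption)
F_hat = fb_estimator(S00, S10, S11, P)
```

The Kronecker system has dimension $n^2$, so this direct approach is only practical for small $n$; a dedicated Sylvester solver would be the usual choice for larger problems.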

We now have the further remarkable closed-form result.

Theorem 3B. Set $P = S_{11}$. Then the stable estimator of Theorem 3A has the closed form

$$\hat{F} = \hat{F}_{11} = 2 S_{10} (S_{00} + S_{11})^{-1}.$$

Proof. Follows from Theorem 3A by setting $P = S_{11}$: the Sylvester equation becomes $\hat{F} (S_{00} + S_{11}) S_{11}^{-1} = 2 S_{10} S_{11}^{-1}$, and right-multiplying by $S_{11} (S_{00} + S_{11})^{-1}$ gives the result.
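The closed form is cheap to compute, and its stability can be probed empirically. The sketch below is illustrative only: the near-unit-root `F_true`, the short sample length and the number of trials are arbitrary choices made to stress the least squares estimator, and the function names are hypothetical.

```python
import numpy as np

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

def fb_closed_form(y):
    """F_hat = 2 S10 (S00 + S11)^{-1} from an n x (T+1) data array y."""
    Y0, Y1 = y[:, :-1], y[:, 1:]
    # The common 1/T factor cancels, so raw cross-products suffice.
    return 2.0 * (Y1 @ Y0.T) @ np.linalg.inv(Y0 @ Y0.T + Y1 @ Y1.T)

def ls_estimator(y):
    """F_LS = S10 S00^{-1}."""
    Y0, Y1 = y[:, :-1], y[:, 1:]
    return (Y1 @ Y0.T) @ np.linalg.inv(Y0 @ Y0.T)

# Illustrative setup (assumptions): near-unit-root VAR(1), short samples.
rng = np.random.default_rng(1)
n, T, trials = 3, 25, 200
F_true = 0.98 * np.eye(n)

rho_fb, rho_ls = [], []
for _ in range(trials):
    y = np.zeros((n, T + 1))
    for t in range(1, T + 1):
        y[:, t] = F_true @ y[:, t - 1] + rng.standard_normal(n)
    rho_fb.append(spectral_radius(fb_closed_form(y)))
    rho_ls.append(spectral_radius(ls_estimator(y)))
rho_fb, rho_ls = np.array(rho_fb), np.array(rho_ls)
```

Inspecting `rho_fb.max()` alongside `(rho_ls >= 1).sum()` shows how the two estimators behave on the same data; only the forwards-backwards estimator carries a stability guarantee.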

Remarks.

  (d) Below we find that the choice $P = S_{11}$ enables our new results.

  (e) Note that consistency and asymptotic efficiency have been established for $\hat{F}_{LS}$ in [12]. One can then show the same for $\hat{F}$. We omit details, concentrating here on the reduced rank case.

4 Stable Reduced Rank VAR(1) Estimation

We now consider the case where $F$ has rank $m < n$. Then we can factor $F = A_{n \times m} B_{m \times n}$. We then have the following result.

Theorem 4A. [20, Section 2.3]. Reduced Rank Least Squares.

$$\min_{\operatorname{rank}(F) = m} J_{LS} : J_{LS} = \operatorname{tr}(S_{11}^{-1} S_{w,f})$$

has solution $\hat{F}_{RLS} = S_{11}^{\frac{1}{2}} \hat{V}_{*,m} \hat{V}_{*,m}' S_{11}^{-\frac{1}{2}} \hat{F}_{LS}$ where $\hat{V}_{*,m} = [\begin{matrix} \hat{v}_{*,1} & \dotsm & \hat{v}_{*,m} \end{matrix}]$ and $\hat{v}_{*,r}$ are the ‘top’ $m$ eigenvectors of $\hat{R}_{*} = S_{11}^{-\frac{1}{2}} S_{10} S_{00}^{-1} S_{01} S_{11}^{-\frac{1}{2}}$. The noise covariance estimator is $\hat{Q}_{RLS} = S_{11}^{\frac{1}{2}} (I - \hat{V}_{*,m} \hat{D}_{*,m}^{2} \hat{V}_{*,m}') S_{11}^{\frac{1}{2}}$ where $\hat{D}_{*,m}^{2} = \operatorname{diag}(\hat{\lambda}_{*,k})$ contains the top $m$ eigenvalues of $\hat{R}_{*}$.

Remarks.

  (f) $\hat{F}_{RLS}$ is not guaranteed to be stable.

  (g) We note for future reference that $\hat{R}_{*}$ has the same eigenvalues as $\hat{L}_{*}=S_{10}S_{00}^{-1}S_{01}S_{11}^{-1}$.

  (h) We can re-express the results in terms of the SVD of $\hat{G}_{*}=S_{11}^{-\frac{1}{2}}S_{10}S_{00}^{-\frac{1}{2}}=\hat{V}_{*}\hat{D}_{*}\hat{U}_{*}^{\prime}$, since $\hat{R}_{*}=\hat{G}_{*}\hat{G}_{*}^{\prime}$. It follows from the Bienaymé-Cauchy-Schwarz inequality that the singular values of $\hat{G}_{*}$ are $<1$ w.p.1.

  (i) [20] gives a different formula for $\hat{Q}_{RLS}$, but it is straightforward to show that it equals the one given here.
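Remarks (g) and (h) can be checked numerically. The following is a minimal NumPy sketch rather than part of the original development. The simulated system `F_true`, the helper names, and the moment definitions $S_{00}=\frac{1}{T}\sum_{t=0}^{T-1}y_{t}y_{t}^{\prime}$, $S_{11}=\frac{1}{T}\sum_{t=1}^{T}y_{t}y_{t}^{\prime}$, $S_{10}=\frac{1}{T}\sum_{t=1}^{T}y_{t}y_{t-1}^{\prime}$ with $S_{01}=S_{10}^{\prime}$ are assumptions made for illustration.

```python
import numpy as np

def sample_moments(Y):
    """Sample moments from Y (n x (T+1), columns y_0, ..., y_T)."""
    Y0, Y1 = Y[:, :-1], Y[:, 1:]
    T = Y0.shape[1]
    return Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y1 @ Y0.T / T  # S00, S11, S10

def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, Q = np.linalg.eigh(S)
    return (Q / np.sqrt(w)) @ Q.T

rng = np.random.default_rng(0)
n, T = 4, 300
F_true = np.diag([0.9, 0.5, -0.3, 0.0])  # hypothetical stable system
Y = np.zeros((n, T + 1))
for t in range(1, T + 1):
    Y[:, t] = F_true @ Y[:, t - 1] + rng.standard_normal(n)

S00, S11, S10 = sample_moments(Y)
G = inv_sqrt(S11) @ S10 @ inv_sqrt(S00)                      # G_* of remark (h)
R = G @ G.T                                                  # R_* = G_* G_*'
L = S10 @ np.linalg.inv(S00) @ S10.T @ np.linalg.inv(S11)    # L_* of remark (g)

eig_R = np.sort(np.linalg.eigvalsh(R))
eig_L = np.sort(np.linalg.eigvals(L).real)
sing_G = np.linalg.svd(G, compute_uv=False)

print("eigenvalues of R_*:", eig_R)
print("eigenvalues of L_*:", eig_L)
print("largest singular value of G_*:", sing_G.max())
```

The two eigenvalue lists should agree up to round-off because $\hat{L}_{*}$ is similar to $\hat{R}_{*}$, and the squared singular values of $\hat{G}_{*}$ are those same eigenvalues.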

We would now like to set up an FB version of this problem. But it turns out that for general $P$ there is no simple solution. Fortunately, if we choose $P=S_{11}$ we can find a simple guaranteed stable estimator.

Theorem 4B. The solution to $\min_{\operatorname{rank}(F)=m}J(F;S_{11})$ is given by $\hat{F}_{R}=S_{11}^{\frac{1}{2}}\hat{V}_{m}\hat{V}_{m}^{\prime}S_{11}^{-\frac{1}{2}}\hat{F}_{11}$ where $\hat{V}_{m}=[\begin{matrix}\hat{v}_{1}&\dotsm&\hat{v}_{m}\end{matrix}]$ and $\hat{v}_{r}$ are the 'top' $m$ eigenvectors (corresponding to the top $m$ eigenvalues in $\hat{D}_{m}=\operatorname{diag}(\hat{\lambda}_{k})$) of

$$\hat{R}=2S_{11}^{-\frac{1}{2}}S_{10}(S_{00}+S_{11})^{-1}S_{01}S_{11}^{-\frac{1}{2}}.$$

Further, $\hat{F}_{R}$ is stable.

Proof. See the appendix.
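The closed form in Theorem 4B translates almost line by line into code. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the sample moments are assumed to be $S_{00}=\frac{1}{T}\sum_{t=0}^{T-1}y_{t}y_{t}^{\prime}$, $S_{11}=\frac{1}{T}\sum_{t=1}^{T}y_{t}y_{t}^{\prime}$, $S_{10}=\frac{1}{T}\sum_{t=1}^{T}y_{t}y_{t-1}^{\prime}$ with $S_{01}=S_{10}^{\prime}$, and $\hat{F}_{11}=2S_{10}(S_{00}+S_{11})^{-1}$ is taken from the expression used in the proof of Lemma 5D.

```python
import numpy as np

def sym_power(S, p):
    """S**p for a symmetric positive definite matrix S."""
    w, Q = np.linalg.eigh(S)
    return (Q * w**p) @ Q.T

def stable_rr_var1(Y, m):
    """Rank-m estimate following the formula of Theorem 4B.

    Y is n x (T+1) with columns y_0, ..., y_T.
    """
    Y0, Y1 = Y[:, :-1], Y[:, 1:]
    T = Y0.shape[1]
    S00, S11, S10 = Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y1 @ Y0.T / T
    S11_half, S11_neg_half = sym_power(S11, 0.5), sym_power(S11, -0.5)
    F11 = 2.0 * S10 @ np.linalg.inv(S00 + S11)
    # R = 2 S11^{-1/2} S10 (S00 + S11)^{-1} S01 S11^{-1/2}, with S01 = S10'
    R = S11_neg_half @ F11 @ S10.T @ S11_neg_half
    _, V = np.linalg.eigh((R + R.T) / 2)   # symmetrize against round-off
    Vm = V[:, -m:]                         # eigh sorts ascending: last m are the top m
    return S11_half @ Vm @ Vm.T @ S11_neg_half @ F11

# Illustrative data: a 6-dimensional rank-3 system with unit noise covariance.
rng = np.random.default_rng(0)
F_true = np.zeros((6, 6))
F_true[:3, :3] = [[0.99, -0.1, 0.0], [0.1, 0.99, 0.0], [0.0, 0.0, 0.95]]
T = 600
Y = np.zeros((6, T + 1))
for t in range(1, T + 1):
    Y[:, t] = F_true @ Y[:, t - 1] + rng.standard_normal(6)

F_R = stable_rr_var1(Y, m=3)
print("rank:", np.linalg.matrix_rank(F_R, tol=1e-8))
print("spectral radius:", np.max(np.abs(np.linalg.eigvals(F_R))))
```

The printed spectral radius should be below one, in line with the stability claim of the theorem. An explicit tolerance is passed to `matrix_rank` because round-off in the matrix products can leave the trailing singular values slightly above the default threshold.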

5 Asymptotic Analysis

Here we establish consistency and a central limit theorem (CLT) for $\hat{F}_{R}$ by reducing them to results for $\hat{F}_{RLS}$.

For asymptotic results, we need stronger assumptions. There is a wide range of possibilities, but to keep things simple while still retaining reasonable generality, we use:

Assumption A1. $w_{t}$ in (2.1) are iid with finite fourth moments, and $Q$ is positive definite.

Assumption A2. $F$ has rank $m$ and the non-zero eigenvalues of $F$ are distinct.

This enables the following result for $\hat{F}_{RLS}$ (where $F$ is again the true value).

Theorem 5A. Assuming $m=\operatorname{rank}(F)$ is known:

  (i) Under A1: $S_{00}\xrightarrow{p}\Pi$ and $S_{11}\xrightarrow{p}\Pi$.

  (ii) Under A1: $\hat{F}_{LS}\xrightarrow{p}F\equiv S_{10}\xrightarrow{p}F\Pi$.

Under A1, A2: $R_{F}=\Pi^{-\frac{1}{2}}F\Pi^{\frac{1}{2}}$ has rank $m$ and:

  (iii) $\hat{V}_{*,m}\xrightarrow{p}V_{m}$, the eigenvectors of $R_{F}R_{F}^{\prime}$.
    $\hat{U}_{*,m}\xrightarrow{p}U_{m}$, the eigenvectors of $R_{F}^{\prime}R_{F}$.
    $\hat{\lambda}_{k}\xrightarrow{p}\lambda_{k}$ and $\hat{\lambda}_{*,k}\xrightarrow{p}\lambda_{k}$, the $k^{th}$ largest eigenvalue of $R_{F}R_{F}^{\prime}$.

  (iv) $\hat{G}_{*}\xrightarrow{p}R_{F}=\Pi^{-\frac{1}{2}}F\Pi^{\frac{1}{2}}$.

  (v) $\hat{F}_{RLS}\xrightarrow{p}F=AB$.

  (vi) $\sqrt{T}\operatorname{vec}(\hat{F}_{RLS}^{\prime}-F^{\prime})\Rightarrow Z\sim N(0,\Sigma)$ where $\Sigma=MWM^{\prime}$ with $M=[(I\otimes B^{\prime}),(A\otimes I)]$ and $W$ (which is complicated) is given in [20](2.36).

Proof. (i),(ii) can be found in [8][chapter 11]. (v),(vi) are from [20][Section 2.5 (Theorem 2.4) and Section 5.2]. For (iii) first note that from (i),(ii) $\hat{L}_{*}\xrightarrow{p}L=F\Pi F^{\prime}\Pi^{-1}$ which has the same eigenvalues as $R=R_{F}R_{F}^{\prime}$. The second part follows in the same way. Next, since distinct eigenvalues and eigenvectors are continuous functions of the underlying matrix entries [13], $\hat{V}_{*,m}\xrightarrow{p}V_{m}$, which contains the 'top' eigenvectors of $R$. (Footnote 3: if $\xi_{T}\xrightarrow{p}\xi$ and $f$ is continuous then $f(\xi_{T})\xrightarrow{p}f(\xi)$.) Further, the eigenvalues converge in probability as stated. (iv) now follows from (iii).

To continue for $\hat{F}_{R}$, we need some lemmas.

Lemma 5B. Let $\Gamma_{t}=\operatorname{E}[y_{t}y_{t}^{\prime}]$ and denote $\rho=$ spectral radius of $F$. Then $\|\Gamma_{t}-\Pi\|_{2}\leq c\rho^{2t}$ for some constant $c$.

Proof. Taking variances in (2.1) gives $\Gamma_{t}=F\Gamma_{t-1}F^{\prime}+Q$. Subtracting the $\Pi$ equation from this gives $\Gamma_{t}-\Pi=F(\Gamma_{t-1}-\Pi)F^{\prime}$. Iterating this gives $\Gamma_{t}-\Pi=F^{t}D(F^{\prime})^{t}$ where $D=\Gamma_{0}-\Pi$. Now recall Gelfand's theorem [9]: $\|F^{t}\|^{1/t}\rightarrow\rho$, so that $\limsup_{t}\|(\Gamma_{t}-\Pi)/\rho^{2t}\|^{1/t}\leq c$. The result follows from this. $\square$
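The geometric decay in Lemma 5B can be illustrated numerically. The sketch below is a minimal NumPy example under stated assumptions: the "$\Pi$ equation" is taken to be the stationary Lyapunov equation $\Pi=F\Pi F^{\prime}+Q$, and the choices $Q=I$, $\Gamma_{0}=0$ (that is, $y_{0}=0$) and the $3\times 3$ matrix `F` are illustrative only. For this particular `F`, which is a normal matrix, $\|F^{t}\|_{2}=\rho^{t}$ exactly, so the constant can be taken as $c=\|D\|_{2}$.

```python
import numpy as np

F = np.array([[0.99, -0.1, 0.0],
              [0.1, 0.99, 0.0],
              [0.0, 0.0, 0.95]])
n = F.shape[0]
Q = np.eye(n)                                # illustrative noise covariance
rho = max(abs(np.linalg.eigvals(F)))         # spectral radius of F

# Stationary covariance: solve Pi = F Pi F' + Q via vec(Pi) = (I - F kron F)^{-1} vec(Q)
Pi = np.linalg.solve(np.eye(n * n) - np.kron(F, F), Q.ravel()).reshape(n, n)

Gamma = np.zeros((n, n))                     # Gamma_0 = 0 (assumption: y_0 = 0)
D = Gamma - Pi
ratios = []
for t in range(1, 201):
    Gamma = F @ Gamma @ F.T + Q              # Gamma_t = F Gamma_{t-1} F' + Q
    gap = np.linalg.norm(Gamma - Pi, 2)
    ratios.append(gap / rho ** (2 * t))      # should stay bounded by a constant c

print("rho =", rho)
print("max of ||Gamma_t - Pi||_2 / rho^(2t):", max(ratios))
print("||D||_2 =", np.linalg.norm(D, 2))
```

For a non-normal $F$ the ratio can grow transiently before settling, which is why the lemma only asserts the existence of some constant $c$.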

Lemma 5C. Under A1: $\sqrt{T}\|S_{11}-S_{00}\|\xrightarrow{p}0$.

Proof. We have

$$\begin{array}{rrcl}&S_{11}-S_{00}&=&\frac{1}{T}[y_{T}y_{T}^{\prime}-y_{0}y_{0}^{\prime}]\\ \Rightarrow&\|S_{11}-S_{00}\|^{2}&\leq&\frac{2}{T}(\|y_{T}\|^{2}+\|y_{0}\|^{2})\\ \Rightarrow&\sqrt{T}\operatorname{E}\|S_{11}-S_{00}\|^{2}&\leq&\frac{2}{\sqrt{T}}[\operatorname{tr}(\Gamma_{T})+\operatorname{tr}(\Gamma_{1})]\end{array}$$

The second term $\rightarrow 0$ while the first term $\rightarrow 0$ from Lemma 5B. The result then follows. $\square$

Lemma 5D. Under A1, A2: $\sqrt{T}\|\hat{F}_{LS}-\hat{F}_{11}\|\xrightarrow{p}0$.

Proof. We have

$$\begin{array}{rcl}&&\sqrt{T}(\hat{F}_{LS}-\hat{F}_{11})\\ &=&\sqrt{T}(S_{10}S_{00}^{-1}-2S_{10}(S_{00}+S_{11})^{-1})\\ &=&\sqrt{T}2S_{10}[(2S_{00})^{-1}-(S_{11}+S_{00})^{-1}]\\ &=&\sqrt{T}S_{10}S_{00}^{-1}[S_{00}+S_{11}-2S_{00}](S_{00}+S_{11})^{-1}\\ &=&\hat{F}_{LS}[\sqrt{T}(S_{11}-S_{00})](S_{11}+S_{00})^{-1}\end{array}$$

Now from Theorem 5A and Lemma 5C each term converges in probability, so the product also converges in probability. But $\sqrt{T}(S_{11}-S_{00})\xrightarrow{p}0$, yielding the result.
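The algebraic identity behind Lemma 5D, together with the end-effect expression for $S_{11}-S_{00}$ used in Lemma 5C, can be verified on simulated data. This is a minimal NumPy sketch: the system `F_true`, the sample size, the random initial state and the moment definitions (sums over $t=0,\dots,T-1$ for $S_{00}$ and over $t=1,\dots,T$ for $S_{11}$ and $S_{10}$) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 3, 400
F_true = np.array([[0.5, 0.2, 0.0],
                   [-0.1, 0.4, 0.1],
                   [0.0, 0.3, 0.6]])          # hypothetical stable system
Y = np.zeros((n, T + 1))
Y[:, 0] = rng.standard_normal(n)
for t in range(1, T + 1):
    Y[:, t] = F_true @ Y[:, t - 1] + rng.standard_normal(n)

Y0, Y1 = Y[:, :-1], Y[:, 1:]
S00, S11, S10 = Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y1 @ Y0.T / T

F_ls = S10 @ np.linalg.inv(S00)               # least-squares estimator
F_11 = 2.0 * S10 @ np.linalg.inv(S00 + S11)   # FB-type estimator used in Lemma 5D

lhs = F_ls - F_11
rhs = F_ls @ (S11 - S00) @ np.linalg.inv(S11 + S00)
end_effect = (np.outer(Y[:, -1], Y[:, -1]) - np.outer(Y[:, 0], Y[:, 0])) / T

print("identity residual:", np.linalg.norm(lhs - rhs))
print("sqrt(T) * ||F_LS - F_11||:", np.sqrt(T) * np.linalg.norm(lhs))
```

The difference between the two estimators is driven entirely by the end effect $\frac{1}{T}[y_{T}y_{T}^{\prime}-y_{0}y_{0}^{\prime}]$, which is why it vanishes faster than $1/\sqrt{T}$.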

Theorem 5E. Under A1, A2,

  (i) $\hat{F}_{R}\xrightarrow{p}F$.

  (ii) $\sqrt{T}(\hat{F}_{RLS}-\hat{F}_{R})\xrightarrow{p}0$.

  (iii) $\sqrt{T}\operatorname{vec}(\hat{F}_{R}^{\prime}-F^{\prime})\Rightarrow Z\sim N(0,\Sigma)$.

Proof. (i) follows from (ii) and Theorem 5A(v). (iii) follows from (ii) and Theorem 5A(vi). (ii) is proved in the appendix. $\square$

Figure 1: Pole locations of the first $50$ repeats.
Figure 2: Pole magnitude histograms of the $1000$ repeats: '$*$' are the true poles. Unstable poles for LS for each $T$: $25.9\%,6.9\%,0.2\%$.

6 Simulations

We now show simulations to illustrate our stable estimator $\hat{F}_{R}$. Since there is no existing algorithm that guarantees a stable RR-VAR(1) estimator, we compare our estimator with the standard reduced-rank least squares estimator $\hat{F}_{RLS}$. We call our method the FB method and the latter the LS method. We expect that the LS method will generate some unstable estimates while FB will not. We also expect LS and FB to have similar computational complexity, given their formulae.

We run two sets of simulations:

  • Low dimensional state $n=6$.

  • High dimensional state, up to $n>3000$.

The latter study represents increasingly common practical examples.

6.1 Simulation Design

Firstly, we specify $F=\left[\begin{smallmatrix}F_{0}&0\\ 0&0_{3\times 3}\end{smallmatrix}\right]\otimes I_{p}$, where $p$ is to be chosen and $F_{0}=\left[\begin{smallmatrix}0.99&-0.1&0\\ 0.1&0.99&0\\ 0&0&0.95\end{smallmatrix}\right]$. So, there are $3p$ zero eigenvalues and $3p$ non-zero eigenvalues repeating at $0.99\pm\jmath 0.1$ and $0.95$, very close to the unit circle. This is a typical case in which a stability-guaranteed algorithm is needed. The setup is similar to that in [10].
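The structure of this transition matrix is easy to confirm in code. The following minimal NumPy sketch builds $F$ for a given $p$ and inspects its eigenvalues; the function name `make_F` and the choice $p=2$ are illustrative assumptions.

```python
import numpy as np

def make_F(p):
    """Build F = [[F0, 0], [0, 0_{3x3}]] kron I_p, of size 6p x 6p."""
    F0 = np.array([[0.99, -0.1, 0.0],
                   [0.1, 0.99, 0.0],
                   [0.0, 0.0, 0.95]])
    block = np.zeros((6, 6))
    block[:3, :3] = F0
    return np.kron(block, np.eye(p))

p = 2
F = make_F(p)
eig = np.linalg.eigvals(F)

print("n =", F.shape[0])
print("rank =", np.linalg.matrix_rank(F))
print("number of zero eigenvalues:", int(np.sum(np.abs(eig) < 1e-8)))
print("largest pole magnitude:", np.max(np.abs(eig)))
```

The complex pair $0.99\pm\jmath 0.1$ has magnitude $\sqrt{0.99^{2}+0.1^{2}}\approx 0.995$, which is what places the true poles so close to the unit circle.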

Secondly, we let the length $T$ of the time series be a multiple of the model order $n$, so that the computational complexity for both LS and FB is $O(n^{3})$ (assuming a standard matrix multiplication method). Also, note that in the one-dimensional case, the relative error $\frac{|\hat{F}_{LS}-F|}{|F|}$ decays at a rate of $1/\sqrt{T}$. Thus, we consider $T/n=l^{2}\Rightarrow T=l^{2}n$. The values of $l$ are best determined by the system time constants. However, here we proceed by trial and error and let $T/n=2^{2},6^{2},10^{2}\Rightarrow T=4n,36n,100n$.

Thirdly, we choose $p=2^{k}\Rightarrow n=6\times 2^{k},k\in\mathbb{N}$, so that the logarithm of the computational time is proportional to $k$.

Throughout this section, we assume the true rank $m$ is known and take the noise covariance $Q=I_{n}$.

6.2 Low Dimensional Simulations

We let $k=0\Rightarrow p=1\Rightarrow n=6$, and the rank $m=3$. We consider $3$ time series lengths $T/n=2^{2},6^{2},10^{2}\Rightarrow T=24,216,600$. We simulate $N=1000$ realizations for each $T$ and get the estimates $\hat{F}_{R}$ and $\hat{F}_{RLS}$.

We first plot the estimated pole locations in Fig. 1. We only plot the first $50$ estimates to avoid overcrowding. It is sufficient to plot the poles with zero or positive imaginary parts. Fig. 1 gives a rough impression of the pole distributions. It is observed that as $T$ increases, the poles of the estimates from both algorithms cluster closer around the true poles. However, the LS estimates have poles outside the unit circle while all poles of our FB estimates are inside the unit circle.

We further plot the histogram of pole magnitudes from the $1000$ repeats in Fig. 2. The same conclusions can be drawn as from Fig. 1. At $T=600$, both histograms have peaks at the true pole magnitudes, indicating good performance. However, we detect $25.9\%,6.9\%,0.2\%$ unstable LS estimates for $T=24,216,600$, respectively, while the FB estimates are always stable.
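A small Monte Carlo experiment of this kind can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, and it is not expected to reproduce the percentages reported above. The helper names, the random initial state, the seed and the number of repeats are hypothetical. The LS estimator is assumed to take the form $\hat{F}_{RLS}=S_{11}^{\frac{1}{2}}\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}S_{11}^{-\frac{1}{2}}\hat{F}_{LS}$, by analogy with Theorem 4B and consistent with the expression for $\hat{Q}_{RLS}$.

```python
import numpy as np

def sym_power(S, p):
    """S**p for a symmetric positive definite matrix S."""
    w, Q = np.linalg.eigh(S)
    return (Q * w**p) @ Q.T

def moments(Y):
    """S00, S11, S10 from Y (n x (T+1), columns y_0, ..., y_T)."""
    Y0, Y1 = Y[:, :-1], Y[:, 1:]
    T = Y0.shape[1]
    return Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y1 @ Y0.T / T

def reduce_rank(F_full, S11, S10, m):
    """S11^{1/2} Vm Vm' S11^{-1/2} F_full, where Vm holds the top-m
    eigenvectors of S11^{-1/2} F_full S01 S11^{-1/2} (S01 = S10')."""
    half, neg_half = sym_power(S11, 0.5), sym_power(S11, -0.5)
    R = neg_half @ F_full @ S10.T @ neg_half
    _, V = np.linalg.eigh((R + R.T) / 2)
    Vm = V[:, -m:]                      # eigh sorts ascending
    return half @ Vm @ Vm.T @ neg_half @ F_full

def fit_fb(Y, m):
    S00, S11, S10 = moments(Y)
    return reduce_rank(2.0 * S10 @ np.linalg.inv(S00 + S11), S11, S10, m)

def fit_rls(Y, m):                      # assumed form of the LS method
    S00, S11, S10 = moments(Y)
    return reduce_rank(S10 @ np.linalg.inv(S00), S11, S10, m)

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

def unstable_fractions(F_true, T, m, reps, seed=0):
    """Fractions of FB and LS estimates with spectral radius >= 1."""
    rng = np.random.default_rng(seed)
    n = F_true.shape[0]
    bad_fb = bad_ls = 0
    for _ in range(reps):
        Y = np.zeros((n, T + 1))
        Y[:, 0] = rng.standard_normal(n)
        for t in range(1, T + 1):
            Y[:, t] = F_true @ Y[:, t - 1] + rng.standard_normal(n)
        bad_fb += int(spectral_radius(fit_fb(Y, m)) >= 1.0)
        bad_ls += int(spectral_radius(fit_rls(Y, m)) >= 1.0)
    return bad_fb / reps, bad_ls / reps

F_true = np.zeros((6, 6))
F_true[:3, :3] = [[0.99, -0.1, 0.0], [0.1, 0.99, 0.0], [0.0, 0.0, 0.95]]
frac_fb, frac_ls = unstable_fractions(F_true, T=24, m=3, reps=200)
print("unstable FB fraction:", frac_fb)
print("unstable LS fraction:", frac_ls)
```

Sharing `reduce_rank` between the two fits makes the only difference between the methods explicit: the full-rank estimator that is projected.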

We introduce two relative estimation errors (which are plotted as percentages) (using Frobenius norms)

eFB=F^RFF and eLS=F^RLSFFsubscript𝑒𝐹𝐵normsubscript^𝐹𝑅𝐹norm𝐹 and subscript𝑒𝐿𝑆normsubscript^𝐹𝑅𝐿𝑆𝐹norm𝐹\displaystyle e_{FB}=\frac{\|\hat{F}_{R}-F\|}{\|F\|}\mbox{ and }e_{LS}=\frac{% \|\hat{F}_{RLS}-F\|}{\|F\|}italic_e start_POSTSUBSCRIPT italic_F italic_B end_POSTSUBSCRIPT = divide start_ARG ∥ over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_R end_POSTSUBSCRIPT - italic_F ∥ end_ARG start_ARG ∥ italic_F ∥ end_ARG and italic_e start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT = divide start_ARG ∥ over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_R italic_L italic_S end_POSTSUBSCRIPT - italic_F ∥ end_ARG start_ARG ∥ italic_F ∥ end_ARG

and two prediction errors

ϵFBsubscriptitalic-ϵ𝐹𝐵\displaystyle\epsilon_{FB}italic_ϵ start_POSTSUBSCRIPT italic_F italic_B end_POSTSUBSCRIPT =Y1F^RY0Y1F^LSY0Y1F^LSY0absentnormsubscript𝑌1subscript^𝐹𝑅subscript𝑌0normsubscript𝑌1subscript^𝐹𝐿𝑆subscript𝑌0normsubscript𝑌1subscript^𝐹𝐿𝑆subscript𝑌0\displaystyle=\frac{\|Y_{1}-\hat{F}_{R}Y_{0}\|-\|Y_{1}-\hat{F}_{LS}Y_{0}\|}{\|% Y_{1}-\hat{F}_{LS}Y_{0}\|}= divide start_ARG ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_R end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ - ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ end_ARG start_ARG ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ end_ARG
ϵLSsubscriptitalic-ϵ𝐿𝑆\displaystyle\epsilon_{LS}italic_ϵ start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT =Y1F^RLSY0Y1F^LSY0Y1F^LSY0absentnormsubscript𝑌1subscript^𝐹𝑅𝐿𝑆subscript𝑌0normsubscript𝑌1subscript^𝐹𝐿𝑆subscript𝑌0normsubscript𝑌1subscript^𝐹𝐿𝑆subscript𝑌0\displaystyle=\frac{\|Y_{1}-\hat{F}_{RLS}Y_{0}\|-\|Y_{1}-\hat{F}_{LS}Y_{0}\|}{% \|Y_{1}-\hat{F}_{LS}Y_{0}\|}= divide start_ARG ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_R italic_L italic_S end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ - ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ end_ARG start_ARG ∥ italic_Y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - over^ start_ARG italic_F end_ARG start_POSTSUBSCRIPT italic_L italic_S end_POSTSUBSCRIPT italic_Y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ end_ARG

Note that $\hat{F}_{LS}$ minimizes the Frobenius prediction error, so $\epsilon_{FB},\epsilon_{LS}>0$. We plot the histograms of the estimation errors $e_{LS},e_{FB}$ in Fig. 3 and of the prediction errors $\epsilon_{LS},\epsilon_{FB}$ in Fig. 4. We observe that the FB method is competitive with LS even in small-sample cases, whilst guaranteeing stability.
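The relative prediction error defined above can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the toy VAR(1) data, the helper name `relative_prediction_error`, and the use of the usual unrestricted least-squares formula $\hat{F}_{LS}=Y_1Y_0'(Y_0Y_0')^{-1}$ are assumptions made for the example.

```python
import numpy as np

def relative_prediction_error(F, F_ls, Y0, Y1):
    """Relative excess Frobenius prediction error of F over the LS fit F_ls."""
    base = np.linalg.norm(Y1 - F_ls @ Y0, "fro")
    return (np.linalg.norm(Y1 - F @ Y0, "fro") - base) / base

# Hypothetical toy data: a stable VAR(1) with n = 4 and T = 200.
rng = np.random.default_rng(0)
n, T = 4, 200
F_true = 0.5 * np.eye(n)
Y = np.zeros((n, T + 1))
for t in range(T):
    Y[:, t + 1] = F_true @ Y[:, t] + rng.standard_normal(n)
Y0, Y1 = Y[:, :-1], Y[:, 1:]

# Unrestricted least squares (assumed form): F_ls = Y1 Y0' (Y0 Y0')^{-1}.
F_ls = np.linalg.solve(Y0 @ Y0.T, Y0 @ Y1.T).T

# Any other candidate (here a perturbed F_ls) has a larger residual norm.
F_other = F_ls + 0.05 * rng.standard_normal((n, n))
eps_other = relative_prediction_error(F_other, F_ls, Y0, Y1)
eps_self = relative_prediction_error(F_ls, F_ls, Y0, Y1)
print(eps_other, eps_self)
```

Because $\hat{F}_{LS}$ is the Frobenius-norm minimizer, the error of any other estimator relative to it is nonnegative, which is why the histograms of $\epsilon_{LS},\epsilon_{FB}$ lie to the right of zero.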

Figure 3: Estimation errors $e_{LS},e_{FB}$: the red '*' marks and the numbers in the upper right corners are the medians, and the blue '|' marks are the upper and lower quantiles.
Figure 4: Prediction errors $\epsilon_{LS},\epsilon_{FB}$: the red '*' marks and the numbers in the upper right corners are the medians, and the blue '|' marks are the upper and lower quantiles.

6.3 High Dimensional Simulations

Most stability-guaranteed algorithms (again, these are for the full-rank VAR(1)) are computationally intensive and scale badly to high-dimensional data. The FB method, in contrast, has the same computational complexity as the LS method. We demonstrate this feature here. We also compare the estimation and prediction errors.

Figure 5: Computational times $t_{comp}$ on a log scale: $t_{comp}$ for LS and FB are almost identical and $\log(t_{comp})$ is almost linear in $\log n$. The average computational time is about $49.6$s for model order $n=3072$.

We consider $k=1,2,\dots,9$, so that $n=12,\dots,3072$. We take the time series lengths $T=36n$ and $T=100n$ and simulate $N=50$ realizations for each $n$. We expect the computational times $t_{comp}$ for LS and FB to be similar and $\log t_{comp}$ to be linear in $\log n$.

We plot the computational times against the model order $n$ in Fig. 5 on a log scale. We only plot the case of $T=100n$ because the plots for $T=36n$ are very similar. It is easily observed that FB is as efficient as LS. For the largest model order $n=3072$, the average computational time is only about $49.6$s. To the best of our knowledge, there exists no (full-rank) stability-guaranteed algorithm that can handle such a high model order with achievable computational resources, including time and memory.
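The log-log relationship between computational time and model order can be checked with a short timing sketch like the one below. It is only an illustration under stated assumptions: the grid $n=12\cdot 2^{k-1}$ is inferred from the endpoints $n=12$ and $n=3072$ for $k=1,\dots,9$, the data are white noise rather than the paper's simulated VAR(1), only the LS estimate is timed, and the helper names `ls_estimate`, `time_ls` and `loglog_slope` are hypothetical.

```python
import time
import numpy as np

def ls_estimate(Y0, Y1):
    """Unrestricted LS estimate Y1 Y0' (Y0 Y0')^{-1}, via a linear solve."""
    return np.linalg.solve(Y0 @ Y0.T, Y0 @ Y1.T).T

def time_ls(n, T_mult, rng):
    """Wall-clock time of one LS fit with series length T = T_mult * n."""
    Y = rng.standard_normal((n, T_mult * n + 1))  # placeholder data, not a VAR(1)
    start = time.perf_counter()
    ls_estimate(Y[:, :-1], Y[:, 1:])
    return time.perf_counter() - start

def loglog_slope(ns, times):
    """Slope of the least-squares line fitted to log(times) against log(ns)."""
    slope, _ = np.polyfit(np.log(ns), np.log(times), 1)
    return slope

# Assumed order grid: n = 12 * 2^(k-1), k = 1, ..., 9 (matches 12 and 3072).
orders = [12 * 2 ** (k - 1) for k in range(1, 10)]

# Time only the three smallest orders here to keep the sketch cheap.
rng = np.random.default_rng(0)
small = orders[:3]
times = [time_ls(n, 100, rng) for n in small]
print(small, times)
```

With timings collected over the full grid, `loglog_slope(orders, times)` would give the empirical exponent of the power law suggested by the near-linear plot.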

For the estimation accuracy, we compare the estimation errors $e_{LS},e_{FB}$ and the prediction errors $\epsilon_{LS},\epsilon_{FB}$ with $T=36n$ and $T=100n$. The results are plotted in Fig. 6. It is interesting to observe that, first, LS and FB still have very similar errors in the higher-order cases, and second, the error medians converge to constants as $n$ grows when $T$ is a fixed multiple of $n$.

Figure 6: Relative estimation and prediction errors plotted against the model order $n$ on a log scale: the solid lines are the medians and the dotted lines are the standard deviations.

6.4 Simulations Summary

From the various comparisons above, we conclude that our FB estimator guarantees stability, has low computational complexity, and has competitive accuracy for both short and long length time series, and for both low and high model orders.

7 Conclusions

In this paper, we developed, for the first time, a simple, closed-form, stability-guaranteed estimator for the reduced-rank VAR(1). It is based on a forwards-backwards least squares criterion.

We also gave an asymptotic analysis showing that the new estimator is consistent and asymptotically efficient. Finally, we showed simulations demonstrating the competitive accuracy and computational efficiency of the new stable estimator.

In the future, we will develop a rank selection criterion for medium and large sample sizes.

8 Appendix: Proofs

We use the following reduced-rank least squares lemma.

Lemma R. [20, Theorem 2.2]

\begin{align*}
\hat{C} &= \arg\min_{C_{m\times n}:\,\operatorname{rank}(C)=m}\operatorname{tr}\{(Z-CX)'P^{-1}(Z-CX)\} \\
&= P^{\frac{1}{2}}\hat{V}_{m}\hat{V}_{m}'P^{-\frac{1}{2}}ZX'(XX')^{-1},
\end{align*}

where $\hat{V}_{m}=[\hat{v}_{1},\dots,\hat{v}_{m}]$ with $\hat{v}_{r}$ the eigenvector corresponding to the $r$-th largest eigenvalue of the matrix $\hat{R}=P^{-\frac{1}{2}}ZX'(XX')^{-1}XZ'P^{-\frac{1}{2}}$.
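The closed form in Lemma R can be sketched in a few lines of NumPy. This is an illustrative implementation under assumptions, not the reference's code: $Z$ is taken to be $p\times N$, $X$ to be $n\times N$, $P$ to be symmetric positive definite, and the function names `sym_sqrt`, `reduced_rank_ls` and `objective` are hypothetical.

```python
import numpy as np

def sym_sqrt(P):
    """Symmetric square root and inverse square root of a positive definite P."""
    w, U = np.linalg.eigh(P)
    return (U * np.sqrt(w)) @ U.T, (U / np.sqrt(w)) @ U.T

def reduced_rank_ls(Z, X, P, m):
    """Rank-m weighted LS solution P^{1/2} V_m V_m' P^{-1/2} Z X'(XX')^{-1}."""
    P_half, P_mhalf = sym_sqrt(P)
    C_ls = Z @ X.T @ np.linalg.inv(X @ X.T)   # unrestricted LS solution
    R = P_mhalf @ C_ls @ X @ Z.T @ P_mhalf    # R_hat from the lemma
    _, V = np.linalg.eigh((R + R.T) / 2)      # eigenvalues in ascending order
    V_m = V[:, ::-1][:, :m]                   # eigenvectors of the m largest
    return P_half @ V_m @ V_m.T @ P_mhalf @ C_ls

def objective(C, Z, X, P):
    """Weighted criterion tr{(Z - CX)' P^{-1} (Z - CX)}."""
    E = Z - C @ X
    return np.trace(E.T @ np.linalg.solve(P, E))

# Hypothetical toy problem.
rng = np.random.default_rng(0)
p, n, N, m = 4, 5, 50, 2
X = rng.standard_normal((n, N))
Z = rng.standard_normal((p, n)) @ X + rng.standard_normal((p, N))
M = rng.standard_normal((p, p))
P = M @ M.T + p * np.eye(p)

C_hat = reduced_rank_ls(Z, X, P, m)
C_rand = rng.standard_normal((p, m)) @ rng.standard_normal((m, n))
print(objective(C_hat, Z, X, P), objective(C_rand, Z, X, P))
```

Setting `m = p` makes $\hat{V}_m\hat{V}_m'$ the identity, so the sketch then returns the unrestricted least-squares solution.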

Proof of Theorem 4B
We show that $J(F;S_{11})$ has the form specified in Lemma R. First, reorganize $J(F;P)$ to get

\begin{align*}
J(F;P) &= \operatorname{tr}\{P^{-1}(S_{w,f}+S_{w,b})\} \\
&= \operatorname{tr}\{P^{-1}(-4S_{10}F'+FS_{00}F'+FPF'P^{-1}S_{11})\},
\end{align*}

where we dropped a constant term not dependent on $F$. Now, set $P=S_{11}$ to get

\begin{align*}
J(F;S_{11}) &= \operatorname{tr}\{S_{11}^{-1}(-4S_{10}F'+FS_{00}F'+FS_{11}F')\} \\
&= \operatorname{tr}\{S_{11}^{-1}(-4S_{10}F'+F(S_{00}+S_{11})F')\} \\
&= \|S_{11}^{-\frac{1}{2}}(2S_{10}(S_{00}+S_{11})^{-\frac{1}{2}}-F(S_{00}+S_{11})^{\frac{1}{2}})\|^{2},
\end{align*}

where again we dropped a constant term not dependent on $F$. Now apply Lemma R to get the first quoted result.
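Applying Lemma R with $P=S_{11}$, $Z=2S_{10}(S_{00}+S_{11})^{-\frac{1}{2}}$ and $X=(S_{00}+S_{11})^{\frac{1}{2}}$ suggests the following sketch of the resulting estimator. It is an illustration under assumptions rather than the authors' implementation: the sample moments are taken as $S_{00}=Y_0Y_0'/T$, $S_{11}=Y_1Y_1'/T$, $S_{10}=Y_1Y_0'/T$, the simulated rank-2 system is hypothetical, and so is the function name `fb_reduced_rank`.

```python
import numpy as np

def sym_sqrt(P):
    """Symmetric square root and inverse square root of a positive definite P."""
    w, U = np.linalg.eigh(P)
    return (U * np.sqrt(w)) @ U.T, (U / np.sqrt(w)) @ U.T

def fb_reduced_rank(Y0, Y1, m):
    """Rank-m forwards-backwards estimate obtained from Lemma R with P = S11."""
    T = Y0.shape[1]
    S00, S11, S10 = Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y1 @ Y0.T / T
    # Full-rank FB estimate: F11 = 2 S10 (S00 + S11)^{-1}.
    F11 = 2.0 * np.linalg.solve(S00 + S11, S10.T).T
    S_half, S_mhalf = sym_sqrt(S11)
    G = S_mhalf @ F11
    # R_hat = S11^{-1/2} Z Z' S11^{-1/2}, since X is square and invertible here.
    R = G @ (S00 + S11) @ G.T
    _, V = np.linalg.eigh((R + R.T) / 2)
    V_m = V[:, ::-1][:, :m]                   # eigenvectors of the m largest
    return S_half @ V_m @ V_m.T @ G

# Hypothetical rank-2 stable VAR(1) with n = 6 and a short series.
rng = np.random.default_rng(1)
n, m, T = 6, 2, 60
F_true = rng.standard_normal((n, m)) @ rng.standard_normal((m, n))
F_true *= 0.9 / np.max(np.abs(np.linalg.eigvals(F_true)))
Y = np.zeros((n, T + 1))
for t in range(T):
    Y[:, t + 1] = F_true @ Y[:, t] + rng.standard_normal(n)
Y0, Y1 = Y[:, :-1], Y[:, 1:]

F_R = fb_reduced_rank(Y0, Y1, m)
rho = np.max(np.abs(np.linalg.eigvals(F_R)))
print("rank:", np.linalg.matrix_rank(F_R), "spectral radius:", rho)
```

The printed spectral radius is expected to be below one, in line with the stability claim proved next.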

We now prove that $Q_{*}=S_{11}^{-1}-\hat{F}_{R}'S_{11}^{-1}\hat{F}_{R}$ is positive definite. Then Lyapunov's theorem gives that $\hat{F}_{R}$ is stable.

We have

\begin{align*}
Q_{*} &= (S_{11}^{-1}-\hat{F}_{11}'S_{11}^{-1}\hat{F}_{11})+(\hat{F}_{11}'S_{11}^{-1}\hat{F}_{11}-\hat{F}_{R}'S_{11}^{-1}\hat{F}_{R}) \\
&= S_{11}^{-1}U_{b}S_{11}^{-1}+(\hat{F}_{11}'S_{11}^{-1}\hat{F}_{11}-\hat{F}_{R}'S_{11}^{-1}\hat{F}_{R}) \\
U_{b} &= S_{11}-\hat{F}_{b}S_{11}\hat{F}_{b}' \\
\hat{F}_{b} &= S_{11}\hat{F}_{11}'S_{11}^{-1}
\end{align*}

For the second term we have

\begin{align*}
\hat{F}_{11}'S_{11}^{-1}\hat{F}_{11}-\hat{F}_{R}'S_{11}^{-1}\hat{F}_{R}
= \hat{F}_{11}'S_{11}^{-\frac{1}{2}}(I-\hat{V}_{m}\hat{V}_{m}')S_{11}^{-\frac{1}{2}}\hat{F}_{11}\geq 0.
\end{align*}
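The intermediate step behind this identity can be written out as follows. The derivation assumes that the estimator has the Lemma R form $\hat{F}_{R}=S_{11}^{\frac{1}{2}}\hat{V}_{m}\hat{V}_{m}'S_{11}^{-\frac{1}{2}}\hat{F}_{11}$ and that the eigenvectors are orthonormal, $\hat{V}_{m}'\hat{V}_{m}=I_{m}$.

```latex
\begin{align*}
\hat{F}_{R}'S_{11}^{-1}\hat{F}_{R}
&= \hat{F}_{11}'S_{11}^{-\frac{1}{2}}\hat{V}_{m}\hat{V}_{m}'
   S_{11}^{\frac{1}{2}}S_{11}^{-1}S_{11}^{\frac{1}{2}}
   \hat{V}_{m}\hat{V}_{m}'S_{11}^{-\frac{1}{2}}\hat{F}_{11} \\
&= \hat{F}_{11}'S_{11}^{-\frac{1}{2}}\hat{V}_{m}
   (\hat{V}_{m}'\hat{V}_{m})\hat{V}_{m}'S_{11}^{-\frac{1}{2}}\hat{F}_{11} \\
&= \hat{F}_{11}'S_{11}^{-\frac{1}{2}}\hat{V}_{m}\hat{V}_{m}'
   S_{11}^{-\frac{1}{2}}\hat{F}_{11}.
\end{align*}
```

Subtracting this from $\hat{F}_{11}'S_{11}^{-1}\hat{F}_{11}$ leaves the factor $I-\hat{V}_{m}\hat{V}_{m}'$, which is an orthogonal projector and hence positive semidefinite.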

For the first term, we have

\[
\begin{array}{rrcl}
& 2S_{10} & = & \hat{F}_{11}S_{00}+\hat{F}_{11}S_{11} \\
\Rightarrow & 2S_{01} & = & S_{00}\hat{F}_{11}'+S_{11}\hat{F}_{11}' \\
\Rightarrow & 2S_{01} & = & S_{00}S_{11}^{-1}\hat{F}_{b}S_{11}+\hat{F}_{b}S_{11} \\
\Rightarrow & -S_{00}S_{11}^{-1}\hat{F}_{b}S_{11}\hat{F}_{b}' & = & -2S_{01}\hat{F}_{b}'+\hat{F}_{b}S_{11}\hat{F}_{b}' \\
\Rightarrow & -\Phi U_{b} & = & S_{00}-2S_{01}\hat{F}_{b}'+\hat{F}_{b}S_{11}\hat{F}_{b}'.
\end{array}
\]

where $\Phi=-S_{00}S_{11}^{-1}$. Now take the transpose and add the two equations together to get

\begin{align*}
-\Phi U_{b}+U_{b}(-\Phi)' &= 2[S_{00}-S_{01}\hat{F}_{b}'-\hat{F}_{b}S_{10}+\hat{F}_{b}S_{11}\hat{F}_{b}'] \\
&= \frac{2}{T}(Y_{0}-\hat{F}_{b}Y_{1})(Y_{0}-\hat{F}_{b}Y_{1})'=2\hat{S}_{w,b}
\end{align*}

which we show below is positive definite. But note that this is a continuous-time Lyapunov equation where $\Phi$ is a stability matrix: it has the same eigenvalues as $-S_{00}^{\frac{1}{2}}S_{11}^{-1}S_{00}^{\frac{1}{2}}$, which is negative definite and therefore has negative real eigenvalues. Thus $U_{b}$ must be positive definite and the result is established.
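The Lyapunov identity and its consequences can be checked numerically on simulated data. The sketch below is only a sanity check under assumptions: the sample moments are taken as $S_{00}=Y_0Y_0'/T$, $S_{01}=Y_0Y_1'/T$, $S_{11}=Y_1Y_1'/T$, $\hat{F}_{11}=2S_{10}(S_{00}+S_{11})^{-1}$ as implied by the first line of the array above, and the toy system is hypothetical.

```python
import numpy as np

# Hypothetical stable VAR(1) with n = 4 and T = 200.
rng = np.random.default_rng(2)
n, T = 4, 200
F_true = rng.standard_normal((n, n))
F_true *= 0.8 / np.max(np.abs(np.linalg.eigvals(F_true)))
Y = np.zeros((n, T + 1))
for t in range(T):
    Y[:, t + 1] = F_true @ Y[:, t] + rng.standard_normal(n)
Y0, Y1 = Y[:, :-1], Y[:, 1:]

# Sample moments (assumed definitions).
S00, S11, S01 = Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y0 @ Y1.T / T
S10 = S01.T

# Quantities appearing in the proof.
F11 = 2.0 * S10 @ np.linalg.inv(S00 + S11)
Fb = S11 @ F11.T @ np.linalg.inv(S11)
Ub = S11 - Fb @ S11 @ Fb.T
Phi = -S00 @ np.linalg.inv(S11)
E = Y0 - Fb @ Y1
Swb = E @ E.T / T                             # backward residual covariance

# Check -Phi Ub + Ub (-Phi)' = 2 S_wb and the definiteness claims.
lhs = -Phi @ Ub + Ub @ (-Phi).T
lyap_ok = np.allclose(lhs, 2.0 * Swb)
phi_eigs = np.linalg.eigvals(Phi)
ub_min_eig = np.linalg.eigvalsh((Ub + Ub.T) / 2).min()
print(lyap_ok, phi_eigs.real.max(), ub_min_eig)
```

In this check the eigenvalues of $\Phi$ have negative real parts and the smallest eigenvalue of $U_b$ is positive, matching the argument above.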

To see that $\hat{S}_{w,b}$ is positive definite, first note that A1 and Theorem 2 imply that $Q_{b}$ is positive definite. Now the backward least squares estimator $\hat{F}_{b,LS}=S_{01}S_{11}^{-1}$ minimises $S_{w,b}(F)$. Then for any $F$, completion of squares gives

\begin{align*}
S_{w,b}(F)=\hat{Q}_{b,LS}+(\hat{F}_{b,LS}-F)S_{11}(\hat{F}_{b,LS}-F)'
\end{align*}

where $\hat{Q}_{b,LS}=S_{w,b}(\hat{F}_{b,LS})$. Set $F=0$ to get $\hat{Q}_{b,LS}=S_{00}-S_{01}S_{11}^{-1}S_{10}$, which $\xrightarrow{p}Q_{b}$. Now since $Q_{b}$ is full rank and the eigenvalues of $\hat{Q}_{b,LS}$ are continuous functions of the entries of $\hat{Q}_{b,LS}$, they cannot accumulate mass at $0$, i.e. the smallest eigenvalue of $\hat{Q}_{b,LS}$ is $0$ with zero probability. So $\hat{Q}_{b,LS}$ is positive definite w.p.1. But setting $F=\hat{F}_{b}$ above, we see that $\hat{S}_{w,b}\geq\hat{Q}_{b,LS}$ and so is positive definite w.p.1, as required.
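The completion-of-squares identity used in this argument can be verified numerically. The sketch below is a check under assumptions, with $S_{w,b}(F)=\frac{1}{T}(Y_0-FY_1)(Y_0-FY_1)'$ as in the Lyapunov step, hypothetical toy data, and a hypothetical helper name `S_wb`.

```python
import numpy as np

# Hypothetical stable VAR(1) with n = 4 and T = 200.
rng = np.random.default_rng(3)
n, T = 4, 200
F_true = 0.5 * np.eye(n)
Y = np.zeros((n, T + 1))
for t in range(T):
    Y[:, t + 1] = F_true @ Y[:, t] + rng.standard_normal(n)
Y0, Y1 = Y[:, :-1], Y[:, 1:]

S00, S11, S01 = Y0 @ Y0.T / T, Y1 @ Y1.T / T, Y0 @ Y1.T / T
S10 = S01.T

def S_wb(F):
    """Backward residual covariance (1/T)(Y0 - F Y1)(Y0 - F Y1)'."""
    E = Y0 - F @ Y1
    return E @ E.T / T

Fb_ls = S01 @ np.linalg.inv(S11)              # backward LS estimator
Q_hat = S00 - S01 @ np.linalg.inv(S11) @ S10  # its residual covariance

# The identity holds for an arbitrary F, drawn at random here.
F = rng.standard_normal((n, n))
rhs = Q_hat + (Fb_ls - F) @ S11 @ (Fb_ls - F).T
identity_ok = np.allclose(S_wb(F), rhs)
gap = S_wb(F) - Q_hat
gap_min_eig = np.linalg.eigvalsh((gap + gap.T) / 2).min()
print(identity_ok, gap_min_eig)
```

Since the second term on the right is positive semidefinite for every $F$, the gap $S_{w,b}(F)-\hat{Q}_{b,LS}$ has no negative eigenvalues beyond rounding error, which is the ordering used in the last step.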

Proof of Theorem 5E(i).

We have

$$
\begin{aligned}
\sqrt{T}(\hat{F}_{RLS}-\hat{F}_{R})
&= \sqrt{T}\,S_{11}^{\frac{1}{2}}\left(\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}S_{11}^{-\frac{1}{2}}\hat{F}_{LS}-\hat{V}_{m}\hat{V}_{m}^{\prime}S_{11}^{-\frac{1}{2}}\hat{F}_{11}\right)\\
&= S_{11}^{\frac{1}{2}}\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}S_{11}^{-\frac{1}{2}}(\hat{F}_{LS}-\hat{F}_{11})\sqrt{T}\\
&\quad + S_{11}^{\frac{1}{2}}\sqrt{T}\left(\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}-\hat{V}_{m}\hat{V}_{m}^{\prime}\right)S_{11}^{-\frac{1}{2}}\hat{F}_{11}
\end{aligned}
$$

From Theorem A, Lemma 5D, and the fact that $\|\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}\|^{2}=m$, we get the first term $\xrightarrow{p}0$. For the second term, using Theorem 5A and Theorem 3B, we are reduced to showing $\sqrt{T}(\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}-\hat{V}_{m}\hat{V}_{m}^{\prime})\xrightarrow{p}0$. Note that as in the proof of Theorem 5A we deduce that $\hat{V}_{m}\xrightarrow{p}V_{m}$.

Next set $\bar{S}=\frac{1}{2}(S_{00}+S_{11})$ and consider that

$$
\begin{aligned}
\sqrt{T}(\hat{R}-\hat{R}_{*})
&= \sqrt{T}\,S_{11}^{-\frac{1}{2}}S_{10}\left[\bar{S}^{-1}-S_{00}^{-1}\right]S_{10}^{\prime}S_{11}^{-\frac{1}{2}}\\
&= \sqrt{T}\,S_{11}^{-\frac{1}{2}}S_{10}S_{00}^{-1}(S_{00}-\bar{S})\bar{S}^{-1}S_{10}^{\prime}S_{11}^{-\frac{1}{2}}\\
&= \sqrt{T}\,S_{11}^{-\frac{1}{2}}S_{10}S_{00}^{-1}\tfrac{1}{2}(S_{00}-S_{11})\bar{S}^{-1}S_{10}^{\prime}S_{11}^{-\frac{1}{2}}
\end{aligned}
$$

and we find $\sqrt{T}(\hat{R}-\hat{R}_{*})\xrightarrow{p}0$ in view of Theorem 5A and Lemma 5C.
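The two algebraic steps used above, $\bar{S}^{-1}-S_{00}^{-1}=S_{00}^{-1}(S_{00}-\bar{S})\bar{S}^{-1}$ and $S_{00}-\bar{S}=\frac{1}{2}(S_{00}-S_{11})$, can be checked numerically. The following is a minimal sketch, assuming random symmetric positive definite matrices as hypothetical stand-ins for the sample moment matrices $S_{00}$ and $S_{11}$; the helper `random_spd` is illustrative and not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4


def random_spd(rng, n):
    """Return a random symmetric positive definite n x n matrix."""
    a = rng.standard_normal((n, n))
    return a @ a.T + n * np.eye(n)


# Hypothetical stand-ins for the sample moment matrices S00 and S11.
S00 = random_spd(rng, n)
S11 = random_spd(rng, n)
S_bar = 0.5 * (S00 + S11)

inv = np.linalg.inv
# Bracketed difference of inverses from the first line of the display.
lhs = inv(S_bar) - inv(S00)
# Second line: factor the difference through S00^{-1} and S_bar^{-1}.
mid = inv(S00) @ (S00 - S_bar) @ inv(S_bar)
# Third line: substitute S00 - S_bar = (S00 - S11) / 2.
rhs = inv(S00) @ (0.5 * (S00 - S11)) @ inv(S_bar)

print(np.allclose(lhs, mid), np.allclose(mid, rhs))
```

Both comparisons should agree up to floating-point error, since the identities hold for any invertible $S_{00}$ and $\bar{S}$.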

Consider the ‘k-th’ eigenvector $\hat{v}_{k}$ and eigenvalue $\hat{\lambda}_{k}$ of $\hat{R}$ as well as the corresponding $\hat{v}_{*,k}$ and $\hat{\lambda}_{*,k}$ of $\hat{R}_{*}$.

Return now to $\sqrt{T}(\hat{V}_{*,m}\hat{V}_{*,m}^{\prime}-\hat{V}_{m}\hat{V}_{m}^{\prime})\xrightarrow{p}0$. We rewrite this as $\sum_{k=1}^{m}\sqrt{T}(\hat{v}_{k}\hat{v}_{k}^{\prime}-\hat{v}_{*,k}\hat{v}_{*,k}^{\prime})\xrightarrow{p}0$, which holds if $\sqrt{T}(\hat{v}_{k}\hat{v}_{k}^{\prime}-\hat{v}_{*,k}\hat{v}_{*,k}^{\prime})\xrightarrow{p}0$ for each $k$. This has Frobenius norm $\sqrt{T}\sqrt{2[1-(\hat{v}_{k}^{\prime}\hat{v}_{*,k})^{2}]}$. If we set $\cos(\theta_{k})=\hat{v}_{k}^{\prime}\hat{v}_{*,k}$ then the norm is $\sqrt{2T}|\sin(\theta_{k})|$.
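The Frobenius norm formula follows because, for unit vectors $v$ and $w$, $\operatorname{tr}(vv^{\prime}ww^{\prime})=(v^{\prime}w)^{2}$, so $\|vv^{\prime}-ww^{\prime}\|^{2}=1-2(v^{\prime}w)^{2}+1$. A minimal numerical sketch of this identity, assuming two random unit vectors as hypothetical stand-ins for $\hat{v}_{k}$ and $\hat{v}_{*,k}$ (the factor $\sqrt{T}$ is omitted since it only rescales both sides):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5


def unit(x):
    """Scale a vector to unit Euclidean length."""
    return x / np.linalg.norm(x)


# Hypothetical stand-ins for the eigenvectors v_k and v_{*,k}.
v = unit(rng.standard_normal(p))
w = unit(rng.standard_normal(p))

# Frobenius norm of the difference of the two rank-one projections.
frob = np.linalg.norm(np.outer(v, v) - np.outer(w, w), ord="fro")

# Closed forms from the text, with cos(theta) = v'w.
cos_theta = float(np.clip(v @ w, -1.0, 1.0))
closed_form = np.sqrt(2.0 * (1.0 - cos_theta**2))
sin_form = np.sqrt(2.0) * abs(np.sin(np.arccos(cos_theta)))

print(np.isclose(frob, closed_form), np.isclose(frob, sin_form))
```

The result does not depend on the sign chosen for either eigenvector, because only $(v^{\prime}w)^{2}$ enters.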

This allows us to apply a special case of a classic result of [7], which is given in [29, the equation below their equation (1)], namely

$$
\begin{aligned}
|\sin(\theta_{k})| &\leq \frac{1}{\hat{\delta}_{k}}\|\hat{R}-\hat{R}_{*}\|\\
\hat{\delta}_{k} &= \min\left(|\hat{\lambda}_{k-1}-\hat{\lambda}_{*,k}|,\;|\hat{\lambda}_{k+1}-\hat{\lambda}_{*,k}|\right)
\end{aligned}
$$

Since (Theorem 5A) $\hat{\lambda}_{l}\xrightarrow{p}\lambda_{l}$ and $\hat{\lambda}_{*,l}\xrightarrow{p}\lambda_{l}$ for all $l$, then $\hat{\delta}_{k}\xrightarrow{p}\delta_{k}=\min(|\lambda_{k-1}-\lambda_{k}|,|\lambda_{k+1}-\lambda_{k}|)$.
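The eigenvector perturbation bound can be illustrated numerically. The sketch below is not the paper's estimator: it assumes a hypothetical symmetric matrix `R_star` with well-separated eigenvalues 5, 4, 3, 2, 1 and a small symmetric perturbation of Frobenius norm 0.05, standing in for $\hat{R}_{*}$ and $\hat{R}$. The index `k` is 0-based into eigenvalues sorted in descending order and is chosen in the interior so that both neighbours exist.

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 5, 2  # k is a 0-based index; eigenvalues are sorted in descending order


def sorted_eigh(a):
    """Eigen-decomposition of a symmetric matrix, eigenvalues descending."""
    vals, vecs = np.linalg.eigh(a)  # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


# Hypothetical stand-in for R_*: symmetric, eigenvalues 5, 4, 3, 2, 1.
q, _ = np.linalg.qr(rng.standard_normal((p, p)))
R_star = q @ np.diag([5.0, 4.0, 3.0, 2.0, 1.0]) @ q.T

# Hypothetical stand-in for R: R_* plus a small symmetric perturbation.
e = rng.standard_normal((p, p))
e = 0.5 * (e + e.T)
e *= 0.05 / np.linalg.norm(e, ord="fro")
R = R_star + e

lam, V = sorted_eigh(R)
lam_s, V_s = sorted_eigh(R_star)

# |sin(theta_k)| from the inner product of the k-th eigenvectors.
sin_theta = np.sqrt(max(0.0, 1.0 - float(V[:, k] @ V_s[:, k]) ** 2))

# Eigengap: neighbours of R against the k-th eigenvalue of R_*.
delta_k = min(abs(lam[k - 1] - lam_s[k]), abs(lam[k + 1] - lam_s[k]))
bound = np.linalg.norm(R - R_star, ord="fro") / delta_k

print(sin_theta <= bound)
```

The Frobenius norm is used here to match the trace expression in the proof; it is at least as large as the operator norm, so the bound is only loosened by this choice.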

However, consider that

$$
\sqrt{T}\|\hat{R}-\hat{R}_{*}\|=\sqrt{\operatorname{tr}\left[\sqrt{T}(\hat{R}-\hat{R}_{*})\sqrt{T}(\hat{R}-\hat{R}_{*})\right]}
$$

but each term $\xrightarrow{p}0$ and so $\sqrt{T}|\sin(\theta_{k})|\xrightarrow{p}0$ and the proof is complete.

References

  • [1] D. Bertsekas. Reinforcement Learning and Optimal Control. Athena Scientific, 2019.
  • [2] B. Boots, G. J. Gordon, and S. Siddiqi. A constraint generation approach to learning stable linear dynamical systems. Advances in Neural Information Processing Systems, 20, 2007.
  • [3] J. P. Burg. Maximum Entropy Spectral Analysis. Stanford University, 1975.
  • [4] EY. Chen, RS. Tsay, and R. Chen. Constrained factor models for high-dimensional matrix-variate time series. Jl. Am. Stat. Assocn., 115:775–793, 2020.
  • [5] R. Chen, H. Xiao, and D. Yang. Autoregressive models for matrix-valued time series. Jl. Econometrics, 222:539–560, 2021.
  • [6] N. L. C. Chui and J. M. Maciejowski. Realization of stable models with subspace methods. Automatica, 32(11):1587–1595, 1996.
  • [7] C. Davis and WM. Kahan. The rotation of eigenvectors by a perturbation. SIAM Jl. on Numerical Analysis, 7:1–46, 1970.
  • [8] JD. Hamilton. Time Series Analysis. Princeton Univ. Press, Princeton, NJ, 1994.
  • [9] RA. Horn and CR. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, UK, 2013.
  • [10] W. Jongeneel, T. Sutter, and D. Kuhn. Efficient learning of a linear dynamical system with stability guarantees. IEEE Trans. on Autom. Contr., 68(5):2790–2804, 2023.
  • [11] T. Kailath. Linear Estimation. Prentice Hall, Upper Saddle River, New Jersey, 2000.
  • [12] H. Lutkepohl. Multivariate Time Series Analysis. Springer, New York, 2005.
  • [13] JR. Magnus and H. Neudecker. Matrix Differential Calculus. J. Wiley, New York, 1999.
  • [14] G. Mallet, G. Gasso, and S. Canu. New methods for the identification of a stable subspace model for dynamical systems. In IEEE Workshop on Machine Learning for Signal Processing, pages 432–437, 2008.
  • [15] J. Mari, P. Stoica, and T. McKelvey. Vector ARMA estimation: a reliable subspace approach. IEEE Trans. Sig. Proc., 48(7):2092–2104, 2000.
  • [16] D. N. Miller and R. A. de Callafon. Subspace identification with eigenvalue constraints. Automatica, 49(8):2468–2473, 2013.
  • [17] N. Matni, A. Proutiere, A. Rantzer, and S. Tu. From self-tuning regulators to reinforcement learning and back again. In Proc. IEEE CDC, pages 3724–3740, 2019.
  • [18] A. H. Nuttall. Multivariate linear predictive spectral analysis employing weighted forward and backward averaging: A generalization of Burg’s algorithm. Report, Naval Underwater Systems Center, New London, 1976.
  • [19] B. Recht. A tour of reinforcement learning: The view from continuous control. Annual Review of Control, Robotics, and Autonomous Systems, 2:253–279, 2019.
  • [20] GC. Reinsel and RP. Velu. Multivariate Reduced-Rank Regression: Theory and Applications, volume 136. Springer, 1998.
  • [21] X. Rong and V. Solo. State space subspace noise modeling with guaranteed stability. In Proc. IEEE CDC, page to appear, 2023.
  • [22] O. Strand. Multichannel complex maximum entropy (autoregressive) spectral analysis. IEEE Trans. Autom. Contr., 22(4):634–640, 1977.
  • [23] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2018.
  • [24] H. Tanaka and T. Katayama. Stochastic subspace identification guaranteeing stability and minimum phase. IFAC Proceedings Volumes, 38(1):910–915, 2005.
  • [25] J. Umenberger, J. Wågberg, I. R. Manchester, and T. B. Schön. Maximum likelihood identification of stable linear dynamical systems. Automatica, 96:280–292, 2018.
  • [26] T. Van Gestel, J.A.K. Suykens, P. Van Dooren, and B. De Moor. Identification of stable models in subspace identification by using regularization. IEEE Trans. Autom. Contr., 46:1416–1420, 2001.
  • [27] P. Van Overschee and B. De Moor. Subspace Identification for Linear Systems: Theory, Implementation, Applications. Kluwer, Boston, 1996.
  • [28] C. Weikert and M. B. Schulze. Evaluating dietary patterns: the role of reduced rank regression. Current Opinion in Clinical Nutrition and Metabolic Care, 19(5):341–346, 2016.
  • [29] Y. Yu, T. Wang, and RJ. Samworth. A useful variant of the Davis–Kahan theorem for statisticians. Biometrika, 102:315–323, 2015.