Statistically Adaptive Spatial Multiplexing For Time-Varying Correlated MIMO Channels
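$$H_c(t) = \sum_{l=1}^{L} \beta_l\, a_R(\theta_{R,l})\, a_T^H(\theta_{T,l})\, e^{j2\pi\nu_l t} \qquad (2)$$
where $L$ is the number of paths. For the $l$-th path, the array steering and response vectors are given by
$$a_T(\theta_{T,l}) = \left[1, e^{-j2\pi\theta_{T,l}}, \ldots, e^{-j2\pi(N_T-1)\theta_{T,l}}\right]^T, \quad a_R(\theta_{R,l}) = \left[1, e^{-j2\pi\theta_{R,l}}, \ldots, e^{-j2\pi(N_R-1)\theta_{R,l}}\right]^T \qquad (3)$$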
where the parameter $\theta$ is related to the physical path angle $\phi$ $^{1}$ as $\theta = d\sin\phi/\lambda$, with $\lambda$ and $d$ denoting the wavelength and the antenna spacing, respectively. Without loss of generality (WLOG), we focus on the critical spacing $d = \lambda/2$ in this paper and, hence, $\theta = \sin\phi/2 \in [-0.5, 0.5]$. The impact of antenna spacing on capacity and diversity performance has been investigated in [15]. The complex path amplitude $\beta_l = \alpha_l e^{j\psi_l}$ has envelope $\alpha_l > 0$ and phase $\psi_l$ uniformly distributed in $[0, 2\pi]$. WLOG, the transmitter and receiver are assumed to move in the array-broadside directions at speeds of $v_T$ and $v_R$. The resultant Doppler frequency shift of the $l$-th path is given by $\nu_l = (v_R\cos\phi_{R,l} + v_T\cos\phi_{T,l})/\lambda$, with the maximum shift $f_{\max} = (v_R + v_T)/\lambda$. Moreover, $\{\alpha_l\}$, $\{\theta_{R,l}\}$, $\{\theta_{T,l}\}$, and $\{\nu_l\}$ are fixed for a given scattering environment, while $\{\psi_l\}$ randomly change for different channel realizations [13, 14].
In contrast to (1), signaling can be realized in the virtual angle domain (beamspace) instead of in the spatial domain. The signal relation in the virtual angle domain can be written as
$$y(t) = H(t)x(t) + n(t) \qquad (4)$$
where the virtual channel matrix $H(t) = A_R^H H_c(t) A_T$ is the 2D Discrete Fourier Transform (DFT) of $H_c(t)$, and the transformed vectors $y(t) = A_R^H y_c(t)$, $x(t) = A_T^H x_c(t)$, $n(t) = A_R^H n_c(t)$ represent the received, transmitted, and noise vectors in beamspace. The $N_R \times N_R$ and $N_T \times N_T$ unitary DFT matrices are given by $A_R = [a_R(\tilde\theta_{R,1}), \ldots, a_R(\tilde\theta_{R,N_R})]/\sqrt{N_R}$ and $A_T = [a_T(\tilde\theta_{T,1}), \ldots, a_T(\tilde\theta_{T,N_T})]/\sqrt{N_T}$ with the fixed receive and transmit virtual angles defined as
$$\tilde\theta_{R,q} = \frac{q - \tilde N_R}{N_R},\ q = 1, \ldots, N_R, \qquad \tilde\theta_{T,p} = \frac{p - \tilde N_T}{N_T},\ p = 1, \ldots, N_T \qquad (5)$$
where, WLOG, we assume the number of antennas is odd and define $\tilde N_R = (N_R - 1)/2 + 1$, $\tilde N_T = (N_T - 1)/2 + 1$. The signaling in the virtual angle domain is illustrated in Fig. 1. Through the DFT, the $p$-th element of $x(t)$ is transmitted from the $p$-th transmit beam directed at $\tilde\theta_{T,p}$, while the $q$-th element of $y(t)$ denotes the signal captured by the $q$-th receive beam at $\tilde\theta_{R,q}$. As implied by (4), the virtual channel coefficient $H(q, p, t)$ in $H(t)$ represents the channel coupling between the $p$-th transmitted element $x_p(t)$ and the $q$-th received element $y_q(t)$ in beamspace.

$^{1}$ Measured w.r.t. the array broadside.
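The virtual representation in (2)-(5) can be checked numerically. The following is a minimal sketch, assuming NumPy, the negative-exponent steering convention, and randomly drawn toy path parameters; the function names `steering`, `virtual_angles`, `dft_matrix`, and `physical_channel` are illustrative and not from the paper.

```python
import numpy as np

def steering(theta, n):
    """a(theta) = [1, e^{-j2*pi*theta}, ..., e^{-j2*pi*(n-1)*theta}]^T (sign convention assumed)."""
    return np.exp(-2j * np.pi * theta * np.arange(n))

def virtual_angles(n):
    """Fixed virtual angles (q - n_tilde)/n for q = 1..n, with n odd and n_tilde = (n-1)/2 + 1."""
    n_tilde = (n - 1) // 2 + 1
    return (np.arange(1, n + 1) - n_tilde) / n

def dft_matrix(n):
    """Unitary matrix whose columns are steering vectors at the virtual angles."""
    return np.stack([steering(t, n) for t in virtual_angles(n)], axis=1) / np.sqrt(n)

def physical_channel(beta, theta_r, theta_t, nu, t, n_r, n_t):
    """H_c(t) = sum_l beta_l a_R(theta_R,l) a_T^H(theta_T,l) e^{j2*pi*nu_l*t}."""
    h = np.zeros((n_r, n_t), dtype=complex)
    for b, tr, tt, v in zip(beta, theta_r, theta_t, nu):
        h += b * np.exp(2j * np.pi * v * t) * np.outer(steering(tr, n_r), steering(tt, n_t).conj())
    return h

# Toy scattering environment (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_r = n_t = 3
num_paths = 20
beta = (rng.normal(size=num_paths) + 1j * rng.normal(size=num_paths)) / np.sqrt(2 * num_paths)
theta_r = rng.uniform(-0.5, 0.5, num_paths)
theta_t = rng.uniform(-0.5, 0.5, num_paths)
nu = rng.uniform(0.0, 33.3, num_paths)

a_r, a_t = dft_matrix(n_r), dft_matrix(n_t)
h_c = physical_channel(beta, theta_r, theta_t, nu, 0.0, n_r, n_t)
h_v = a_r.conj().T @ h_c @ a_t   # virtual channel matrix H = A_R^H H_c A_T
```

Because $A_R$ and $A_T$ are unitary, the transform only re-expresses the channel in beam coordinates: the Frobenius norm of `h_v` equals that of `h_c`.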
One important property of virtual channel coefficients is their approximately uncorrelated nature. This can be interpreted via virtual path partitioning, which introduces the following subsets of paths
$$S_{R,q} = \{l : -1/(2N_R) \le (\theta_{R,l} - \tilde\theta_{R,q}) < 1/(2N_R)\},\ q = 1, \ldots, N_R$$
$$S_{T,p} = \{l : -1/(2N_T) \le (\theta_{T,l} - \tilde\theta_{T,p}) < 1/(2N_T)\},\ p = 1, \ldots, N_T \qquad (6)$$
corresponding to the spatial resolutions $\Delta\theta_R = 1/N_R$ and $\Delta\theta_T = 1/N_T$. The partitioning in the case of $N_R = N_T = 3$ is illustrated in Fig. 2, where each dot represents a path angular position $(\theta_{R,l}, \theta_{T,l})$ in the 2D domain $[-0.5, 0.5] \times [-0.5, 0.5]$. The 3 receive beams partition the paths into 3 rows $\{S_{R,q}\}_{q=1}^{3}$ with height 1/3, and the 3 transmit beams partition the paths into 3 columns $\{S_{T,p}\}_{p=1}^{3}$ with width 1/3.
Based on the above partitioning, each virtual channel coefficient in (4) can be approximated as
$$H(q, p, t) = \frac{1}{\sqrt{N_R N_T}} \sum_{l=1}^{L} \beta_l\, a_R^H(\tilde\theta_{R,q})\, a_R(\theta_{R,l})\, a_T^H(\theta_{T,l})\, a_T(\tilde\theta_{T,p})\, e^{j2\pi\nu_l t} \approx \sqrt{N_R N_T} \sum_{l \in S_{q,p}} \beta_l\, e^{j2\pi\nu_l t} \qquad (7)$$
where $S_{q,p} = S_{R,q} \cap S_{T,p}$ represents the paths jointly captured by the $p$-th transmit beam and the $q$-th receive beam, as illustrated in Fig. 2. The mathematical reasoning behind the approximation can be found in [15]. It indicates that $\{H(q, p, t)\}$ for different $(q, p)$ are contributed by disjoint subsets of paths and, hence, are uncorrelated due to the independent path amplitudes. Based on this uncorrelated nature, it has been shown that the capacity-achieving input covariance matrix in the virtual domain, $E[x(t)x^H(t)]$, has a diagonal structure [20, 21]. This motivates the consideration of transmitting independent data streams via different virtual angles in the proposed scheme.
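The path partitioning in (6) and the right-hand side of (7) can be sketched as follows. This is an illustrative sketch only: `bin_index`, `partition_paths`, and `approx_virtual_coeff` are hypothetical helper names, bins are 1-based as in the text, and an angle of exactly 0.5 is clipped into the last bin as an assumption.

```python
import numpy as np
from collections import defaultdict

def bin_index(theta, n):
    """1-based q with -1/(2n) <= theta - (q - n_tilde)/n < 1/(2n), n odd, n_tilde = (n+1)/2."""
    q = int(np.floor(n * theta + n / 2)) + 1
    return min(max(q, 1), n)   # clip the edge case theta = 0.5 (assumption)

def partition_paths(theta_r, theta_t, n_r, n_t):
    """Map each (q, p) to the list of path indices in S_{q,p} = S_{R,q} intersect S_{T,p}."""
    bins = defaultdict(list)
    for l, (tr, tt) in enumerate(zip(theta_r, theta_t)):
        bins[(bin_index(tr, n_r), bin_index(tt, n_t))].append(l)
    return dict(bins)

def approx_virtual_coeff(bins, beta, nu, t, q, p, n_r, n_t):
    """Approximation in (7): sqrt(N_R N_T) * sum over l in S_{q,p} of beta_l e^{j2*pi*nu_l*t}."""
    idx = bins.get((q, p), [])
    return np.sqrt(n_r * n_t) * sum(beta[l] * np.exp(2j * np.pi * nu[l] * t) for l in idx)

# Four toy paths for N_R = N_T = 3 (values are illustrative).
theta_r = [-0.4, 0.0, 0.2, 0.05]
theta_t = [0.3, -0.1, -0.45, 0.1]
bins = partition_paths(theta_r, theta_t, 3, 3)
```

Each path falls into exactly one bin, which is why coefficients for different $(q, p)$ are built from disjoint path subsets.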
Furthermore, the temporal correlation of $H(q, p, t)$ can be derived from (7) as
$$r_{q,p}(t - t') \triangleq E[H(q, p, t)H^*(q, p, t')] = \int G_{q,p}(\nu)\, e^{j2\pi\nu(t - t')}\, d\nu \qquad (8)$$
where the path Doppler power spectrum conditioned on $(q, p)$ can be approximated as
$$G_{q,p}(\nu) = \frac{1}{N_R N_T} \sum_{l=1}^{L} \alpha_l^2 \left| a_R^H(\tilde\theta_{R,q})\, a_R(\theta_{R,l})\, a_T^H(\theta_{T,l})\, a_T(\tilde\theta_{T,p}) \right|^2 \delta(\nu - \nu_l) \approx N_R N_T \sum_{l \in S_{q,p}} \alpha_l^2\, \delta(\nu - \nu_l). \qquad (9)$$
The above approximation implies that the conditional Doppler power spectrum is contributed by the paths in $S_{q,p}$ and hence has a smaller spread than that in the SISO channel, which is contributed by all paths. The smaller spread yields a slower decay of the corresponding temporal correlation and hence slows down the effective channel variation. This explains why MIMO systems can reduce the training update frequency, as reflected by the numerical results.
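As an illustration of (8) and the exact (left-hand) form of (9), the sketch below evaluates $r_{q,p}(\tau)$ for a randomly drawn set of paths. It assumes NumPy, equal speeds $v_R = v_T$, equal path strengths, and the negative-exponent steering convention; the helper names and all numerical values are illustrative.

```python
import numpy as np

def virtual_angles(n):
    """Virtual angles (q - n_tilde)/n for q = 1..n, assuming odd n."""
    return (np.arange(1, n + 1) - ((n - 1) // 2 + 1)) / n

def steering_matrix(thetas, n):
    """Columns are a(theta) = [1, e^{-j2*pi*theta}, ..., e^{-j2*pi*(n-1)*theta}]^T."""
    return np.exp(-2j * np.pi * np.outer(np.arange(n), thetas))

def temporal_correlation(alpha2, theta_r, theta_t, nu, tau, n_r, n_t):
    """r[q, p] = (1/(N_R N_T)) sum_l alpha_l^2 |a_R^H a_R|^2 |a_T^H a_T|^2 e^{j2*pi*nu_l*tau}."""
    g_r = np.abs(steering_matrix(virtual_angles(n_r), n_r).conj().T
                 @ steering_matrix(theta_r, n_r)) ** 2       # shape (N_R, L)
    g_t = np.abs(steering_matrix(virtual_angles(n_t), n_t).conj().T
                 @ steering_matrix(theta_t, n_t)) ** 2       # shape (N_T, L)
    weights = alpha2 * np.exp(2j * np.pi * nu * tau)          # shape (L,)
    return (g_r * weights) @ g_t.T / (n_r * n_t)

rng = np.random.default_rng(1)
n_r = n_t = 5
num_paths = 100
theta_r = rng.uniform(-0.3, 0.3, num_paths)
theta_t = rng.uniform(-0.3, 0.3, num_paths)
alpha2 = np.full(num_paths, 1.0 / num_paths)
f_max = 33.3
# nu_l = (f_max / 2)(cos(phi_R,l) + cos(phi_T,l)), using sin(phi) = 2*theta and v_R = v_T.
nu = 0.5 * f_max * (np.sqrt(1 - 4 * theta_r**2) + np.sqrt(1 - 4 * theta_t**2))

r0 = temporal_correlation(alpha2, theta_r, theta_t, nu, 0.0, n_r, n_t)
r_tau = temporal_correlation(alpha2, theta_r, theta_t, nu, 5e-3, n_r, n_t)
```

Summing `r0` over all $(q, p)$ recovers the total channel power $N_R N_T \sum_l \alpha_l^2$, and the ratio `abs(r_tau) / r0` shows how quickly each beam pair decorrelates.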
III. System Design Overview
The system continuously transmits and receives space-time packets with identical structure. WLOG, the structure of the first packet is shown in Fig. 3. It has a training phase with $N_{tr}$ symbol periods followed by a data transmission phase with $N_D$ symbol periods. The fixed symbol period is denoted as $T_s$, and time 0 represents the end of the training phase. The packet is transmitted and received in the virtual angle domain. WLOG, the $N_T$ transmit virtual angles are sorted in descending order according to their statistical strengths $^{2}$. In the $p$-th training symbol period, a training symbol is transmitted from the beam at the $p$-th transmit virtual angle with full power $\rho$. This implies $N_{tr} \le N_T$. Accordingly, the transmitted training signal matrix is $D = \sqrt{\rho}\, I_{N_{tr}}$ with $I_{N_{tr}}$ as the identity matrix. In the $n$-th data symbol period, a $k(n) \times 1$ signal vector $x(n)$ is launched from the first $k(n)$ transmit virtual beams.

$^{2}$ That is, $E\|H(:, 1, t)\|_F^2 \ge \cdots \ge E\|H(:, N_T, t)\|_F^2$, where $H(:, p, t)$ represents the $p$-th column of $H(t)$ and $\|\cdot\|_F$ denotes the Frobenius norm. It can be shown that the order is independent of $t$ for a given scattering environment.
The system block diagram in the $n$-th data symbol period is illustrated in Fig. 4. At the transmitter side, the transmitted signal vector in the virtual angle domain is formed as $x(n) = \Lambda^{1/2}(n)s(n)$, where $\Lambda(n)$ represents a diagonal power-shaping matrix. The $k(n) \times 1$ data vector $s(n)$ consists of independent QAM symbols with $E[s(n)s^H(n)] = I_{k(n)}$, and $R_p(n)$ represents the number of bits in the $p$-th symbol. Accordingly, the covariance matrix of $x(n)$ has a diagonal structure, which is capacity-achieving [20, 21]. The vector $x(n)$ is further zero-padded and DFT-transformed to the spatial domain for antenna transmission: $x_c(n) = A_T [x^T(n) \;\vdots\; 0^T]^T$, where $0$ is an $(N_T - k(n)) \times 1$ all-zero vector. Due to the DFT, the $k(n)$ data symbols in $x(n)$ are respectively launched from the first $k(n)$ transmit virtual beams. At the receiver side, the signal vector on the $N_R$ receive antennas is transformed to beamspace through $y(n) = A_R^H y_c(n)$, where $y(n)$ consists of the signals captured by the $N_R$ receive virtual beams. The vector $y(n)$ is then fed to a linear decoder to estimate the data vector: $\hat s(n) = G(n)y(n)$. The decoder is formed based on the predicted virtual channel matrix $\hat H(n)$, which is generated by feeding the $N_R \times N_{tr}$ signal matrix $Z$ received in the training phase to a linear channel predictor $L(n)$.
The general design objective is to maximize the average rate per packet $\sum_{n=1}^{N_D} \sum_{p=1}^{k(n)} R_p(n) / (N_{tr} + N_D)$ subject to the transmit power constraint and the BER requirement for each data stream. The optimization is over $L(n)$, $G(n)$, $\Lambda(n)$, $\{R_p(n)\}$, $k(n)$, and $N_D$ with channel statistics known at both sides. The procedure is briefly described below.
At the receiver side:
Step 1 In each data symbol period, $L(n)$ is formed by minimizing the Mean Square Error (MSE) between the true channel state and its prediction. The MSE is averaged over channel statistics. Therefore, $L(n)$ is determined by channel statistics and hence is the same for different packets;
Step 2 The decoder $G(n)$ is formed by minimizing the MSE between $\hat s(n)$ and $s(n)$, which is averaged over $s(n)$ and the noise. As shown later, $G(n)$ is a function of $\hat H(n)$ and $\Lambda(n)$. Therefore, it varies with different realizations of $\hat H(n)$ in different packets.
At the transmitter side:
Step 1 For a given number of data streams $k(n)$, $\Lambda(n)$ is optimized by minimizing the MSE between $\hat s(n)$ and $s(n)$, which is averaged over channel statistics, since the transmitter only knows channel statistics. Next, the rate of the $p$-th stream $R_p(n)$ is chosen as the maximum number of bits keeping the corresponding average BER under a target BER. Both $\Lambda(n)$ and $\{R_p(n)\}$ are optimized over channel statistics and hence do not change across packets;
Step 2 $k(n)$ is further optimized by maximizing the total rate in the $n$-th data symbol period, $\sum_{p=1}^{k(n)} R_p(n)$, and the corresponding $\Lambda(n)$ and $\{R_p(n)\}_{p=1}^{k(n)}$ are selected as the final designs;
Step 3 The data block length $N_D$ is optimized by maximizing the average rate per packet defined earlier with the optimum $k(n)$ and the corresponding $\{R_p(n)\}_{p=1}^{k(n)}$ in the expression. For simplicity, the length of the training phase $N_{tr}$ can first be fixed as $N_T$, though the optimum value can be smaller, as described later.
The proposed statistics-based design has the following two features. (1) It has low complexity compared with the CSIT-based design [3]-[7]: instantaneous channel feedback is not required, and the same optimized packet structure is applied over the period when the channel statistics are static. (2) The temporal and spatial channel correlations are exploited by optimizing the packet length and the number of data streams, which significantly improves the average rate, as shown in the results.
IV. Optimization of System Components
WLOG, we focus on the first packet and list the major design assumptions as follows.
A1) For simplicity, $H(t)$ in the training phase is assumed to be the same as that at time 0. However, extension to the case considering channel variation in the training phase is feasible;
A2) The virtual channel matrix $H(t)$ in the $n$-th data symbol period in Fig. 3 is assumed to be the same as $H(nT_s)$, which is denoted as $H(n)$ hereafter;
A3) $H(q, p, t)$ has a zero-mean Gaussian distribution. It can be seen from (7) that $H(q, p, t)$ is a weighted sum of independent path amplitudes. According to the central limit theorem, the distribution is approximately Gaussian if the number of paths is large;
A4) $\{H(q, p, t)\}$ are independent for different $(q, p)$: $E[H(q, p, t)H^*(q', p', t')] = r_{q,p}(t - t')\, \delta_{qq'}\, \delta_{pp'}$ with $r_{q,p}(t - t')$ … $I_{N_T}$ and the $N_R \times N_T$ noise matrix $W$ has i.i.d. complex Gaussian entries with zero mean and variance $\sigma_n^2$.
The predicted virtual channel matrix in the $n$-th data symbol period is given by
$$\hat H(n) = \mathrm{vec}^{-1}(\hat h(n)), \quad \hat h(n) = L(n)z, \quad z = \mathrm{vec}(Z) = \left(D^T \otimes I_{N_R}\right) h(0) + w = \sqrt{\rho}\, h(0) + w \qquad (10)$$
where $L(n)$ is the $N_R N_T \times N_R N_T$ linear channel predictor, $z = \mathrm{vec}(Z)$, $h(0) = \mathrm{vec}(H(0))$, and $w = \mathrm{vec}(W)$ are obtained by stacking $^{3}$ the columns of $Z$, $H(0)$, and $W$, respectively, $\mathrm{vec}^{-1}(\cdot)$ represents the inverse operation of $\mathrm{vec}(\cdot)$, and $z$ is specified via the identity $\mathrm{vec}(ABC) = (C^T \otimes A)\, \mathrm{vec}(B)$ with $\otimes$ denoting the Kronecker product. The MMSE channel predictor can be derived from the orthogonality principle [18] as
$$L_o(n) = \arg\min_{L(n)} E\|L(n)z - h(n)\|_F^2 = E[h(n)z^H] \left(E[zz^H]\right)^{-1} = \sqrt{\rho}\, \Phi(n) \left(\rho\, \Phi(0) + \sigma_n^2 I_{N_R N_T}\right)^{-1} \qquad (11)$$
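where $h(n) = \mathrm{vec}(H(n))$ with $H(n)$ given in A2, and the expectation is over both $h$ and $w$. According to A4, $\Phi(n) = E[h(n)h^H(0)]$ is diagonal with entries $r_{q,p}(nT_s)$, so $L_o(n)$ is also diagonal with entries $\sqrt{\rho}\, r_{q,p}(nT_s) / (\rho\, r_{q,p}(0) + \sigma_n^2)$. The above relations together with $\hat h(n) = L_o(n)z$ give the expressions of $\hat H(q, p, n)$ and the associated prediction error as
$$\hat H(q, p, n) = \frac{\sqrt{\rho}\, r_{q,p}(nT_s)}{\rho\, r_{q,p}(0) + \sigma_n^2}\, Z(q, p) = \frac{\sqrt{\rho}\, r_{q,p}(nT_s)}{\rho\, r_{q,p}(0) + \sigma_n^2} \left(\sqrt{\rho}\, H(q, p, 0) + W(q, p)\right)$$
$$E(q, p, n) = H(q, p, n) - \hat H(q, p, n) \qquad (12)$$
and, according to A3, both have zero-mean complex Gaussian distributions
$$\hat H(q, p, n) \sim CN\left(0, \hat\sigma^2(q, p, n)\right), \quad \hat\sigma^2(q, p, n) = \rho\, |r_{q,p}(nT_s)|^2 / \left(\rho\, r_{q,p}(0) + \sigma_n^2\right)$$
$$E(q, p, n) \sim CN\left(0, \sigma_e^2(q, p, n)\right), \quad \sigma_e^2(q, p, n) = r_{q,p}(0) - \hat\sigma^2(q, p, n). \qquad (13)$$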
The matrix version of (12) is given by $H(n) = \hat H(n) + E(n)$ with $\hat H(n)$ known at the receiver. Based on A4 and (12), we have the following properties
$$E[\hat H(q, p, n)\hat H^*(q', p', n)] = \hat\sigma^2(q, p, n)\, \delta_{qq'}\, \delta_{pp'}$$
$$E[E(q, p, n)E^*(q', p', n)] = \sigma_e^2(q, p, n)\, \delta_{qq'}\, \delta_{pp'}, \quad E[E(q, p, n)\hat H^*(q', p', n)] = 0 \qquad (14)$$
which indicate that $\hat H(n)$ and $E(n)$ have independent entries, and they are also mutually independent. Channel estimation for correlated MIMO channels has been studied in [19], which shows that the optimum training signal corresponds to transmitting beams in successive symbol intervals along different transmit angles, as assumed in Fig. 3. The proposed scheme also considers the channel variation in each data block, and the channel in the $n$-th symbol period is predicted by the predictor in (11).
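The per-coefficient predictor and its variances can be evaluated directly once $r_{q,p}(nT_s)$ is known. The sketch below assumes the training power $\rho$ enters as $\sqrt{\rho}\,H(q,p,0) + W(q,p)$, in line with $z = \sqrt{\rho}\,h(0) + w$; `predictor_gain` and `prediction_variances` are illustrative names, and the correlation values are made up for the example.

```python
import numpy as np

def predictor_gain(r_n, r_0, rho, noise_var):
    """Scalar MMSE gain applied to Z(q, p): sqrt(rho) r(nT_s) / (rho r(0) + sigma_n^2)."""
    return np.sqrt(rho) * r_n / (rho * r_0 + noise_var)

def prediction_variances(r_n, r_0, rho, noise_var):
    """Variance of the predicted coefficient and of the prediction error."""
    var_hat = rho * np.abs(r_n) ** 2 / (rho * r_0 + noise_var)
    var_err = r_0 - var_hat
    return var_hat, var_err

rho, noise_var, r_0 = 10.0, 1.0, 1.0
# Hypothetical normalized temporal correlations at increasing lags.
corr = np.array([1.0, 0.9, 0.5, 0.0])
var_hat, var_err = prediction_variances(corr * r_0, r_0, rho, noise_var)
gain = predictor_gain(corr * r_0, r_0, rho, noise_var)
```

As the temporal correlation decays, `var_hat` shrinks toward zero and `var_err` approaches `r_0`, which is the mechanism that limits the useful data block length.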
B. Optimization of Decoder and Power-Shaping Matrix
As described in Section III, the transmitted signal vector in the virtual angle domain, $x(n) = \Lambda^{1/2}(n)s(n)$, has dimension $k(n)$, which is abbreviated as $k$ in the following. The corresponding $N_R \times 1$ received signal vector in the virtual domain can be written as
$$y(n) = H_k(n)x(n) + n(n) = \hat H_k(n)x(n) + v(n) \qquad (15)$$
where the matrix with subscript $k$ contains the first $k$ columns of the original matrix, $n(n)$ has i.i.d. complex Gaussian noise entries with zero mean and variance $\sigma_n^2$, the virtual channel matrix is decomposed as $H_k(n) = \hat H_k(n) + E_k(n)$, and $v(n) = E_k(n)x(n) + n(n)$ represents the effective noise vector. The MMSE decoder can be derived as
$$G_o(n) = \arg\min_{G(n)} E\|G(n)y(n) - s(n)\|_F^2 = E[s(n)y^H(n)] \left(E[y(n)y^H(n)]\right)^{-1} = \Lambda^{1/2}(n)\hat H_k^H(n) \left[\hat H_k(n)\Lambda(n)\hat H_k^H(n) + \Sigma(n)\right]^{-1} \qquad (16)$$
where the expectation is over $E_k(n)$, $s(n)$, $n(n)$, and the effective noise covariance matrix is given by
$$\Sigma(n) = E[v(n)v^H(n)] = E[E_k(n)\Lambda(n)E_k^H(n)] + \sigma_n^2 I_{N_R} = \mathrm{diag}\left(\mathrm{tr}(\Psi_1(n)\Lambda(n)), \ldots, \mathrm{tr}(\Psi_{N_R}(n)\Lambda(n))\right) + \sigma_n^2 I_{N_R} \qquad (17)$$
where $\Psi_q(n) = E[E_k^H(q, :, n)E_k(q, :, n)] = \mathrm{diag}\left(\sigma_e^2(q, 1, n), \ldots, \sigma_e^2(q, k, n)\right)$ with $E_k(q, :, n)$ as the $q$-th row of $E_k(n)$ and $\sigma_e^2(q, p, n)$ given in (13). The MSE for the optimization of $\Lambda(n)$ is defined as
$$MSE(n) = E\, \mathrm{tr}\left\{ [G_o(n)y(n) - s(n)][G_o(n)y(n) - s(n)]^H \right\} = E\, \mathrm{tr}\left[ I_k + \Lambda^{1/2}(n)\hat H_k^H(n)\Sigma^{-1}(n)\hat H_k(n)\Lambda^{1/2}(n) \right]^{-1}$$
$$\le E\, \mathrm{tr}\left[ I_k + \Lambda^{1/2}(n)\hat H_k^H(n)\bar\Sigma^{-1}(n)\hat H_k(n)\Lambda^{1/2}(n) \right]^{-1} \triangleq \overline{MSE}(n) \qquad (18)$$
$$\bar\Sigma(n) = \rho\, \mathrm{diag}\left(\bar\sigma_e^2(1, n), \ldots, \bar\sigma_e^2(N_R, n)\right) + \sigma_n^2 I_{N_R}, \quad \bar\sigma_e^2(q, n) = \max_{p=1,\ldots,k} \sigma_e^2(q, p, n)$$
where the reader can refer to [25, 26] for the second step, the inequality stems from $\Sigma(n) \preceq \bar\Sigma(n)$ as well as the properties of matrix ordering [22], and the expectation is over $s(n)$, $n(n)$, $E_k(n)$, and $\hat H_k(n)$, since the transmitter only knows channel statistics. The optimum $\Lambda(n)$ is aimed at minimizing $MSE(n)$. However, the optimization may not be convex due to the term $\mathrm{tr}(\Psi_q(n)\Lambda(n))$ in (17) and the inverse of $\Sigma(n)$ in (18). Therefore, $\Lambda(n)$ is optimized by minimizing the upper bound of $MSE(n)$:
$$\Lambda_o(n, k) = \arg\min_{\Lambda(n)} \overline{MSE}(n), \quad \text{s.t. } \mathrm{tr}(\Lambda(n)) = \rho \qquad (19)$$
where the variable $k$ emphasizes that the solution is conditioned on $k$. It can be shown with the techniques in [20] that (19) is a convex program and, therefore, the global minimizer can be reliably obtained via optimization routines.
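The decoder in (16), the effective noise covariance in (17), and the trace form of the MSE in (18) can be verified numerically for a single channel realization. The following sketch assumes NumPy and randomly drawn toy values for the predicted channel and the error variances; the function names are illustrative, and the expectation over $\hat H_k(n)$ and the optimization in (19) are not performed here.

```python
import numpy as np

def effective_noise_cov(sigma_e2, lam, noise_var):
    """Sigma = diag_q( sum_p sigma_e2[q, p] * lam[p] ) + noise_var * I, as in (17)."""
    return np.diag(sigma_e2 @ lam + noise_var)

def noise_cov_bound(sigma_e2, rho, noise_var):
    """Upper bound: rho * diag_q( max_p sigma_e2[q, p] ) + noise_var * I."""
    return np.diag(rho * sigma_e2.max(axis=1) + noise_var)

def mmse_decoder(h_hat, lam, sigma):
    """G = Lambda^{1/2} H_hat^H (H_hat Lambda H_hat^H + Sigma)^{-1}, as in (16)."""
    b = h_hat * np.sqrt(lam)               # H_hat Lambda^{1/2} via column scaling
    return b.conj().T @ np.linalg.inv(b @ b.conj().T + sigma)

def mse_trace(h_hat, lam, sigma):
    """tr[(I_k + Lambda^{1/2} H_hat^H Sigma^{-1} H_hat Lambda^{1/2})^{-1}] for one realization."""
    b = h_hat * np.sqrt(lam)
    k = lam.size
    return np.trace(np.linalg.inv(np.eye(k) + b.conj().T @ np.linalg.inv(sigma) @ b)).real

# Toy example: N_R = 4 receive beams, k = 2 streams, tr(Lambda) = rho.
rng = np.random.default_rng(2)
n_r, k, rho, noise_var = 4, 2, 10.0, 1.0
h_hat = (rng.normal(size=(n_r, k)) + 1j * rng.normal(size=(n_r, k))) / np.sqrt(2)
sigma_e2 = rng.uniform(0.05, 0.3, size=(n_r, k))
lam = np.array([6.0, 4.0])

sigma = effective_noise_cov(sigma_e2, lam, noise_var)
sigma_bar = noise_cov_bound(sigma_e2, rho, noise_var)
g = mmse_decoder(h_hat, lam, sigma)
```

For any fixed realization, replacing `sigma` by `sigma_bar` can only increase `mse_trace`, which is the per-realization version of the inequality in (18).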
C. Optimization of Transmission Rates
With the power-shaping matrix, we next optimize the transmission rates of the $k$ streams. The SINR corresponding to the $p$-th received stream of $\hat s(n) = G_o(n)y(n)$ can be straightforwardly shown to be
$$\gamma_p(n) = \lambda_p(n)\, \hat H_k^H(:, p, n) \left[ \sum_{j \ne p} \lambda_j(n)\, \hat H_k(:, j, n)\, \hat H_k^H(:, j, n) + \Sigma(n) \right]^{-1} \hat H_k(:, p, n) \qquad (20)$$
where $\hat H_k(:, p, n)$ represents the $p$-th column of $\hat H_k(n)$, and $\lambda_p(n)$ denotes the $p$-th diagonal element of $\Lambda_o(n, k)$. With A6, a tight BER approximation for a QAM constellation with $2^i$ points is given by [27]: $BER(i, \gamma_p(n)) = 0.2 \exp\left(-1.5\gamma_p(n)/(2^i - 1)\right)$. Since the transmitter only knows channel statistics, the rate of the $p$-th stream in the $n$-th data symbol period is defined as the maximum number of bits keeping the average BER under the target BER
$$R_p(n, k) = \max_{i \in \{0, 1, 2, \ldots\}} i, \quad \text{s.t. } \overline{BER}_p(n) = E[BER(i, \gamma_p(n))] \le BER_{tar} \qquad (21)$$
where the expectation is over $\hat H_k(n)$, and the variable $k$ emphasizes that the solution is conditioned on $k$. Deriving a closed-form expression of $R_p(n, k)$ is difficult, since the distribution of $\gamma_p(n)$ for arbitrary channel statistics is unknown. The asymptotic distribution for i.i.d. $\hat H_k(n)$ with a large number of antennas can be found in [28], and the distribution for i.i.d. $\hat H_k(n)$ with an arbitrary number of antennas is given in [29] but is complicated for numerical evaluation. Therefore, $R_p(n, k)$ is found by numerical search for general MIMO channels. However, closed-form solutions can be obtained for SISO, MISO, and SIMO channels with arbitrary correlations. They are omitted due to space limitations.
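The numerical search for $R_p(n, k)$ in (21) can be sketched as a Monte Carlo average of the BER approximation over SINR samples. This is an illustrative sketch only: the SINR samples are drawn from an exponential distribution as a stand-in for samples of $\gamma_p(n)$, the cap `max_bits` is an assumption, and $i = 0$ is treated as "no transmission". The closed form for a known SINR follows from solving $0.2\exp(-1.5\gamma/(2^i - 1)) \le BER_{tar}$ for integer $i$.

```python
import numpy as np

def ber_approx(bits, sinr):
    """BER(i, gamma) = 0.2 exp(-1.5 gamma / (2^i - 1)) for a 2^i-point QAM, i >= 1."""
    return 0.2 * np.exp(-1.5 * sinr / (2.0 ** bits - 1.0))

def select_rate(sinr_samples, ber_target, max_bits=16):
    """Largest i whose sample-averaged BER stays below the target (0 if none does)."""
    rate = 0
    for bits in range(1, max_bits + 1):
        if np.mean(ber_approx(bits, sinr_samples)) <= ber_target:
            rate = bits
        else:
            break   # the average BER grows with the constellation size
    return rate

def closed_form_rate(sinr, ber_target):
    """Rate for a known SINR: floor(log2(1 - 1.5 gamma / ln(BER_tar / 0.2)))."""
    return int(np.floor(np.log2(1.0 - 1.5 * sinr / np.log(ber_target / 0.2))))

rng = np.random.default_rng(3)
ber_target = 1e-3
sinr_samples = 100.0 * rng.exponential(size=10000)   # hypothetical fading SINR samples
rate_fading = select_rate(sinr_samples, ber_target)
rate_fixed = select_rate(np.array([sinr_samples.mean()]), ber_target)
```

Because the BER approximation is convex in the SINR, averaging over fading can only lower the supported rate relative to a fixed SINR with the same mean, so `rate_fading <= rate_fixed`.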
D. Optimization of Data Stream Number and Data Block Length
The number of data streams $k$ can be optimized to maximize the total rate in the $n$-th data symbol period
$$k_o(n) = \arg\max_{k \in S_1} \sum_{p=1}^{k} R_p(n, k), \quad S_1 = \{1, \ldots, \min\{N_T, N_R\}\} \qquad (22)$$
where $\min\{N_T, N_R\}$ represents the upper bound of $k$, as described in A7. The optimized power-shaping matrix and transmission rates associated with $k_o(n)$ are selected as the final designs
$$\Lambda_o(n) = \Lambda_o(n, k_o(n)), \quad \mathbf{R}_o(n) = \left[R_1(n, k_o(n)), \ldots, R_{k_o(n)}(n, k_o(n))\right]^T. \qquad (23)$$
To assess the performance, we define the average rate per packet as
$$AR(N_D) = \frac{\sum_{n=1}^{N_D} R_o(n)}{N_D + N_{tr}}, \quad R_o(n) = \sum_{p=1}^{k_o(n)} R_p(n, k_o(n)), \quad N_{tr} = \max_{n \in \{1, \ldots, N_D\}} k_o(n) \qquad (24)$$
where $R_o(n)$ denotes the instantaneous total rate in the $n$-th data symbol period, and the number of training symbol periods $N_{tr}$ is equal to the number of transmit angles involved in the data transmission. Note that $N_{tr}$ could be less than the $N_T$ initially assumed in Section IV-A, since the data transmission may not use all $N_T$ angles. The optimum data block length is the one maximizing the average rate
$$N_o = \arg\max_{N_D \in S_2} AR(N_D), \quad S_2 = \{1, \ldots, N_{max}\} \qquad (25)$$
where $N_{max}$ represents the maximum length for searching $N_o$ and can be predetermined by simulating typical channels in practice.
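The search in (24)-(25) is a one-dimensional scan over candidate block lengths. The sketch below uses made-up per-symbol rates and stream counts purely for illustration; `average_rate` and `optimum_block_length` are hypothetical helper names.

```python
def average_rate(inst_rates, stream_counts, n_d):
    """AR(N_D) = sum_{n<=N_D} R_o(n) / (N_D + N_tr), with N_tr = max_{n<=N_D} k_o(n)."""
    n_tr = max(stream_counts[:n_d])
    return sum(inst_rates[:n_d]) / (n_d + n_tr)

def optimum_block_length(inst_rates, stream_counts, n_max):
    """N_o = argmax over N_D in {1, ..., N_max} of AR(N_D)."""
    return max(range(1, n_max + 1),
               key=lambda n_d: average_rate(inst_rates, stream_counts, n_d))

# Hypothetical instantaneous rates R_o(n) (bits/symbol) that decay with n,
# and the corresponding optimum stream counts k_o(n).
inst_rates = [12, 12, 10, 8, 7, 4, 2, 0, 0, 0]
stream_counts = [3, 3, 3, 2, 2, 1, 1, 1, 1, 1]

n_o = optimum_block_length(inst_rates, stream_counts, n_max=10)
best_ar = average_rate(inst_rates, stream_counts, n_o)
```

In this toy case the average rate rises while the training overhead is amortized, peaks at $N_D = 5$, and then falls as the later symbols carry fewer bits.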
V. Adaptation with Instantaneous Feedback
The statistics-based design in Section III only requires channel statistics at the transmitter. However, most existing work [3]-[7] assumes the availability of Channel State Information at the Transmitter (CSIT). It is instructive to compare the statistics-based design to the design with CSIT. In the latter case, the transmitter is assumed to know the received training signal matrix $Z$, which is perfectly fed back from the receiver without delay or error. Therefore, the transmitter knows the predicted virtual channel matrix $\hat H(n)$ via (10).
The transceiver with CSIT has the same structure as Fig. 4 except that the power-shaping matrix and the zero-padding block are replaced by a linear precoder. The optimization is also the same as in the statistics-based design except that the precoder and rates are optimized for each realization of $\hat H(n)$. For a fixed number of data streams $k$, the $N_R \times 1$ received signal vector can be written as
$$y(n) = H(n)F(n)s(n) + n(n) = \hat H(n)F(n)s(n) + v(n) \qquad (26)$$
where $F(n)$ is the $N_T \times k$ precoder, and $v(n) = E(n)F(n)s(n) + n(n)$. Eqn. (26) reduces to (15) if $F(n) = [\Lambda^{1/2}(n) \;\vdots\; 0_{k \times (N_T - k)}]^T$. The precoder is aimed at minimizing the following MSE
$$MSE\left(\hat H(n)\right) = \mathrm{tr}\left[ I_k + F^H(n)\hat H^H(n)\Sigma^{-1}(n)\hat H(n)F(n) \right]^{-1} \le \mathrm{tr}\left[ I_k + F^H(n)\hat H^H(n)\bar\Sigma^{-1}(n)\hat H(n)F(n) \right]^{-1} \qquad (27)$$
which is derived in the same way as (18) except that the expectation is not taken over $\hat H(n)$, since the transmitter now knows $\hat H(n)$. Let $V\Omega V^H$ be the eigen-decomposition of $\hat H^H(n)\bar\Sigma^{-1}(n)\hat H(n)$. The upper bound in (27) can be minimized by choosing $F(n) = V[\Lambda^{1/2} \;\vdots\; 0_{k \times (N_T - k)}]^T$ with the optimum power-shaping matrix $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$ specified in [25]. Furthermore, the SINR of the $p$-th stream at the MMSE decoder output can be straightforwardly shown to be
$$\gamma_p(n) = F^H(:, p, n)\hat H^H(n) \left[ \sum_{j \ne p} \hat H(n)F(:, j, n)F^H(:, j, n)\hat H^H(n) + \Sigma(n) \right]^{-1} \hat H(n)F(:, p, n) \qquad (28)$$
where $F(:, p, n)$ denotes the $p$-th column of $F(n)$. The maximum rate of the $p$-th stream under the BER requirement $^{4}$ is given by $R_p(n, k) = \lfloor \log_2\left(1 - 1.5\gamma_p(n)/\ln(BER_{tar}/0.2)\right) \rfloor$, and the optimum number of data streams is searched according to $k_o(n) = \arg\max_{k \in S_1} \sum_{p=1}^{k} R_p(n, k)$. In contrast to the statistics-based design, $F(n)$, $\{R_p(n, k)\}$, and $k_o(n)$ are optimized for a given $\hat H(n)$ instead of its statistics. The corresponding average rate can be expressed as
$$AR(N_D) = \frac{\sum_{n=1}^{N_D} R_o(n)}{N_D + N_{tr}}, \quad R_o(n) = E\left[ \sum_{p=1}^{k_o(n)} R_p(n, k_o(n)) \right] \qquad (29)$$
where the expectation is over $\hat H(n)$, and the training block length $N_{tr}$ is equal to the number of non-vanishing transmit virtual angles $^{5}$. This is because, unlike the statistics-based design, the precoder and rates are optimized for each estimated $\hat H(n)$. To estimate $H(n)$, pilot symbols have to be sent from all non-vanishing transmit virtual angles.

$^{4}$ Mathematically, $R_p(n, k) = \max_{i \in \mathbb{Z}} i$, s.t. $BER(i, \gamma_p(n)) \le BER_{tar}$ with $BER(i, \gamma_p(n))$ defined above (21).
$^{5}$ This equals the number of columns of $H(n)$ whose Frobenius norms averaged over channel statistics are not zero.
VI. Results and Discussions
A. Simulation Parameters and Procedure
We consider $L = 100$ paths whose arrival and departure angles $\{\theta_{R,l}, \theta_{T,l}\}$ are uniformly distributed at random within a 2D angular region in the virtual domain: $[-\theta_{max}, \theta_{max}] \times [-\theta_{max}, \theta_{max}]$, where $\theta_{max}$ determines the angular spread. Each path has the same strength $\alpha_l^2 = 1/L$, which implies the channel normalization $E\|H_c(t)\|_F^2 = N_R N_T$. Both the transmitter and receiver have the same speed $v_T = v_R = 10$ km/h. The carrier frequency is 1.8 GHz, and the resultant maximum Doppler shift is $f_{max} = 33.3$ Hz. The symbol period $T_s$ is specified via the product $f_{max}T_s$, which determines the fading rate. The BER target is $10^{-3}$, the noise power per receive antenna $\sigma_n^2$ is 1, and the other parameters $N_R$, $N_T$, $\theta_{max}$, $f_{max}T_s$, and $\rho$ are specified in the results.
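The quoted maximum Doppler shift follows from $f_{max} = (v_R + v_T)/\lambda$. As a quick check, assuming a propagation speed of $c \approx 3 \times 10^{8}$ m/s:

```latex
\begin{align*}
v_T + v_R &= 20~\text{km/h} = \frac{20}{3.6}~\text{m/s} \approx 5.56~\text{m/s}, \\
\lambda &= \frac{c}{f_c} = \frac{3 \times 10^{8}~\text{m/s}}{1.8 \times 10^{9}~\text{Hz}} \approx 0.167~\text{m}, \\
f_{\max} &= \frac{v_T + v_R}{\lambda} \approx \frac{5.56}{0.167}~\text{Hz} \approx 33.3~\text{Hz}.
\end{align*}
```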
Based on the above physical parameters, $r_{q,p}(t - t')$ … $\hat\sigma^2(q, p, n)$ $^{6}$ due to the vanishing temporal correlation $r_{q,p}(nT_s)$. Accordingly, the SINR of each data stream in (20) and hence $R_o(n)$ diminish as time progresses. The corresponding average rate $AR(N_D)$ is plotted in Fig. 6. As $N_D$ increases, $AR(N_D)$ in each case goes up before reaching the peak and then gradually decreases. This is because a smaller $N_D$ reduces the portion of data transmission time in each packet and hence degrades transmission efficiency, while a larger $N_D$ extends the data transmission time at the cost of lower $R_o(n)$ in the extended period, which eventually brings down $AR(N_D)$.

$^{6}$ This is the variance of the estimable part and represents the estimation quality.
Next, we study the impact of the number of data streams on the MIMO performance, which is best revealed by the comparison of $\Lambda_o(n, N_T)$ and $\Lambda_o(n)$. The former distributes power over all $N_T$ transmit angles, while the latter allocates power only to the $k_o(n)$ strongest angles. In each case, the number of excited data streams is equal to the rank of the power-shaping matrix. Compared with $\Lambda_o(n)$, $\Lambda_o(n, N_T)$ induces a 55% to 100% reduction of $R_o(n)$ in Fig. 5 and a 64% to 71% reduction of $AR(N_D)$ in Fig. 6. The ranks of both power-shaping matrices are plotted in Fig. 7, which shows that $\Lambda_o(n, N_T)$ has a fixed rank of 11, while the rank of $\Lambda_o(n)$ ranges between 4 and 8. Therefore, activating all 11 data streams with a lower rate per stream is inferior to concentrating power on fewer streams with more power and rate in each.
The performance of non-MIMO configurations is investigated next. The instantaneous rate $R_o(n)$ for MISO and SISO is plotted in Fig. 5, where MISO achieves an improvement over SISO due to the array gain. However, the improvement is not significant, since multiplexing does not exploit transmit diversity, which can be achieved via space-time codes [30, 31]. Compared with MISO and SISO, SIMO achieves significant improvement in both $R_o(n)$ and $AR(N_D)$. This is because SIMO captures more channel power through the $N_R$ receive antennas, and the received SNR is stabilized by the antenna diversity. In addition, SIMO has a lower training cost than MIMO due to the use of a single transmit antenna. Both antenna diversity and low training cost make the maximum $AR(N_D)$ of SIMO roughly match that of MIMO with $\Lambda_o(n, N_T)$ in Fig. 6. In fact, MIMO with $\Lambda_o(n, N_T)$ and SIMO correspond to the full-multiplexing and full-diversity schemes, respectively, while MIMO with $\Lambda_o(n)$ makes a judicious diversity-multiplexing tradeoff by choosing the optimum number of data streams. As demonstrated in Figs. 5 and 6, the tradeoff brings significant improvement over SIMO and MIMO with $\Lambda_o(n, N_T)$.
Finally, it is instructive to compare the statistics-based design to that with CSIT, whose performance is represented by "MIMO, CSIT". It can be observed in Fig. 5 that MIMO with CSIT on average achieves a 35% improvement in $R_o(n)$ over MIMO with $\Lambda_o(n)$, and the improvement in $AR(N_D)$ is on average 24%, as shown in Fig. 6. These results indicate that the loss of the statistics-based design may be acceptable if there is no instantaneous channel state feedback.
C. Optimum Data Block Length
As shown in Fig. 6, the average rate $AR(N_D)$ is maximized by an optimum data block length $N_D$, which is defined as $N_o$ in (25). It can be observed that $N_o$ for MIMO with $\Lambda_o(n)$, SIMO, MISO, and SISO is 22, 6, 6, and 2, respectively. This decreasing order can be intuitively explained via the virtual path partitioning in Fig. 2. In the case of MIMO, the paths are partitioned by both the transmit and receive beams and, therefore, each virtual channel coefficient in beamspace captures fewer paths with smaller Doppler spread, compared to those in the SIMO, MISO, and SISO cases $^{7}$. The smaller Doppler spread results in slower decay of the temporal correlation in the virtual domain, which yields better channel prediction and hence a longer $N_o$, as implied by (13). In short, MIMO has finer resolution in beamspace, which effectively slows down the temporal channel variation. For the same reason, the $N_o$ for SIMO or MISO is longer than that for SISO due to the finer path partitioning. The idea of reducing the temporal channel variation by beamforming has been reported in [32, 33]. In this work, the benefit of the slowed variation is realized by extending the data block length, which improves the rate by reducing the training update frequency.
$N_o$ is also affected by the number of antennas. Fig. 8 shows the average rate when the maximum number of antennas at one side is 7 for MIMO, SIMO, and MISO. The $N_o$ for MIMO, SIMO, MISO, and SISO is found to be 15, 6, 5, and 2, respectively. Compared with Fig. 6, the smaller antenna number decreases the average rate in each case and significantly reduces $N_o$ for MIMO. This is because fewer antennas enlarge the size of each virtual angular bin $S_{q,p}$ as well as the associated path Doppler spread, which reduces the temporal correlation and hence $N_o$.
The fading rate affects $N_o$ as well. Fig. 9 shows the average rate for $f_{max}T_s = 10^{-1}$. Compared with Fig. 6, the higher fading rate significantly reduces $N_o$, which $^{8}$ is 8, 3, and 1 for MIMO, SIMO, and MISO, since the training signal has to be sent more frequently to track the channel variation. The higher fading rate also impairs the average rate in each case due to the larger channel estimation error.

$^{7}$ For instance, the paths in $S_{2,2}$ are fewer than those in $S_{R,2}$, $S_{T,2}$, and those in the whole $\theta_R$-$\theta_T$ domain in Fig. 2.
D. Accuracy of Virtual Channel Model
As described in A3 and A4 in Section IV, the virtual channel coefficients are modeled as independent Gaussian entries. To assess the modeling accuracy, the rate performance is rigorously simulated based on the physical model (2), which does not impose any assumptions on the statistics of the virtual coefficients. The physical-model-based statistics are used to optimize the transceiver components in the same way as described in Section IV. As shown in Fig. 10, the good agreement between the performances of the two verifies the accuracy of the virtual channel model. In contrast to the physical model, the virtual model provides insights into the system design: the capacity-optimum signaling corresponds to transmitting independent streams from different virtual angles, and the optimum data block length is determined by the temporal correlation in beamspace.
VII. Conclusions
This work proposes a statistically adaptive spatial multiplexing scheme for correlated time-varying MIMO channels based on the virtual channel representation. With knowledge of the channel statistics, the transmitter adjusts the power and rate for each data stream in each symbol period, and the data block length is further optimized to maximize the average rate. Major results on the instantaneous and average rates are summarized below.
• For each antenna configuration, the instantaneous rate $R_o(n)$ decreases for larger time index $n$, while the average rate $AR(N_D)$ is usually a hill-shaped function, implying an optimum $N_D$;
• The rate-maximizing power-shaping matrix $\Lambda_o(n)$ makes a judicious diversity-multiplexing tradeoff by exciting the optimum number of data streams, which significantly improves the rate;
• Compared with the statistics-based design, the design with CSIT on average improves $R_o(n)$ and $AR(N_D)$ by 35% and 24%, respectively. Therefore, the former might be a good complexity-performance tradeoff, since it avoids the instantaneous feedback of CSIT, which both consumes resources and suffers from imperfections in practice.
Interesting results on the behavior of the optimum data block length $N_o$ are summarized below.
• $N_o$ increases according to the order SISO < {SIMO, MISO} < MIMO and increases with more antennas in each case. This is because each virtual channel coefficient captures fewer paths with smaller Doppler spread due to the finer path partitioning in beamspace. The smaller spread effectively slows down the channel variation in beamspace and hence yields a longer $N_o$;
• A higher fading rate factor $f_{max}T_s$ yields a smaller $N_o$ and reduces both $R_o(n)$ and $AR(N_D)$.

$^{8}$ The average rate is always zero in the case of SISO.
References
[1] I.E. Telatar, Capacity of multi-antenna gaussian channels, Tech. Rep., AT&T Bell Labs., 1995.
[2] G.J. Foschini, M.J. Gans, On the limits of wireless communications in a fading environment when using
multiple antennas, Wirless Personal Commun., vol. 6, no. 3, pp. 311-335, Mar. 1998.
[3] G. Lebrun, J. Gao, and M. Faulkner, MIMO transmission over a time-varying channel using SVD, IEEE
Trans. on wireless Communications, vol. 4, no. 2, pp. 757-764, March 2005.
[4] N. Khaled, G. Leus, C. Desset, and H. De Man, A robust joint linear precoder and decoder MMSE design
for slowly time-varying MIMO channels, in Proc. IEEE ICASSP, pp. 485-488, May 2004.
[5] Y. Ko and C. Tepedelenlioglu, Space-time block coded rate-adaptive modulation with uncertain SNR
feedback, in Proc. IEEE Conf. on Signals, Systems and Computers, pp. 1032-1036, Nov. 2003.
[6] S. Zhou and G.B. Giannakis, Adaptive modulation for multiantenna transmissions with channel mean
feedback, IEEE Trans. on Wireless Commun., vol. 3, no. 5, pp. 1626-1636, Sep. 2004.
[7] S. Zhou and G.B. Giannakis, How accurate channel prediction needs to be for transmit-beamforming with
adaptive modulation over Rayleigh MIMO channels? IEEE Trans. on wireless Communications, vol. 3,
no. 4, pp. 1285-1294, July 2004.
19
[8] M. Baissas and A.M. Sayeed, Pilot-based estimation of time-varying multipath channels for coherent
CDMA receivers, IEEE Trans. on Signal Processing, pp. 2037-2049, August 2002.
[9] M. Dong and L. Tong, Optimal insertion of pilot symbols for transmissions over time-varying at fading
channels, IEEE Trans. on Signal Processing, vol. 52, no. 5, pp. 1403-1417, May 2004.
[10] S. Ohno and G.B. Giannakis, Average-rate optimal PSAM transmissions over time-selective fading channels,
IEEE Trans. Wireless Communications, vol. 1, pp. 712-720, Oct. 2002.
[11] A. Boariu, Effect of delayed commands on error probability and throughput capacity in a multiple access
system, in Proc. IEEE VTC, Orlando, USA, pp. 262-265, Oct. 2003.
[12] A.E. Ekpenyong and Y.F. Huang, Markov channel-based feedback schemes for adaptive modulation systems,
in Proc. IEEE GLOBECOM, pp. 1091-1095, Nov. 2004.
[13] W.C. Jakes, Microwave Mobile Communications, Wiley, 1974.
[14] J. D. Parsons, The Mobile Radio Propagation Channel, John Wiley & Sons, 2000.
[15] A.M. Sayeed, Deconstructing multi-antenna fading channels, IEEE Trans. Signal Processing, vol. 50, no.
10, pp. 2563-2579, Oct. 2002.
[16] A.M. Sayeed, A virtual representation for time- and frequency-selective correlated MIMO channels, in
Proc. IEEE ICASSP, vol. 4, pp. 648-651, 2003.
[17] Y. Zhou and A.M. Sayeed, Experimental study of MIMO channel statistics and capacity via virtual channel
representation, submitted to IEEE Trans. Wireless Communications, June 2005.
[18] S. Haykin, Adaptive Filter Theory. Prentice Hall: New Jersey, 1996.
[19] J. Kotecha and A.M. Sayeed, Optimal estimation of correlated MIMO channels, IEEE Trans. on Signal
Processing, vol. 52, no. 2, pp. 546-557, Feb. 2004.
[20] V.V. Veeravalli, Y. Liang, and A.M. Sayeed, Correlated MIMO Rayleigh fading channels: capacity, optimal
signaling, and asymptotics, IEEE Trans. on Info. Theo., vol. 51, no. 6, pp. 2058-2072, June 2005.
[21] J. Kotecha and A.M. Sayeed, Canonical statistical models for correlated MIMO fading channels and
capacity analysis, submitted to IEEE Trans. on Wireless Communications.
[22] R.A. Horn, Matrix Analysis. Cambridge University Press, 1985.
[23] H.V. Poor and S.V. Verdu, Probability of error in MMSE multiuser detection, IEEE Trans. on Info. Theo.,
vol. 43, pp. 858-871, May 1997.
[24] A. Scaglione, P. Stoica, S. Barbarossa, G.B. Giannakis, and H. Sampath, Optimal Designs for Space-time
Linear Precoder and Decoder, IEEE Trans. on SP, vol. 50, pp. 1051-1064, May 2002.
[25] A. Scaglione, G.B. Giannakis, and S. Barbarossa, Redundant filterbank precoders and equalizers Part I:
unification and optimal designs, IEEE Trans. on SP, vol. 47, pp. 1988-2005, July 1999.
[26] D.P. Palomar, M.A. Lagunas, and J.M. Cioffi, Optimum linear joint transmit-receive processing for MIMO
channels with QoS constraints, IEEE Trans. on SP, vol. 52, pp. 1179-1197, May 2004.
[27] A. Goldsmith and S.G. Chua, Variable-rate variable-power MQAM for fading channels, IEEE Trans. on
Communications, vol. 45, pp. 1218-1230, Oct. 1997.
[28] D.N.C. Tse and O. Zeitouni, Linear multiuser receivers in random environments, IEEE Trans. on Info.
Theo., vol. 46, pp. 171-188, Jan. 2000.
[29] H. Gao, P.J. Smith, and M.V. Clark, Theoretical reliability of MMSE linear diversity combining in
Rayleigh-fading additive interference channels, IEEE Trans. on Commun., vol. 46, pp. 666-672, May 1998.
[30] V. Tarokh, H. Jafarkhani, and A.R. Calderbank, Space-time block codes from orthogonal designs, IEEE
Trans. on Info. Theo., vol. 45, pp. 1456-1467, July 1999.
[31] V. Tarokh, N. Seshadri, and A.R. Calderbank, Space-time codes for high data rate wireless communication:
Performance criterion and code construction, IEEE Trans. on Info. Theo., vol. 44, pp. 744-765, Mar. 1998.
[32] D. Chizhik, Slowing the time-fluctuating MIMO channel by beam forming, IEEE Trans. on Wireless
Communications, vol. 3, pp. 1554-1565, Sep. 2004.
[33] R.G. Vaughan, Angular partitioning to yield equal Doppler contributions, IEEE Trans. on Veh. Technol.,
vol. 48, pp. 1437-1442, Sep. 1999.
Fig. 1: Illustration of signaling in virtual angle domain.
Fig. 2: Illustration of virtual path partitioning ($N_T = N_R = 3$).
Fig. 3: Format of transmitted signal in the first packet.
Fig. 4: Block diagram of the proposed system.
Fig. 5: Instantaneous rate in each data symbol period for angular spread $\theta_{\max} = 0.5$, transmit SNR
= 30 dB, fading rate factor $f_{\max} T_s = 10^{-2}$. (Axes: $n$, time index of data symbol period, versus
$R_o(n)$, instantaneous rate in bits/symbol period. Curves: MIMO with CSIT (11x11); MIMO with the
power-shaping matrix as a function of $n$ and of $(n, N_T)$ (11x11); SIMO ($N_R = 11$, $N_T = 1$);
MISO ($N_R = 1$, $N_T = 11$); SISO ($N_R = 1$, $N_T = 1$).)
Fig. 6: Average rate in each packet for angular spread $\theta_{\max} = 0.5$, transmit SNR = 30 dB, fading rate
factor $f_{\max} T_s = 10^{-2}$. (Axes: $N_D$, data block length, versus $AR(N_D)$, average rate in
bits/symbol period. Curves: MIMO with CSIT (11x11); MIMO with the power-shaping matrix as a function
of $n$ and of $(n, N_T)$ (11x11); SIMO ($N_R = 11$, $N_T = 1$); MISO ($N_R = 1$, $N_T = 11$);
SISO ($N_R = 1$, $N_T = 1$).)
Fig. 7: Ranks of power-shaping matrices for angular spread $\theta_{\max} = 0.5$, transmit SNR = 30 dB, fading
rate factor $f_{\max} T_s = 10^{-2}$. (Axes: $n$, time index of data symbol period, versus rank of
power-shaping matrix. Curves: MIMO with the power-shaping matrix as a function of $n$ and of
$(n, N_T)$ (11x11).)
Fig. 8: Average rate in each packet for angular spread $\theta_{\max} = 0.5$, transmit SNR = 30 dB, fading rate
factor $f_{\max} T_s = 10^{-2}$. (Axes: $N_D$, data block length, versus $AR(N_D)$, average rate in
bits/symbol period. Curves: MIMO with the power-shaping matrix as a function of $n$ (7x7);
SIMO ($N_R = 7$, $N_T = 1$); MISO ($N_R = 1$, $N_T = 7$); SISO ($N_R = 1$, $N_T = 1$).)
Fig. 9: Average rate in each packet for angular spread $\theta_{\max} = 0.5$, transmit SNR = 30 dB, fading rate
factor $f_{\max} T_s = 10^{-1}$. (Axes: $N_D$, data block length, versus $AR(N_D)$, average rate in
bits/symbol period. Curves: MIMO with the power-shaping matrix as a function of $n$ (11x11);
SIMO ($N_R = 11$, $N_T = 1$); MISO ($N_R = 1$, $N_T = 11$); SISO ($N_R = 1$, $N_T = 1$).)
Fig. 10: Average rate comparison for virtual and physical models, angular spread $\theta_{\max} = 0.5$, transmit
SNR = 30 dB, fading rate factor $f_{\max} T_s = 10^{-2}$.