Implementation Aspects of Channel Estimation for 3GPP LTE Terminals


Michal Šimko, Di Wu, Christian Mehlführer, Johan Eilert and Dake Liu

Institute of Telecommunications, Vienna University of Technology, Vienna, Austria

Department of Electrical Engineering, Linköping University, Linköping, Sweden

ST-Ericsson AT AB, Lund, Sweden


Contact: msimko@nt.tuwien.ac.at
Abstract: In this paper, hardware implementation aspects of the channel
estimator in 3GPP LTE terminals are investigated. A channel estimation ASIC
that handles real-time channel estimation is presented. Compared to
traditional correlator-based channel estimators, the presented channel
estimator boosts the throughput at feasible silicon cost by adopting a
recently proposed estimation method named Approximate Linear Minimum Mean
Square Error (ALMMSE). Both the architecture and the VLSI implementation of
the estimator are elaborated. Implemented in a 65 nm CMOS process, the
channel estimator supports the full 20 MHz bandwidth of 3GPP LTE and
occupies only 49 kgates.
I. INTRODUCTION
The 3rd Generation Partnership Project Long-Term Evolution (3GPP LTE) is an
emerging mobile broadband communication system that offers high-speed mobile
data service. LTE adopts Orthogonal Frequency Division Multiple Access
(OFDMA) and Multiple-Input Multiple-Output (MIMO) to achieve a peak bit-rate
target of 326 Mbit/s in the downlink (assuming a 4×4 MIMO system with 20 MHz
bandwidth, 64-QAM, coding rate 1 and 19% pilot symbol overhead).
The performance gain of MIMO heavily depends on the accurate estimation of
Channel State Information (CSI), which is crucial for every communications
system. As addressed in [1], the complexity of MIMO channel estimation is
infeasible for most low-complexity receivers in practice. Hence, the pilots
(reference signals) introduced in LTE have been chosen to be orthogonal,
which allows low-complexity channel estimation for multi-antenna
transmissions. In this work, a channel estimation ASIC is implemented as an
accelerator for 3GPP LTE modem platforms to handle real-time channel
estimation with different antenna configurations and mobility.
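The quoted peak rate can be reproduced with simple arithmetic. The sketch below is only a sanity check, assuming 1200 occupied subcarriers at 20 MHz (the number used in Sec. VII) and 14 OFDM symbols per 1 ms subframe (normal cyclic prefix, Sec. II); all variable names are illustrative.

```python
# Back-of-the-envelope check of the 326 Mbit/s downlink peak rate.
layers = 4                  # 4x4 MIMO: four spatial streams
subcarriers = 1200          # occupied subcarriers at 20 MHz (assumption, see Sec. VII)
symbols_per_subframe = 14   # 2 slots x 7 OFDM symbols, normal cyclic prefix
subframes_per_second = 1000
bits_per_symbol = 6         # 64-QAM
coding_rate = 1.0
pilot_overhead = 0.19

raw_rate = (layers * subcarriers * symbols_per_subframe
            * subframes_per_second * bits_per_symbol * coding_rate)
peak_rate_mbps = raw_rate * (1.0 - pilot_overhead) / 1e6
print(f"{peak_rate_mbps:.1f} Mbit/s")
```

Under these assumptions the product lands at roughly 326.6 Mbit/s, which matches the target stated above.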
In [9], a practical implementation of a general MIMO OFDM transceiver is
presented, including a simple least squares channel estimator. Hardware
implementation aspects of a Singular Value Decomposition (SVD) based channel
estimator are described in [10]. In [11], the authors present an
implementation of an LTE terminal; however, the presented channel estimator
is very simple and its performance is not sufficient. In contrast to the
cited works, we utilize a fully standard-compliant LTE physical layer link,
including the standardized pilot structure. Therefore, our results are
representative of feasible LTE channel estimators in terms of:
1) throughput performance,
2) channel estimation performance,
3) implementation complexity.
The remainder of the paper is organized as follows. In Sec. II, the LTE
system model is presented in brief. Sec. III introduces several channel
estimation algorithms evaluated in this paper. Sec. IV describes the
processing flow and Sec. V elaborates the architecture of the channel
estimator. In Sec. VI, link-level simulation results with fixed-point
datatypes are presented to quantitatively justify the design trade-off
between performance and complexity. The ASIC implementation is presented in
Sec. VII. Finally, Sec. VIII concludes the paper.
II. SYSTEM MODEL
A. Overview
As defined in [4], the duration of an LTE frame is $T_{\mathrm{frame}} = 10$ ms.
Each frame consists of ten subframes and each subframe of two slots. LTE
specifies two different cyclic prefix lengths (normal and extended) to
handle different delay spreads. Accordingly, to obtain a constant slot
duration of 0.5 ms, each slot contains seven OFDM symbols in case of normal
cyclic prefix transmission and six OFDM symbols in case of extended cyclic
prefix transmission. A so-called resource block (RB) consists of twelve
adjacent subcarriers over seven (six) OFDM symbols.
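The numbers above can be collected in a few constants. The following is a minimal sketch using only the values stated in this section; the function and constant names are our own and not part of any LTE API.

```python
# LTE frame numerology as described in Sec. II-A (names are illustrative).
FRAME_MS = 10.0
SUBFRAMES_PER_FRAME = 10
SLOTS_PER_SUBFRAME = 2
SUBCARRIERS_PER_RB = 12
SYMBOLS_PER_SLOT = {"normal": 7, "extended": 6}


def slot_duration_ms():
    """Duration of one slot: a frame is split into 10 subframes of 2 slots."""
    return FRAME_MS / (SUBFRAMES_PER_FRAME * SLOTS_PER_SUBFRAME)


def resource_elements_per_rb(cyclic_prefix):
    """Subcarrier/symbol positions in one resource block for a given prefix type."""
    return SUBCARRIERS_PER_RB * SYMBOLS_PER_SLOT[cyclic_prefix]


print(slot_duration_ms(), resource_elements_per_rb("normal"),
      resource_elements_per_rb("extended"))
```

With the normal cyclic prefix, one resource block therefore spans 84 time-frequency positions; with the extended prefix it spans 72.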
B. Reference Signal Structure
As in many other OFDM systems (e.g. DVB-T), pilot symbols (called reference
signals in LTE) are inserted during subcarrier mapping in both the time and
the frequency direction such that the receiver can estimate time-variant
radio channels. One major enhancement in LTE is the introduction of multiple
antennas. In contrast to the MIMO-OFDM channel estimation problems typically
considered in academia [1], [8], [12], the reference signals transmitted from
multiple antennas are orthogonal to each other. As a consequence, the channel
impulse response between different Tx-Rx antenna pairs can be estimated
individually. We therefore consider only a single transmit and a single
receive antenna in our system model. The received OFDM symbol $\mathbf{y}$ at
one receive antenna can be written as

$$\mathbf{y} = \mathbf{X}\mathbf{h} + \mathbf{w}, \tag{1}$$
Copyright 2011 IEEE. Published in Proc. 17th European Wireless Conference (EW 2011), April, Vienna, Austria
Fig. 1: Reference Signal Mapping in LTE (one time-frequency grid per antenna port, ports 1 to 4; axes: time and frequency)
where the vector $\mathbf{h}$ contains the channel coefficients in the
frequency domain and $\mathbf{w}$ is additive zero-mean white Gaussian noise
with variance $\sigma_w^2$ at the receive antenna. The matrix $\mathbf{X}$
comprises the permuted data symbols $\mathbf{x}_d$ and pilot symbols
$\mathbf{x}_p$ on its main diagonal. The permutation is given by the
permutation matrix $\mathbf{P}$:

$$\tilde{\mathbf{x}} = \left[\mathbf{x}_p^T \;\; \mathbf{x}_d^T\right]^T, \tag{2}$$

$$\mathbf{x} = \mathbf{P}\tilde{\mathbf{x}}, \tag{3}$$

$$\mathbf{X} = \operatorname{diag}(\mathbf{x}). \tag{4}$$

The vectors $\mathbf{y}$, $\mathbf{h}$ and $\mathbf{w}$ in Eq. (1) can be
divided according to Eq. (2) into two parts:
1) a part corresponding to the pilot symbol positions,
2) a part corresponding to the remaining data symbol positions.
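Written out, this partition decouples Eq. (1) into a pilot part and a data part. The following is a sketch under the assumption that the subscripts $p$ and $d$ are applied to $\mathbf{y}$, $\mathbf{h}$ and $\mathbf{w}$ in the same way as to the symbols, and that $\mathbf{X}_p = \operatorname{diag}(\mathbf{x}_p)$ and $\mathbf{X}_d = \operatorname{diag}(\mathbf{x}_d)$:

$$\mathbf{y}_p = \mathbf{X}_p \mathbf{h}_p + \mathbf{w}_p, \qquad \mathbf{y}_d = \mathbf{X}_d \mathbf{h}_d + \mathbf{w}_d .$$

Because $\mathbf{X}$ is diagonal, the pilot equation contains only known symbols and can be used on its own to estimate $\mathbf{h}_p$.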
III. CHANNEL ESTIMATION IN LTE
In this section, the different types of channel estimation
techniques considered in this paper are explained.
A. LS
The Least-Squares (LS) channel estimator for the subcarriers on which pilot
symbols are located is given by

$$\hat{\mathbf{h}}_p^{\mathrm{LS}} = \mathbf{X}_p^H \mathbf{y}_p. \tag{5}$$

The remaining channel coefficients have to be obtained by interpolation. In
this work, we apply linear interpolation.
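As an illustration of Eq. (5) followed by linear interpolation, consider the sketch below. It assumes unit-modulus pilot symbols, so that multiplying by the conjugate pilot removes the pilot exactly, and it holds the nearest pilot estimate at the band edges; the edge handling, the function names and the toy pilot positions are our assumptions and are not taken from the implementation described in this paper.

```python
def ls_pilot_estimates(y_pilots, x_pilots):
    """Eq. (5) per pilot subcarrier: conj(x_p) * y_p (unit-modulus pilots assumed)."""
    return [x.conjugate() * y for x, y in zip(x_pilots, y_pilots)]


def interpolate_linear(pilot_idx, pilot_vals, num_subcarriers):
    """Linear interpolation between pilots; edge values are held (an assumption)."""
    estimates = []
    for k in range(num_subcarriers):
        if k <= pilot_idx[0]:
            estimates.append(pilot_vals[0])
        elif k >= pilot_idx[-1]:
            estimates.append(pilot_vals[-1])
        else:
            # Index of the closest pilot at or below subcarrier k.
            j = max(i for i, p in enumerate(pilot_idx) if p <= k)
            left, right = pilot_idx[j], pilot_idx[j + 1]
            t = (k - left) / (right - left)
            estimates.append((1 - t) * pilot_vals[j] + t * pilot_vals[j + 1])
    return estimates


# Toy example: noiseless channel sampled at three hypothetical pilot positions.
pilot_positions = [0, 3, 6]
pilots = [1 + 0j, 1j, -1 + 0j]
channel_at_pilots = [1.0, 4.0, 7.0]
received = [x * h for x, h in zip(pilots, channel_at_pilots)]

h_ls = ls_pilot_estimates(received, pilots)
h_full = interpolate_linear(pilot_positions, h_ls, 7)
```

In this noiseless toy case the channel varies linearly over frequency, so the interpolated estimate recovers it on all seven subcarriers.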
B. LMMSE
The Linear Minimum-Mean-Square-Error (LMMSE) channel estimator requires
knowledge of the second-order statistics of the channel and the noise. It
performs better than the LS estimator, but at the cost of higher
computational complexity. The LMMSE channel estimate can be obtained by
filtering the LS estimate

$$\hat{\mathbf{h}}^{\mathrm{LMMSE}} = \mathbf{R}_{\mathbf{h},\mathbf{h}_p} \left(\mathbf{R}_{\mathbf{h}_p,\mathbf{h}_p} + \sigma_w^2 \mathbf{I}\right)^{-1} \hat{\mathbf{h}}_p^{\mathrm{LS}}, \tag{6}$$

where $\mathbf{R}_{\mathbf{h}_p,\mathbf{h}_p}$ is the autocorrelation matrix
of the channel at the pilot symbol positions and
$\mathbf{R}_{\mathbf{h},\mathbf{h}_p}$ is the crosscorrelation matrix between
the channel at the data symbol positions and the channel at the pilot symbol
positions.
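To make the structure of Eq. (6) concrete, the sketch below builds the filtering matrix for the smallest non-trivial case of two pilots, where the inverse has a closed form. The exponential correlation model `corr` and its parameter `rho` are purely hypothetical stand-ins for the true channel statistics, and all names are our own.

```python
def corr(delta, rho=0.9):
    """Hypothetical frequency-correlation model (an assumption, not from the paper)."""
    return rho ** abs(delta)


def lmmse_filter_two_pilots(all_idx, pilot_idx, noise_var):
    """Rows of R_{h,hp} (R_{hp,hp} + sigma_w^2 I)^{-1} for exactly two pilots."""
    p0, p1 = pilot_idx
    a = corr(0) + noise_var      # diagonal entry of R_{hp,hp} + sigma_w^2 I
    b = corr(p0 - p1)            # off-diagonal entry
    det = a * a - b * b
    inv = [[a / det, -b / det], [-b / det, a / det]]  # closed-form 2x2 inverse
    rows = []
    for k in all_idx:
        r = [corr(k - p0), corr(k - p1)]              # row k of R_{h,hp}
        rows.append([r[0] * inv[0][0] + r[1] * inv[1][0],
                     r[0] * inv[0][1] + r[1] * inv[1][1]])
    return rows


def apply_filter(rows, h_ls_pilots):
    """Filter the two LS pilot estimates to obtain one estimate per subcarrier."""
    return [row[0] * h_ls_pilots[0] + row[1] * h_ls_pilots[1] for row in rows]


weights = lmmse_filter_two_pilots([0, 1, 2, 3], (0, 3), noise_var=0.1)
h_lmmse = apply_filter(weights, [1.0 + 0.5j, 0.8 + 0.4j])
```

Without noise the filter passes the pilot estimates through unchanged at the pilot positions; with a positive noise variance it shrinks and smooths them, which is where the gain over pure interpolation comes from.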
C. ALMMSE
The performance of the LMMSE estimator is in general superior to that of the
LS estimator [1], at the cost of higher computational complexity because of
the matrix inversion in Eq. (6). In a real-time implementation, a reduction
of complexity is desired while preserving the performance of the LMMSE
estimator. In this section, we discuss a low-complexity estimator originally
proposed in [2], where the authors applied this estimator in WiMAX. The main
difference between LTE and WiMAX from the channel estimation point of view
is that LTE utilizes distributed pilot symbols for the channel estimation
instead of the preamble utilized in WiMAX [4], [5]. Consequently, the ALMMSE
estimator presented in [2] has to be adapted for the application in LTE.
The two main ideas of the ALMMSE estimator in [2] are:
1) Calculate the LMMSE filtering matrix by using only the correlation between
$L$ neighboring subcarriers instead of the full correlation between all
subcarriers as in the case of LMMSE estimation.
2) Assume that the correlation is frequency independent and estimate a
full-rank $L \times L$ autocorrelation matrix utilizing the LS channel
estimate.
The ALMMSE algorithm adapted for LTE consists of the
following steps:
1) Choose the correlation length $L$ that defines the dimension of
$\hat{\mathbf{R}}_{\mathbf{h}}^{(L)}$ to be $L \times L$. Due to the pilot
structure in LTE, $L$ is bounded by $3 \le L \le K_{\mathrm{sub}}$
($K_{\mathrm{sub}}$ is the number of subcarriers). A small $L$ is generally
desirable from the complexity point of view. However, the performance of the
estimator improves with increasing $L$. If $L = K_{\mathrm{sub}}$ is chosen,
the ALMMSE estimator is equal to the LMMSE estimator.
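2) Choose the interval $I_k$ of $L$ consecutive subcarrier indices according
to the following rule ($k$ is the subcarrier index of the channel coefficient
to be estimated):

$$I_k = \begin{cases}
[1, \ldots, L] & ;\; k \le \frac{L+1}{2} \\
\left[k - \left\lfloor \frac{L-1}{2} \right\rfloor, \ldots, k + \left\lceil \frac{L-1}{2} \right\rceil\right] & ;\; \text{otherwise} \\
[K_{\mathrm{sub}} - L + 1, \ldots, K_{\mathrm{sub}}] & ;\; k \ge K_{\mathrm{sub}} - \frac{L-1}{2}
\end{cases} \tag{7}$$

Let $\mathbf{h}^{(I_k)}$ be the channel vector for the subcarriers from the
chosen interval $I_k$:

$$\mathbf{h}^{(I_k)} = \left[h_{I_k(1)}, \ldots, h_{I_k(L)}\right]^T. \tag{8}$$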
3) Find the $K_p^{(L)} = \frac{L}{3}$ subcarriers on which the pilot symbols
are located within the chosen interval $I_k$. Let $\mathbf{h}_p^{(I_k)}$ be
the vector of channel coefficients on the pilot symbol positions.
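4) Create a permutation matrix $\mathbf{P}$ of dimension $L \times L$ with

$$\tilde{\mathbf{h}}^{(I_k)} = \left[\mathbf{h}_p^{(I_k)\,T} \;\; \mathbf{h}_d^{(I_k)\,T}\right]^T = \mathbf{P}^T \mathbf{h}^{(I_k)}, \tag{9}$$

where $\mathbf{h}_d^{(I_k)}$ is the channel vector on the data positions
within the chosen interval $I_k$.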
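5) Permute $\hat{\mathbf{R}}_{\mathbf{h}}^{(L)}$ with $\mathbf{P}$:

$$\tilde{\mathbf{R}}_{\mathbf{h}}^{(L)} = \mathbf{P}^T \hat{\mathbf{R}}_{\mathbf{h}}^{(L)} \mathbf{P}. \tag{10}$$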
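6) Extract $\hat{\mathbf{R}}_{\mathbf{h}_{\mathrm{LS}}}^{(L)}$ and
$\hat{\mathbf{R}}_{\mathbf{h},\mathbf{h}_{\mathrm{LS}}}^{(L)}$ from
$\tilde{\mathbf{R}}_{\mathbf{h}}^{(L)}$ as

$$\hat{\mathbf{R}}_{\mathbf{h}_{\mathrm{LS}}}^{(L)} = \left(\tilde{\mathbf{R}}_{\mathbf{h}}^{(L)}\right)_{K_p^{(L)},\,K_p^{(L)}}, \tag{11}$$

$$\hat{\mathbf{R}}_{\mathbf{h},\mathbf{h}_{\mathrm{LS}}}^{(L)} = \left(\tilde{\mathbf{R}}_{\mathbf{h}}^{(L)}\right)_{L,\,K_p^{(L)}}. \tag{12}$$

(The operator $(\mathbf{A})_{M,N}$ creates a submatrix of matrix $\mathbf{A}$
which is given by the first $M$ rows and the first $N$ columns.)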
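7) Calculate the filtering matrix $\hat{\mathbf{F}}^{(L)}$:

$$\hat{\mathbf{F}}^{(L)} = \hat{\mathbf{R}}_{\mathbf{h},\mathbf{h}_{\mathrm{LS}}}^{(L)} \left(\hat{\mathbf{R}}_{\mathbf{h}_{\mathrm{LS}}}^{(L)} + \sigma_w^2 \mathbf{I}\right)^{-1}. \tag{13}$$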
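8) Obtain an estimate of the channel coefficients by multiplying (filtering)
the LS estimate on the pilot positions from the chosen interval $I_k$ with
$\hat{\mathbf{F}}^{(L)}$ and permuting (multiplying by $\mathbf{P}^T$).
Finally, the $k$-th element has to be selected:

$$\mathbf{q} = \underbrace{\mathbf{P}^T \hat{\mathbf{F}}^{(L)}}_{\mathbf{F}_l} \hat{\mathbf{h}}_{\mathrm{LS}}^{(I_k)}, \tag{14}$$

$$\hat{h}_{\mathrm{ALMMSE},k} = \begin{cases}
[\mathbf{q}]_k & ;\; k \le \frac{L+1}{2} \\
[\mathbf{q}]_{\left\lfloor \frac{L+1}{2} \right\rfloor} & ;\; \text{otherwise} \\
[\mathbf{q}]_{L+k-K_{\mathrm{sub}}} & ;\; k \ge K_{\mathrm{sub}} - \frac{L-1}{2}
\end{cases} \tag{15}$$

$[\mathbf{q}]_k$ means that the $k$-th element of vector $\mathbf{q}$ is
selected. Since in LTE at least one pilot symbol is transmitted on every
third subcarrier, just three different permutations of the matrix
$\hat{\mathbf{F}}^{(L)}$ have to be calculated if the channel is quasi-static
within one OFDM symbol.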
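The index bookkeeping of steps 2) and 8) can be sketched as follows. This is our reading of Eqs. (7) and (15), assuming 1-based subcarrier indices and floor/ceiling rounding of $(L-1)/2$ so that every interval contains exactly $L$ subcarriers; the function names are illustrative.

```python
def select_interval(k, L, K_sub):
    """Return (first, last) 1-based subcarrier indices of I_k, following Eq. (7)."""
    if 2 * k <= L + 1:                      # k <= (L+1)/2: lower band edge
        return 1, L
    if 2 * k >= 2 * K_sub - (L - 1):        # k >= K_sub - (L-1)/2: upper band edge
        return K_sub - L + 1, K_sub
    first = k - (L - 1) // 2                # interval roughly centred on k
    return first, first + L - 1


def position_in_interval(k, L, K_sub):
    """1-based position of subcarrier k inside I_k, i.e. the index used in Eq. (15)."""
    first, _ = select_interval(k, L, K_sub)
    return k - first + 1


# Example with L = 12 and a hypothetical band of 72 subcarriers.
for k in (1, 6, 7, 40, 67, 72):
    print(k, select_interval(k, 12, 72), position_in_interval(k, 12, 72))
```

Away from the band edges the selected position is always the same, so only the pilot offset inside the interval changes from one subcarrier to the next.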
IV. PROCESSING FLOW
The algorithm presented in Sec. III-C needs further optimization for
real-time implementation. As illustrated in Fig. 2, the processing flow of
the LTE channel estimator includes both LS and ALMMSE channel estimation.
The first stage is the scaling in Eq. (5), which computes the channel
response on the pilot symbol subcarriers. ALMMSE is only used when the UE is
in low-speed mode (the mode in which MIMO is more likely to be used): when
the UE velocity is higher than the threshold velocity $K_v$, LS channel
estimation is used; when it is lower than $K_v$, ALMMSE estimation is used.
To estimate the correlation matrices
$\hat{\mathbf{R}}_{\mathbf{h}_{\mathrm{LS}}}^{(L)}$ and
$\hat{\mathbf{R}}_{\mathbf{h},\mathbf{h}_{\mathrm{LS}}}^{(L)}$, a number of
subframes (those with subframe number $< N$ after the UE enters Connected
Mode) are used for training. During the training phase, the channel
estimator works in LS mode and calculates the correlation matrix from the
results of the LS estimation. The correlation matrix is stored in a buffer
and updated every subframe. The computation of
$\hat{\mathbf{R}}_{\mathbf{h}_{\mathrm{LS}}}^{(L)}$ and
$\hat{\mathbf{R}}_{\mathbf{h},\mathbf{h}_{\mathrm{LS}}}^{(L)}$ is only needed
once the training phase is finished and the estimator enters ALMMSE mode.
In ALMMSE mode, three coefficient matrices $\mathbf{F}_1$, $\mathbf{F}_2$,
$\mathbf{F}_3$ need to be computed from the correlation matrices. This
involves the inversion of 4×4 matrices (for 4×2 and 2×2 MIMO), assuming
$L = 12$. However, such an operation is only needed when the SNR changes
significantly. If the SNR estimation is subframe based, the SNR will not
change within one subframe. Thus, the major computational cost of ALMMSE
estimation is the matrix multiplication in Eq. (14).
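The decisions described in this section can be summarized in a small per-subframe planning function. This is a sketch of our reading of the text, not the controller firmware: the parameter `snr_delta_db` (what counts as a significant SNR change), the choice not to update the correlation matrix in high-speed mode, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EstimatorConfig:
    speed_threshold: float    # K_v in the text (unit left open here)
    training_subframes: int   # N in the text
    snr_delta_db: float       # hypothetical threshold for a "significant" SNR change


def plan_subframe(cfg, speed, connected_subframes, snr_db, last_filter_snr_db):
    """Return (mode, update_correlation, recompute_filters) for one subframe."""
    if speed > cfg.speed_threshold:
        # High mobility: fall back to LS estimation with interpolation.
        return "LS", False, False
    if connected_subframes < cfg.training_subframes:
        # Training phase: LS mode, accumulate the correlation matrix.
        return "LS", True, False
    # ALMMSE mode: recompute F_1, F_2, F_3 only after a significant SNR change.
    recompute = (last_filter_snr_db is None
                 or abs(snr_db - last_filter_snr_db) > cfg.snr_delta_db)
    return "ALMMSE", False, recompute


cfg = EstimatorConfig(speed_threshold=30.0, training_subframes=10, snr_delta_db=1.0)
print(plan_subframe(cfg, speed=3.0, connected_subframes=50,
                    snr_db=10.2, last_filter_snr_db=10.0))
```

Keeping the costly filter update behind the SNR check is what leaves the matrix multiplication in Eq. (14) as the dominant per-subframe workload.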
Fig. 2: Processing Flow of the LTE Channel Estimator (flowchart: scaling; decisions on speed > K_v, connected_time <= N, connected_time == N and the change between SNR_new and SNR_old; update of the correlation matrix; calculation of the correlation matrices and of the filtering matrices F_1, F_2, F_3; FIR interpolation yielding the LS or the ALMMSE estimate)


V. CHANNEL ESTIMATOR ARCHITECTURE
The channel estimator architecture is shown in Fig. 3. It contains two major
parts, the Controller Unit and the Interpolation Unit (IU). Our
implementation allows real-time channel estimation for a Category 4 LTE
modem (with up to 4×2 spatial multiplexing and 20 MHz maximum bandwidth).
A. Controller Unit
The Controller Unit is essentially a finite state machine that performs the
matrix inversion and handles the address calculations and the configuration
of the IU for all tasks, including the extraction of the reference signals
(RS) from the resource blocks and the permutation of the channel estimates.
It contains an address generation unit (AGU) that generates parallel
addresses for the multiple data items that are fetched to compute
$\hat{\mathbf{h}}_p^{\mathrm{LS}}$ in Eq. (5). As presented in Sec. III-C, in
order to exploit the channel statistics, ALMMSE channel estimation involves
operations such as the matrix inversion defined in Eq. (13). Hence, the
Controller also contains a 16-bit floating-point datapath (with a
multiplier, an accumulator and a reciprocal unit to compute scalar
inverses). The matrix inversion method presented in [6] is used to compute
the updated channel autocorrelation $\mathbf{R}_{hh}$. Since the inverse is
needed only when the SNR changes significantly (which happens at a much
lower rate than the symbol rate), it can easily be handled by the controller
software.

Fig. 3: Block Diagram of the Channel Estimator Implementation (blocks: AGU, Controller with Floating-point Unit, R_hh and Coef storage, Interpolation Unit with four-way CMAC; input: symbols after CP removal and FFT; output: estimated channel coefficients H)
B. Interpolation Unit (IU)
The IU is responsible for computing the actual
$\hat{\mathbf{h}}_p^{\mathrm{LS}}$ and the channel response at the data
resource elements by interpolating $\hat{\mathbf{h}}_p^{\mathrm{LS}}$. The
main operation involved is matrix-vector multiplication. The IU mainly
consists of a 16-bit fixed-point four-way Single Instruction Multiple Data
(SIMD) Complex Multiply-ACcumulate (CMAC) unit. Thanks to the regularity of
the LTE RS locations, the interpolation coefficients can be efficiently
stored in a small look-up table to reduce the hardware cost. The
implementation of the four-way SIMD CMAC unit is similar to the one
presented in [6].
Note that in this paper, the channel estimator is designed as a hardware
accelerator in order to focus on the signal processing of channel estimation
and to make it easier to quantify the silicon cost. For more flexibility and
hardware reuse, the channel estimator can also be mapped in pure software to
a baseband DSP processor such as [16].
VI. LINK-LEVEL PERFORMANCE
In order to evaluate the performance of the channel estimation algorithms,
simulations are carried out using a standard-compliant 3GPP LTE simulator
[3]. The simulator is implemented partly in Matlab and partly in C. It
includes the complete physical layer signal processing such as
timing/frequency synchronization [15], channel estimation, subcarrier
demapping, rate matching [14] and turbo decoding. H-ARQ [13] based on the
CRC of the coded blocks is also enabled to support up to three
retransmissions. Ped-B [7] is selected as the channel model. It is assumed
that the channel is quasi-static within one OFDM symbol duration. The
bandwidth is set to 5 MHz in the simulation and the velocity of the user
equipment is 3 km/h. The parameter $L$ is set to 12. Perfect synchronization
and ML detection are assumed in order to focus the simulation on channel
estimation performance.
Fig. 4 shows the throughput results. When a 16-bit floating-point datatype
is used, the performance degrades severely at high SNR compared to IEEE
double precision. This is mainly due to the fact that the matrices involved
in the ALMMSE processing are nearly singular and require a sufficiently high
numerical precision. When 64-bit floating point is used, no degradation is
observed. In order to minimize the number of bits and the hardware cost, SNR
under-estimation (regularization) is used, which sets a fixed $\sigma_w^2$
value when the SNR is higher than a threshold (12 dB in this paper). This
also reduces the amount of processing needed in ALMMSE. The result shows
that a 16-bit datatype with SNR under-estimation incurs a negligible
degradation compared to 64-bit processing without SNR under-estimation.
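SNR under-estimation amounts to clamping the SNR before it is converted into the noise variance used in Eq. (13). A minimal sketch, assuming unit signal power so that $\sigma_w^2 = 10^{-\mathrm{SNR}/10}$; the function name and the `signal_power` parameter are our own.

```python
def regularized_noise_variance(snr_db, threshold_db=12.0, signal_power=1.0):
    """Noise variance for Eq. (13) with the SNR clamped at threshold_db."""
    effective_snr_db = min(snr_db, threshold_db)   # never assume more than the threshold
    return signal_power / (10.0 ** (effective_snr_db / 10.0))


for snr in (5.0, 12.0, 20.0):
    print(snr, regularized_noise_variance(snr))
```

Above the threshold the diagonal loading $\sigma_w^2 \mathbf{I}$ no longer shrinks, which keeps the matrix to be inverted away from singularity, and the filtering matrices do not need to be recomputed for any SNR change in that region.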
Fig. 4: Coded Throughput (rate 0.602, 16-QAM) (axes: SNR [dB] versus throughput [Mbit/s]; curves: PERFECT, LMMSE, ALMMSE 64-bit, ALMMSE 16-bit regularized, ALMMSE 16-bit, LS)
Fig. 5 shows the MSE for the different LTE channel estimators. The LMMSE
channel estimator outperforms the remaining channel estimators. However, its
hardware implementation requires the most computational power. The ALMMSE
channel estimator using IEEE double precision shows a 4 dB SNR improvement
compared to the LS channel estimator. Unfortunately, an implementation using
IEEE double precision is too costly. The ALMMSE channel estimator using a
16-bit implementation is unstable at high SNR. However, with SNR
under-estimation, its performance is close to that of the ALMMSE estimator
with IEEE double precision. This channel estimator offers a good
performance-complexity trade-off.
VII. IMPLEMENTATION
The channel estimator is implemented using the ST CMOS 65 nm process
libraries and the Synopsys low-power design flow.
Fig. 5: MSE of the different channel estimators (axes: SNR [dB] versus MSE, logarithmic scale; curves: LMMSE, ALMMSE 64-bit, ALMMSE 16-bit regularized, ALMMSE 16-bit, LS)
Table I lists the synthesized gate count and the working frequency.

Parameter                     Value
Area of Controller (kgate)    17
Area of IU (kgate)            32
Total Area (kgate)            49
Working Frequency (MHz)       200
TABLE I: Implementation Cost Estimate
In LTE, assuming that channel estimation has to be performed every subframe
(1 ms), up to 1200 channel coefficients have to be estimated per
transmit-receive antenna pair to support the full 20 MHz downlink bandwidth.
The proposed architecture running at 200 MHz can handle ALMMSE estimation
for up to 4×2 MIMO systems in real time.
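A rough cycle budget illustrates why this is plausible. The sketch below is our own estimate, not a measured figure: it assumes that each ALMMSE coefficient costs $K_p^{(L)} = L/3 = 4$ complex multiply-accumulates for $L = 12$, that all four CMAC lanes are fully utilized, and it ignores LS scaling, memory access and control overhead.

```python
# Rough cycle budget for ALMMSE filtering of one subframe (all names illustrative).
CLOCK_HZ = 200_000_000
SUBFRAMES_PER_SECOND = 1000
COEFFS_PER_ANTENNA_PAIR = 1200     # full 20 MHz bandwidth, as stated in the text
TX_ANTENNAS, RX_ANTENNAS = 4, 2
L = 12
PILOTS_PER_INTERVAL = L // 3       # K_p^(L): one CMAC per pilot tap (assumption)
SIMD_WAYS = 4                      # four-way CMAC unit

cycles_available = CLOCK_HZ // SUBFRAMES_PER_SECOND
cmacs_needed = (TX_ANTENNAS * RX_ANTENNAS
                * COEFFS_PER_ANTENNA_PAIR * PILOTS_PER_INTERVAL)
cycles_needed = cmacs_needed // SIMD_WAYS   # assumes all four lanes stay busy
print(cycles_needed, cycles_available, cycles_needed / cycles_available)
```

Under these idealized assumptions the filtering itself uses only a small fraction of the cycles available in one subframe, which leaves headroom for the overheads that this estimate ignores.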
VIII. CONCLUSIONS
The results show that algorithm-architecture co-optimization can further
simplify the ALMMSE channel estimation algorithm. A short wordlength can be
used to allow a low-cost ASIC implementation. With SNR under-estimation, a
16-bit floating-point datatype provides sufficient precision to support the
4×4 matrix inversion involved in ALMMSE with negligible degradation of
performance. A trade-off between performance and complexity has been reached
at a feasible silicon cost, with a 1 dB throughput gain compared to LS
channel estimation.
IX. ACKNOWLEDGEMENT
This work has been funded by the Christian Doppler Laboratory for Wireless
Technologies for Sustainable Mobility under the supervision of Christoph
Mecklenbräuker. The authors thank their industrial partners A1 Telekom
Austria AG and KATHREIN-Werke KG. Furthermore, the financial support by the
Federal Ministry of Economy, Family and Youth and the National Foundation
for Research, Technology and Development is gratefully acknowledged. The
work of J. Eilert, D. Wu and D. Liu was funded by the EU FP7 MultiBase
Project in partnership with Ericsson AB et al.
REFERENCES
[1] Q. Wang, D. Wu, J. Eilert and D. Liu, "Cost Analysis of Channel Estimation in MIMO-OFDM for Software Defined Radio," in Proc. IEEE Wireless Communications & Networking Conference, April 2008.
[2] C. Mehlführer, S. Caban and M. Rupp, "An Accurate and Low Complex Channel Estimator for OFDM WiMAX," in Proc. IEEE ISCCSP, March 2008.
[3] C. Mehlführer, M. Wrulich, J. C. Ikuno, D. Bosanska and M. Rupp, "Simulating the Long Term Evolution Physical Layer," in Proc. of the 17th European Signal Processing Conference (EUSIPCO 2009), Glasgow, Scotland, Aug. 2009.
[4] 3GPP, "Technical Specification Group Radio Access Network; Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Channels and Modulation," Tech. Spec. 36.211 V8.4.0, Sept. 2008.
[5] IEEE, "IEEE Standard for Local and Metropolitan Area Networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems," 2004.
[6] J. Eilert, D. Wu and D. Liu, "Implementation of a Programmable Linear MMSE Detector for MIMO-OFDM," in Proc. IEEE ICASSP, 2008.
[7] ITU, "Recommendation ITU-R M.1225: Guidelines for Evaluation of Radio Transmission Technologies for IMT-2000 Systems," 1998.
[8] M. Šimko, C. Mehlführer, M. Wrulich and M. Rupp, "Doubly Dispersive Channel Estimation with Scalable Complexity," in Proc. IEEE WSA, 2010.
[9] S. Haene, D. Perels and A. Burg, "A Real-Time 4-Stream MIMO-OFDM Transceiver: System Design, FPGA Implementation, and Characterization," IEEE Journal on Selected Areas in Communications, vol. 26, no. 6, pp. 877-889, August 2008.
[10] J. Lofgren, S. Mehmood, N. Khan, B. Masood, M. Awan, I. Khan, N. A. Chisty and P. Nilsson, "Hardware implementation of an SVD based MIMO OFDM channel estimator," in Proc. NORCHIP, 2009.
[11] J. Berkmann, C. Carbonelli, F. Dietrich, C. Drewes and W. Xu, "On 3G LTE Terminal Implementation - Standard, Algorithms, Complexities and Challenges," in Proc. IWCMC, 2008.
[12] M. Šimko, C. Mehlführer, T. Zemen and M. Rupp, "Inter-Carrier Interference Estimation in MIMO OFDM Systems with Arbitrary Pilot Structure," in Proc. IEEE VTC Spring, 2011.
[13] J. C. Ikuno, C. Mehlführer and M. Rupp, "A Novel LEP Model for OFDM Systems with HARQ," in Proc. IEEE ICC, 2011.
[14] J. C. Ikuno, S. Schwarz and M. Šimko, "LTE Rate Matching Performance with Code Block Balancing," in Proc. European Wireless Conference (EW), 2011.
[15] Q. Wang, C. Mehlführer and M. Rupp, "Carrier Frequency Synchronization in the Downlink of 3GPP LTE," in Proc. IEEE PIMRC, 2010.
[16] A. Nilsson, E. Tell and D. Liu, "An 11 mm² 70 mW Fully-Programmable Baseband Processor for Mobile WiMAX and DVB-T/H in 0.12 µm CMOS," IEEE Journal of Solid-State Circuits, vol. 44, no. 1, pp. 90-97, 2009.
