Sensors 21 04861 v2
Article
Machine Learning-Based 5G-and-Beyond Channel Estimation
for MIMO-OFDM Communication Systems
Ha An Le 1 , Trinh Van Chien 2,3 , Tien Hoa Nguyen 1 , Hyunseung Choo 4 and Van Duc Nguyen 1, *
Abstract: Channel estimation plays a critical role in the system performance of wireless networks. In
addition, deep learning has demonstrated significant improvements in enhancing the communication
reliability and reducing the computational complexity of 5G-and-beyond networks. Even though
least squares (LS) estimation is popularly used to obtain channel estimates due to its low cost
without any prior statistical information regarding the channel, this method has relatively high
estimation error. This paper proposes a new channel estimation architecture with the assistance of
deep learning in order to improve the channel estimation obtained by the LS approach. Our goal is
achieved by utilizing a MIMO (multiple-input multiple-output) system with a multi-path channel
profile for simulations in 5G-and-beyond networks under the level of mobility expressed by the
Doppler effects. The system model is constructed for an arbitrary number of transceiver antennas,
while the machine learning module is generalized in the sense that an arbitrary neural network
architecture can be exploited. Numerical results demonstrate the superiority of the proposed deep
learning-based channel estimation framework over the other traditional channel estimation methods
popularly used in previous works. In addition, bidirectional long short-term memory offers the
best channel estimation quality and the lowest bit error ratio among the considered artificial neural
network architectures.
Keywords: machine learning; channel estimation; MIMO-OFDM; frequency selective channels
Citation: Le, H.A.; Van Chien, T.; Nguyen, T.H.; Choo, H.; Nguyen, V.D. Machine Learning-Based
5G-and-Beyond Channel Estimation for MIMO-OFDM Communication Systems. Sensors 2021, 21, 4861.
https://doi.org/10.3390/s21144861
Academic Editor: Giovanni Pau
effects should be estimated and compensated at the receiver. For this purpose, the pilot
signals, which are exploited to perform the channel estimation, should be known to both the
transmitter and receiver. In a 5G system, the structure of the pilot symbols in each
data frame could be varied depending on the different use cases in practice [6]. We note
that, among the traditional channel estimation methods, least squares (LS) estimation is
well-known as a low computational complexity method because this estimation requires
no prior channel statistics [7,8]. However, LS estimation provides relatively high channel
estimation errors in many practical applications, especially for multi-path channels. As an
alternative solution, minimum mean square error (MMSE) estimation yields much better
channel estimation quality than LS estimation by minimizing the channel estimation errors
on average [9]. The closed-form expression of the channel estimates obtained by the
MMSE estimation relies on the assumption that, for instance, the propagation channels are
modeled by a linear system, while each channel response follows a circularly symmetric
complex Gaussian distribution [10,11]. Nonetheless, the MMSE estimation usually has high
computational complexity since channel statistic information (i.e., the mean values and the
covariance matrices of the propagation channels) is required. In many propagation
environments, this statistical information is either extremely difficult to obtain or varies quickly
within a short coherence time, making MMSE estimation challenging to implement [12,13].
Machine learning has recently attracted a great deal of attention in both academia
and industry for various applications of wireless communications, such as radio resource
allocation, physical security, signal decoding, and channel estimation [14–18]. Regarding
the channel estimation application, the authors in [19] reported the use of a trained deep
neural network (DNN) model with the help of a pilot signal to estimate underwater channels
in an efficient manner. In [20], the authors suggested exploiting the channel correlation
in both time and frequency domains with a DNN model to perform channel estimation
for the IEEE 802.11p standard. Furthermore, in [21], the authors investigated the effects of
the channel estimation phase for a wireless energy transfer system and demonstrated that
downlink channel estimation is necessary to harvest energy feedback information. In the
considered system, a DNN structure makes better channel estimates than the traditional
estimations comprising the LS estimation and the linear MMSE (LMMSE) estimation. We
emphasize that several sophisticated techniques have been applied to estimate channel
state information (CSI) to date. In a MIMO system, we could assume in practice that the
CSI from each antenna at the BS shares the same autocorrelation pattern for enhancing
the channel estimation quality of a particular terminal [22]. By effectively deploying this
property and arranging the CSI from the multiple antennas into a matrix, the system can
exploit a well-known technique from the fields of image recognition and image denoising
[15,23–25] to predict the pattern of CSI variation by means of the channel structure.
In particular, a convolutional neural network (CNN) is applied in [26] for channel estimation
in a mmWave Massive MIMO system to reduce noise from the estimated channel, thus
outperforming the traditional counterparts. In [27], the authors proposed a CNN-based
scheme to predict channels in a large-scale MIMO system as the channels age. The authors
in [28] used a deep CNN to enhance the channel estimation quality while retaining high
performance compared to the traditional methods by utilizing less pilot overhead. The
numerical results showed that the data-driven method remarkably improved the prediction
quality. However, the authors in those papers did not consider the influences of Doppler
frequencies, which can cause significant changes in the channels over time and even make
the channels nonstationary. In addition, the velocity of the receiver may often vary; thus,
it is important to evaluate the effect of the mismatch of the Doppler frequency between
the training and testing stages of a DNN model. Another approach is to treat instantaneous
channels as time series data and then consider the CSI estimation as a typical time
series learning problem. In this case, there exist several powerful
architectures in the literature that can track the long-term correlation of the channel profile
effectively, including long short-term memory (LSTM) [29] and the gated recurrent unit
(GRU) [30]. The authors in [31] suggested a scheme that integrates an LSTM network and
2. System Model
In this section, we present a MIMO-OFDM system that comprises a transmitter sending
signals to a receiver as illustrated in Figure 1. The transmitter and receiver antenna
2.1. Transmitter
At the transmitter side, the binary data are first encoded and mapped to quadrature
amplitude modulation (QAM) symbols by the modulation block. We suppose that the
system transmits data in T time slots, and the QAM symbols at time slot t, t = 1, ..., T,
are combined into a data vector $x(t) \in \mathbb{C}^{N}$ as
where N is the total number of modulation symbols. The encoded data are then separated
into the $N_T$ vectors corresponding to the $N_T$ transmit antennas as follows:
The data for each antenna are converted from serial to parallel, and then the pilot
signals, which are known to both the transmitter and receiver, are inserted along with the
data in every layer for channel estimation purposes. We denote by $x_a(t)$, with $a = 1, \ldots, N_T$,
the signal vector with a pilot inserted into the corresponding data $x_i(t)$; then, the
IFFT (inverse fast Fourier transform) block is applied to $x_a(t)$ such that the signals are
transformed from the frequency domain into the time domain (denoted by $\tilde{x}_a(t)$) as
[Block diagram. Transmitter, per antenna branch: Bits, S/P, Pilot Insertion, IFFT, P/S, Cyclic Prefix
Insertion. Receiver, per antenna branch: Cyclic Prefix Removal, S/P, FFT, Equalization, P/S, with the
LS estimator followed by the DL model.]
Figure 1. The illustration of the considered MIMO-OFDM system model with the proposed DNN-aided
module in blue. In the figure, CP denotes cyclic prefix; S/P denotes serial to parallel; P/S denotes
parallel to serial; IFFT denotes inverse fast Fourier transform; and FFT denotes fast Fourier transform.
After that, the cyclic prefix (CP) with the length $N_G$ is inserted as a guard interval to
alleviate the ISI (inter-symbol interference) by utilizing the CP insertion block. By including
the cyclic prefix, the transmitted signal, denoted by $\tilde{x}_a^g(t)$, is formulated in the time
domain as follows:
$$[\tilde{x}_a^g(t)]_n = \begin{cases} [\tilde{x}_a(t)]_{n + N_{\mathrm{FFT}}}, & n = -N_G, -N_G + 1, \ldots, -1, \\ [\tilde{x}_a(t)]_n, & n = 0, 1, \ldots, N_{\mathrm{FFT}} - 1, \end{cases} \tag{4}$$
where $N_{\mathrm{FFT}}$ is the FFT size. This means that the last $N_G$ samples of $\tilde{x}_a(t)$ are used as a cyclic
prefix and inserted at the beginning of this symbol, resulting in the signal $\tilde{x}_a^g(t)$ with a
length of $N_{\mathrm{FFT}} + N_G$.
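The cyclic prefix rule in (4) can be illustrated with a minimal Python sketch. The function names and the toy sizes ($N_{\mathrm{FFT}} = 8$, $N_G = 3$) are hypothetical and chosen only for illustration; they are not the simulation settings of this paper.

```python
def add_cyclic_prefix(symbol, n_guard):
    """Prepend the last n_guard samples of one time-domain OFDM symbol, as in (4)."""
    if not 0 <= n_guard <= len(symbol):
        raise ValueError("guard length must be between 0 and the FFT size")
    prefix = symbol[len(symbol) - n_guard:]
    return list(prefix) + list(symbol)


def remove_cyclic_prefix(symbol_with_cp, n_guard):
    """Drop the first n_guard samples, mirroring the CP removal block at the receiver."""
    return list(symbol_with_cp[n_guard:])


# Toy time-domain symbol with N_FFT = 8 and N_G = 3 (illustrative values only).
x_time = [complex(n, -n) for n in range(8)]
x_cp = add_cyclic_prefix(x_time, 3)
print(len(x_cp))  # N_FFT + N_G samples
```

Removing the prefix at the receiver returns the original $N_{\mathrm{FFT}}$ samples, which is why the guard interval costs bandwidth but no information.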
The Doppler spectrum of each tap is characterized by a classical Jakes' spectrum shape,
which is expressed as
$$S(f) = \frac{1}{\pi f_d \sqrt{1 - \left( \frac{f}{f_d} \right)^2}}, \quad |f| < f_d, \tag{5}$$
where $f_d$ (Hz) is the maximum Doppler shift; i.e., $f_d = v f_c / c$, for a given speed $v$ (m/s) and a
carrier frequency $f_c$ (Hz), with $c \approx 3 \times 10^8$ m/s being the speed of light. The auto-correlation of
Jakes' Doppler spectrum is [34]
$$R(\tau) = \int_{-f_d}^{f_d} S(f) e^{j 2\pi f \tau} \, df = J_0(2\pi f_d \tau), \tag{6}$$
where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. From the continuous form in (6),
the discrete form of the auto-correlation function is defined as follows:
where $l$ and $T_{\mathrm{sym}}$ are the symbol index and the symbol duration, respectively. We denote
$h_{a,b}(\tau_l, t)$ as the time-variant channel impulse response from the a-th transmission antenna
($a = 1, \ldots, N_T$) to the b-th receiver antenna ($b = 1, \ldots, N_R$), where $\tau_l$ is the transmission
delay at the l-th tap of the propagation channels. A mathematical description of the
frequency-selective and time-variant channel model is given in [35] as follows:
$$h_{a,b}(\tau_l, t) = \sum_{l=0}^{L-1} h_l \, \delta(\tau_l - t) \times \exp\left\{ j \left[ 2\pi f_{D,l}(t - \tau_l) - 2\pi f_c \tau_l \right] \right\}, \tag{8}$$
where $l$ is the index of taps, $h_l$ represents the l-th resolved amplitude, and $\tau_l$ represents the
excess delay of the l-th tap. $f_{D,l} = v(t) f_c \cos[\theta_l] / c$ is the Doppler frequency induced by
the relative movement of the Tx and Rx, $v(t)$ represents the relative velocity, $\theta_l$ denotes the
aggregate phase angle of all components arriving in the l-th tap, and $c$ is the speed of light.
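As a numerical sanity check of the Doppler relations above, the following minimal sketch evaluates the maximum Doppler shift $f_d = v f_c / c$ and the integral in (6). It assumes the substitution $f = f_d \sin\theta$, which removes the singularity of $S(f)$ at $|f| = f_d$ and gives $R(\tau) = \frac{1}{\pi} \int_{-\pi/2}^{\pi/2} \cos(2\pi f_d \tau \sin\theta) \, d\theta$. The function names and the midpoint-rule step count are hypothetical choices.

```python
import math

SPEED_OF_LIGHT = 3.0e8  # m/s, the approximation used in the text


def max_doppler_shift(speed_mps, carrier_hz):
    """Maximum Doppler shift f_d = v * f_c / c."""
    return speed_mps * carrier_hz / SPEED_OF_LIGHT


def jakes_autocorrelation(f_d, tau, steps=2000):
    """Midpoint-rule evaluation of (6) after substituting f = f_d * sin(theta)."""
    x = 2.0 * math.pi * f_d * tau
    step = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -0.5 * math.pi + (k + 0.5) * step
        total += math.cos(x * math.sin(theta))
    return total * step / math.pi


f_d = max_doppler_shift(15.0, 4.0e9)     # 15 m/s at a 4 GHz carrier
print(f_d, jakes_autocorrelation(f_d, 1.0e-3))
```

The value at $\tau = 0$ equals one, and the correlation first vanishes where $2\pi f_d \tau$ reaches the first zero of $J_0$ (about 2.405), which gives a rough feel for how quickly the channel decorrelates at a given Doppler frequency.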
To model the propagation channels in this paper, we exploit the Matlab 5G toolbox [33]
to simulate the instantaneous channels. The 5G-and-beyond channels have the TDL-C
profile displayed in Figure 2, with the color map displaying the channel gain. In more detail,
the channel gain varies from −12 dB to −47 dB. This figure indicates that the considered
channel profile is not sparse, which is a consequence of the mobile communication carrier
frequency being in the sub-6 GHz band; here, the carrier frequency is set to 4 GHz. (The channel
estimation quality could be enhanced if a proper domain in which the channels are sparse were
determined, so that a sparse channel estimation technique could be utilized effectively. This
work is left for the future.) In addition, Figure 3 plots the expectation $E\{H H^H\}$, where $H$ is
the channel matrix of a subcarrier. It shows that all the coefficients are non-zero, therefore
verifying the spatial correlation among the channels.
Figure 2. The 2 time-varying channel profile with $f_d$ = 200 Hz in the 20 OFDM symbols.
Figure 3. The expectation $E\{H H^H\}$, where $H \in \mathbb{C}^{N_T \times N_R}$ is the channel matrix of a subcarrier. Here,
$N_T = N_R = 4$ and $f_d$ = 200 Hz.
By utilizing the channel model in (8) and the transmitted signal in (4), the received
signal after passing through the 5G multi-path channel is formulated as
$$\tilde{y}_b^g(t) = \sum_{a=1}^{N_T} \tilde{h}_{a,b}(\tau, t) \otimes \tilde{x}_a^g(t) + \tilde{n}_b(t), \tag{9}$$
where $\tilde{h}_{a,b}(\tau, t) = [h_{a,b}(\tau_1, t), \ldots, h_{a,b}(\tau_L, t)]$; $\tilde{n}_b(t)$ is the additive noise vector, whose
elements are independent and identically distributed random variables following a circularly
symmetric complex Gaussian distribution with zero mean and variance $\sigma_n^2$. From the
received signal in Equation (9), we are able to estimate the propagation channels and analyze
the system performance as shown below.
2.3. Receiver
At the receiver side, the cyclic prefix is first removed from the received signal $\tilde{y}_b^g(t)$
on each antenna using the cyclic prefix removal module to obtain the vector $\tilde{y}_b(t)$ of
length $N_{\mathrm{FFT}}$. The signal is then converted to the parallel form and transformed into the
frequency domain by the FFT block, which gives a frequency domain signal $y_b(t)$ of
The pilot signal is extracted from the frequency-domain signal for channel estimation
purposes. After estimating the channel, the received signal $y_b(t)$ is equalized and aggregated
into a serial sequence from all the receiver antennas by the layer demapping module.
The signal is then demodulated by the demodulation scheme, which corresponds to the
approach used by the transmitter. At this point, the output of the MIMO-OFDM system
model is obtained as the final binary data sequence.
$$r(n) = \frac{1}{\sqrt{2}} \left[ 1 - 2c(2n) \right] + j \frac{1}{\sqrt{2}} \left[ 1 - 2c(2n + 1) \right], \tag{11}$$
where c(i ) is the pseudo-random sequence and is defined by a length-31 Gold sequence
as [36]
where mod is the modulo operator, and the first 31 elements of the sequences $x_1(n)$ and $x_2(n)$ are
initialized as
$$x_1(n) = \begin{cases} 1, & n = 0, \\ 0, & n = 1, 2, \ldots, 30, \end{cases} \tag{15}$$
$$c_{\mathrm{init}} = \sum_{n=0}^{30} x_2(n) 2^n. \tag{16}$$
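In the initialization of the sequence $x_2(n)$, the value of $c_{\mathrm{init}}$ depends on the application
of the sequence $c(n)$. In the channel estimation application, the value of $c_{\mathrm{init}}$ is calculated
as [36]
$$c_{\mathrm{init}} = \left[ 2^{17} \left( N_{\mathrm{symb}}^{\mathrm{slot}} n_{s,f} + l + 1 \right) \left( 2 N_{\mathrm{ID}}^{n_{\mathrm{SCID}}} + 1 \right) + 2 N_{\mathrm{ID}}^{n_{\mathrm{SCID}}} + n_{\mathrm{SCID}} \right] \bmod 2^{31}, \tag{17}$$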
where k denotes the subcarrier index, NP = NFFT /D f is the number of pilot signals in
an OFDM symbol, and ∆ defines the pilot position in the frequency domain for each
transmission antenna, the value of which can be found in Table 7.4.1.1.2-1 in [36].
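The pilot sequence generation described above can be sketched as follows. The recursions for $x_1(n)$ and $x_2(n)$ and the offset $N_C = 1600$ are taken from the generic pseudo-random sequence definition in 3GPP TS 38.211, Section 5.2.1, since the corresponding equations are not reproduced in this section; the function names and the example value of $c_{\mathrm{init}}$ are hypothetical.

```python
import math

NC = 1600  # offset N_C from 3GPP TS 38.211, Section 5.2.1


def gold_sequence(c_init, length):
    """Return c(0), ..., c(length - 1) of the length-31 Gold sequence."""
    total = NC + length + 31
    x1 = [0] * total
    x2 = [0] * total
    x1[0] = 1                                   # initialization of x1, see (15)
    for n in range(31):
        x2[n] = (c_init >> n) & 1               # binary digits of c_init, see (16)
    for n in range(total - 31):
        x1[n + 31] = (x1[n + 3] + x1[n]) % 2
        x2[n + 31] = (x2[n + 3] + x2[n + 2] + x2[n + 1] + x2[n]) % 2
    return [(x1[n + NC] + x2[n + NC]) % 2 for n in range(length)]


def pilot_symbols(c_init, count):
    """Map the binary sequence to QPSK pilot symbols r(n) as in (11)."""
    c = gold_sequence(c_init, 2 * count)
    scale = 1.0 / math.sqrt(2.0)
    return [complex(scale * (1 - 2 * c[2 * n]), scale * (1 - 2 * c[2 * n + 1]))
            for n in range(count)]


pilots = pilot_symbols(12345, 16)  # 12345 is an arbitrary example seed
print(pilots[:2])
```

Each pilot has unit magnitude, so dividing a received pilot by the transmitted one during LS estimation does not amplify the noise unevenly across subcarriers.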
[Two panels: antenna port versus subcarrier index with pilot spacing $D_f$, and antenna port versus
symbol index with pilot spacing $D_t$.]
Figure 4. The pilot structure used in the considered MIMO-OFDM system.
3.1. Motivations
As long as no inter-carrier interference occurs, each subcarrier can be expressed as
an independent channel, therefore preserving the orthogonality among the subcarriers.
The orthogonality allows each subcarrier component of the signal in (10) to be expressed
as the Hadamard product of the transmitted signal and channel frequency response at the
subcarrier [34] as
$$y_b(t) = \sum_{a=1}^{N_T} h_{a,b}(t) \odot x_a(t) + n_b(t), \tag{21}$$
where $\odot$ denotes the Hadamard product, and $n_b(t)$, $h_{a,b}(t)$, and $x_a(t)$ are the Fourier transforms
of the noise, channel, and signal, respectively; that is, all quantities in (21) are expressed in the
frequency domain.
Of all the traditional channel estimation methods, LS estimation is one of the most
common approaches. We denote by $\hat{h}_{\mathrm{LS}_b}$ the channel estimate from the transmission
antennas at the b-th receiver antenna obtained by this estimation method. LS estimation
gives the closed-form expression of the channel estimate as [8]
$$\hat{h}_{\mathrm{LS}_b}(t) = \left[ X(t)^H X(t) \right]^{-1} X^H(t) \, y_b(t), \tag{22}$$
is the NP × ( NT NP ) matrix, denoting the transmitted signal from the transmission antennas;
NP is the number of the pilot signals in an OFDM symbol; and (·) T is the regular transpose.
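The channel estimate from each transmission antenna can be formulated as
$$\hat{h}_{\mathrm{LS}_{bi}}(t) = \left[ \left[ \hat{h}_{\mathrm{LS}_b}(t) \right]_{(i-1) N_P}, \ldots, \left[ \hat{h}_{\mathrm{LS}_b}(t) \right]_{i N_P - 1} \right]^T, \quad i = 1, \ldots, N_T. \tag{24}$$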
Then, the channel responses from all sub-carriers can be obtained by applying a linear
interpolation method. It should be noted that LS estimation is a widely-used estimation
approach because of its simplicity. Nevertheless, this technique does not exploit the side
information from noise and statistical channel properties, such as the spatial correlation
among antennas, in the estimation, and thus high channel estimation error can occur when
applying LS estimation for propagation environments with a high mobility.
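To make the LS step and the subsequent linear interpolation concrete, the sketch below treats the simplified case of a single transmission antenna, for which $X(t)$ is diagonal and (22) reduces to an element-wise division of the received pilots by the transmitted pilots. This simplification, the pilot positions (subcarriers 0, $D_f$, $2D_f$, ...), the edge handling, and all names are assumptions made for illustration only.

```python
def ls_estimate_at_pilots(y_pilot, x_pilot):
    """LS estimate for one transmission antenna: (22) with diagonal X reduces to y / x."""
    return [y / x for y, x in zip(y_pilot, x_pilot)]


def interpolate_linear(h_pilot, spacing, n_subcarriers):
    """Linearly interpolate pilot-position estimates to all subcarriers.

    Pilots are assumed at subcarriers 0, spacing, 2*spacing, ...; subcarriers
    beyond the last pilot reuse the last pilot estimate.
    """
    h_full = []
    for k in range(n_subcarriers):
        left = min(k // spacing, len(h_pilot) - 1)
        right = min(left + 1, len(h_pilot) - 1)
        weight = (k - left * spacing) / spacing if right != left else 0.0
        h_full.append((1 - weight) * h_pilot[left] + weight * h_pilot[right])
    return h_full


# Noise-free toy channel that varies linearly over 8 subcarriers (illustrative only).
h_true = [(1 + 0.5j) + 0.1 * k for k in range(8)]
pilot_idx = [0, 2, 4, 6]
x_pilot = [1, -1, 1j, -1j]
y_pilot = [h_true[k] * x for k, x in zip(pilot_idx, x_pilot)]
h_ls = ls_estimate_at_pilots(y_pilot, x_pilot)
h_full = interpolate_linear(h_ls, 2, 8)
```

In this noise-free toy case the interpolated estimate matches the true channel between pilots; with noise, each pilot estimate is perturbed independently, which is the weakness the learning-based estimators aim to correct.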
To cope with the above drawbacks, one can utilize the LMMSE estimation approach,
which minimizes the mean square error. For LMMSE estimation, the channel estimate is
formulated in the closed form expression as [34]
$$\hat{h}_{\mathrm{LMMSE}_{bi}}(t) = R_{h \hat{h}_{\mathrm{LS}_{bi}}} \left( R_{hh} + \frac{\sigma_n^2}{\sigma_x^2} I_{N_P} \right)^{-1} \hat{h}_{\mathrm{LS}_{bi}}(t), \quad i = 1, \ldots, N_T, \tag{25}$$
where $\hat{h}_{\mathrm{LMMSE}_{bi}}(t)$ is the LMMSE estimated channel from the i-th transmission antenna
at the b-th receiver antenna; $R_{hh} = E\{h h^H\}$ is the auto-correlation matrix of the channel
response in the frequency domain with the size of $N_P \times N_P$; $R_{h \hat{h}_{\mathrm{LS}_{bi}}} = E\{h \hat{h}_{\mathrm{LS}_{bi}}^H\}$ is the
cross-correlation between the actual channel and the channel estimate obtained by the
LS estimation with the size of $N_{\mathrm{FFT}} \times N_P$; $\sigma_x^2$ is the variance of the transmitted signals;
and $I_{N_P}$ is the identity matrix of size $N_P \times N_P$. The impacts of both noise and
spatial correlation among the antennas are taken into account by LMMSE estimation, which
is able to improve the channel estimation accuracy. However, LMMSE estimation requires
the prior knowledge of channel statistical properties; thus, the computational complexity
is higher than LS estimation. Additionally, since it may be difficult to obtain the exact
distribution of channel impulse responses in general [38], the performance of the LMMSE
estimation cannot always be guaranteed.
$$o = f(z) = f\left( \sum_{i=1}^{M} w_i x_i + b \right), \tag{26}$$
where M is the number of inputs to the neuron for which xi is the i-th input (i = 1, . . . , M);
wi is the i-th weight corresponding to the i-th input; b is a bias; and o is the output of this
neuron. In Equation (26), $f(\cdot)$ is an activation function that is used to characterize the
non-linearity of the channel data. In our proposed FDNN-based channel estimation, we
adopt the tanh function as the activation function, which is defined as
$$f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \tag{27}$$
where e is Euler’s number. To minimize the mean square error, the FDNN-based channel
estimation is used to learn the actual channel information provided by the channel estimates
obtained from the LS estimation as the input. In more detail, we define a realization of the
input for the training process as
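$$M_{n-\mathrm{FDNN}} = \left\{ \mathrm{Re}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_0 \right\}, \mathrm{Im}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_0 \right\}, \ldots, \mathrm{Re}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_K \right\}, \mathrm{Im}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_K \right\} \right\}, \tag{28}$$
where $\hat{h}_{\mathrm{LS}}^n(t)$ is the LS-estimated channel gathered from all receiver antennas, where the
superscript n denotes the n-th realization; K is the number of channel samples that the FDNN can
handle; and the Re{·} and Im{·} operators give the real and imaginary parts of a complex
number, respectively. The output of the neural network is formulated as
$$O_{n-\mathrm{FDNN}} = \left\{ \mathrm{Re}\left\{ [\hat{h}^n(t)]_0 \right\}, \mathrm{Im}\left\{ [\hat{h}^n(t)]_0 \right\}, \ldots, \mathrm{Re}\left\{ [\hat{h}^n(t)]_K \right\}, \mathrm{Im}\left\{ [\hat{h}^n(t)]_K \right\} \right\}, \tag{29}$$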
where $\hat{h}^n(t)$ is the output of the neural network at the n-th realization. In
Equations (28) and (29), we separate the channel estimates into the real and imaginary
parts to handle the complex numbers for the use of the FDNN neural network. The learning
process handles the one-by-one mapping as
$$\mathrm{Re}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_s \right\}, \mathrm{Im}\left\{ [\hat{h}_{\mathrm{LS}}^n(t)]_s \right\} \rightarrow \mathrm{Re}\left\{ [\hat{h}^n(t)]_s \right\}, \mathrm{Im}\left\{ [\hat{h}^n(t)]_s \right\}, \quad s = 0, \ldots, K. \tag{30}$$
Ideally, the output of the neural network should be identical to the actual channels.
In other words, the purpose of the FDNN-aided estimation is to minimize the MSE between
the predicted and actual channels on average; thus, the loss function utilized for the
training phase is defined as
$$L_{\mathrm{FDNN}}(\mathcal{W}, \mathcal{B}) = \frac{1}{NK} \sum_{n=1}^{N} \sum_{t=1}^{T} \left\| \hat{h}^n(t) - h^n(t) \right\|_2^2, \tag{31}$$
where N is the number of realizations used for training, and hn (t) is the actual channel
corresponding to ĥn (t). W and B include all the weights and biases, respectively. From a
set of initial values, the weights and biases are updated by minimizing the loss function (31)
with forward and backward propagation [15].
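The building blocks of the FDNN estimator can be sketched in a few lines of plain Python: the real and imaginary stacking of (28), one fully connected layer of neurons (26) with the tanh activation (27), and a mean squared error in the spirit of (31). This is only an illustrative sketch with hypothetical names; the normalization here simply averages over all samples supplied, and no training loop is shown.

```python
import math


def split_real_imag(h):
    """Stack complex samples as [Re h_0, Im h_0, Re h_1, Im h_1, ...], as in (28)."""
    out = []
    for value in h:
        out.extend([value.real, value.imag])
    return out


def dense_tanh(inputs, weights, biases):
    """One fully connected layer: each row of weights is one neuron as in (26)-(27)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mse_loss(h_hat_batch, h_true_batch):
    """Average squared error between predicted and actual channel samples."""
    total, count = 0.0, 0
    for h_hat, h_true in zip(h_hat_batch, h_true_batch):
        for a, b in zip(h_hat, h_true):
            total += abs(a - b) ** 2
            count += 1
    return total / count


features = split_real_imag([1 + 2j, 3 - 4j])
hidden = dense_tanh(features, [[0.1, 0.0, -0.1, 0.0]], [0.0])
print(features, hidden)
```

Splitting each complex sample into two real features doubles the input width, which is why the input and output sizes of the network scale with $2 N_T N_R$.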
shown in Figure 6. As depicted in the figure, the proposed CNN consists of a 2D input
layer, convolution layers, activation layers, and a linear layer. The 2D input layer takes
the LS-estimated channel as an input, which is separated into the real part and imaginary
part and reshaped to a 2D matrix form. The channel matrix is then fed to the convolution
layers. We denote by $\mathcal{L}$ the set of convolution layers of the CNN. Each convolution layer
$l \in \mathcal{L}$ includes $c_l$ convolution kernels of size $k_l \times k_l$ that are convolved with the layer input
$I_l \in \mathbb{R}^{a_{l-1}^1 \times a_{l-1}^2 \times c_{l-1}}$, where $a_{l-1}^1$ and $a_{l-1}^2$ are the sizes of the $(l-1)$-th convolution layer.
The output of the l-th convolution layer, $O_l \in \mathbb{R}^{a_l^1 \times a_l^2 \times c_l}$, is
$$O_l = \mathrm{Conv}(I_l, w_l) + b_l, \quad l \in \mathcal{L}, \tag{32}$$
where $w_l \in \mathbb{R}^{k_l \times k_l \times c_l}$ and $b_l \in \mathbb{R}^{a_l^1 \times a_l^2 \times c_l}$ are the weights and biases of the convolution
kernel for the l-th convolution layer, respectively, and $\mathrm{Conv}(\cdot, \cdot)$ is the convolution operator.
For the proposed CNN model, after each convolution layer, we apply the well-known
rectified linear unit (ReLU) activation layer, which is given as
...
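In particular, to train the CNN model, we first reshape the LS-estimated channel from
all antennas into the matrix form $\hat{H}_{\mathrm{LS}}^n \in \mathbb{C}^{N_T N_R \times N_{\mathrm{FFT}}}$, separate it into a real part and
an imaginary part, and then define a realization of the input for the training process as
$$M_{n-\mathrm{CNN}} = \left\{ \mathrm{Re}\left\{ \hat{H}_{\mathrm{LS}}^n \right\}, \mathrm{Im}\left\{ \hat{H}_{\mathrm{LS}}^n \right\} \right\}. \tag{34}$$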
which contains the real and imaginary matrices of the channel estimates. The CNN model
is trained to handle the following matrix mapping as
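$$\mathrm{Re}\left\{ \hat{H}_{\mathrm{LS}}^n \right\}, \mathrm{Im}\left\{ \hat{H}_{\mathrm{LS}}^n \right\} \rightarrow \mathrm{Re}\left\{ \hat{H}^n \right\}, \mathrm{Im}\left\{ \hat{H}^n \right\}. \tag{36}$$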
The purpose of applying the CNN model is to minimize the mean square error
between the estimated and the true channels. Therefore, we use the loss function, which is
defined as follows:
$$L_{\mathrm{CNN}}(\mathcal{W}, \mathcal{B}) = \frac{1}{N} \sum_{n=1}^{N} \left\| \hat{H}^n - H^n \right\|_F^2, \tag{37}$$
where N is the number of realizations used for training, and $H^n$ is the actual channel
in the matrix shape corresponding to $\hat{H}^n$. $\mathcal{W}$ and $\mathcal{B}$ include all the weights and biases,
respectively. During the training process, the weights and biases of the CNN are
updated by minimizing the loss function (37). We stress that the loss function (37) uses the
same training data as that in Equation (31), but the data are structured differently. Specifically,
the instantaneous channels are stacked in the vector form in Equation (31), while they are
arranged in a matrix form in Equation (37) to make use of the benefits of the CNN.
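The convolution layer in (32) followed by a ReLU activation can be illustrated for a single input channel and a single kernel. The sketch assumes no padding and, as in most deep learning libraries, does not flip the kernel (a cross-correlation); neither choice is specified in this section, and a scalar bias is used for brevity. The function names and toy values are hypothetical.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Single-channel sliding-window product without padding, plus a scalar bias."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[bias + sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]


def relu(feature_map):
    """Rectified linear unit applied element-wise: negative values are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 3 x 3 input standing in for the real part of a reshaped channel matrix.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[1.0, 0.0], [0.0, -1.0]]
feature_map = relu(conv2d_valid(image, kernel, bias=5.0))
print(feature_map)
```

Because the same small kernel slides over the whole channel matrix, neighboring antennas and subcarriers share weights, which is how the CNN exploits the local structure that the vector-form FDNN input does not expose.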
where f (·) is the activation function; ht and ht−1 are the hidden states at the time step t
and t − 1, respectively; xt and Yt are the input and the output at the time step t; Wih ,Whh ,
and Who are the weights for the input layer to the hidden layer, the hidden layer to the next
hidden layer, and the hidden layer to the output layer, respectively; and bih , bhh , and bho
are the corresponding biases.
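[Figure: unrolled RNN structure. The inputs X0, X1, ..., Xn feed a chain of RNN cells that pass the
hidden states h0, h1, ..., hn-1 forward, and a linear layer after each cell produces the outputs
Y0, Y1, ..., Yn.]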
However, the simple RNN cell has several weaknesses: first, it has no ability to exploit
the future information of the data, while the channel at the time step t has a relation
not only with the past but also the future. Thus, the bidirectional network should be
used in this case to obtain better performance. Second, another problem with using a
simple RNN cell is that it cannot capture long-term information. One solution for this
problem is to use LSTM instead. Consequently, in this paper, we propose a bidirectional-
long short-term memory (bi-LSTM) network for 5G channel estimation to overcome the
above-mentioned weaknesses.
The structure of the proposed bi-LSTM network for the channel estimation is illustrated
in Figure 8. In the bi-LSTM structure, the simple RNN cell is replaced by the
corresponding LSTM cell, which has the structure shown at the top of Figure 8. The
computation of the LSTM cell is given by the following equations [41]:
$$f_t = f(W_f h_{t-1} + U_f X_t + b_f), \tag{40}$$
$$i_t = f(W_i h_{t-1} + U_i X_t + b_i), \tag{41}$$
$$c'_t = \tanh(W_c h_{t-1} + U_c X_t + b_c), \tag{42}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot c'_t, \tag{43}$$
$$o_t = f(W_o h_{t-1} + U_o X_t + b_o), \tag{44}$$
$$h_t = o_t \odot \tanh(c_t), \tag{45}$$
$$Y_t = W H_t + b, \tag{46}$$
where $\odot$ denotes the element-wise product, $H_t$ is the hidden state concatenated from the forward
hidden state $h_t$ and the backward hidden state $h'_t$, and $W$ and $b$ are the weights and biases of the
linear layer,
respectively. Therefore, the bi-LSTM approach can exploit the relation of both history and
the future with the data in the current time step. To apply the bi-LSTM model for our
system, we first gather the LS-estimated channels from all antennas and then define a
realization of the input for the training process as
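$$M_{n-\mathrm{bi-LSTM}} = \left\{ \left[ \mathrm{Re}\left\{ \hat{h}_{\mathrm{LS}}^n(0) \right\}; \mathrm{Im}\left\{ \hat{h}_{\mathrm{LS}}^n(0) \right\} \right], \ldots, \left[ \mathrm{Re}\left\{ \hat{h}_{\mathrm{LS}}^n(L-1) \right\}; \mathrm{Im}\left\{ \hat{h}_{\mathrm{LS}}^n(L-1) \right\} \right] \right\}, \tag{47}$$
where L is the sequence length considered for the bi-LSTM network. Note that the input of the
bi-LSTM, $\hat{h}_{\mathrm{LS}}$, is the LS-estimated channel for all $N_T \times N_R$ channel streams, so the number of
features for the input is $2 N_T N_R$. The output of the bi-LSTM network is the corresponding
true channel as
$$O_{n-\mathrm{bi-LSTM}} = \left\{ \left[ \mathrm{Re}\left\{ \hat{h}^n(0) \right\}; \mathrm{Im}\left\{ \hat{h}^n(0) \right\} \right], \ldots, \left[ \mathrm{Re}\left\{ \hat{h}^n(L-1) \right\}; \mathrm{Im}\left\{ \hat{h}^n(L-1) \right\} \right] \right\}, \tag{48}$$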
Figure 8. The structure of an LSTM cell (top) and the structure of the proposed bi-LSTM approach
(bottom).
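The LSTM cell update in (40)–(45) and the concatenation of forward and backward hidden states can be sketched in plain Python as follows. The sketch assumes that the gate activation $f(\cdot)$ is the sigmoid function (consistent with the σ blocks in Figure 8) and that the products between gates and states are element-wise; the parameter layout (a dictionary with keys "f", "i", "c", "o") and all names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, h, U, x, b):
    """Compute W h + U x + b for list-of-lists matrices."""
    return [sum(w * hv for w, hv in zip(W[r], h)) +
            sum(u * xv for u, xv in zip(U[r], x)) + b[r]
            for r in range(len(b))]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update; params[name] = (W, U, b) for name in 'f', 'i', 'c', 'o'."""
    def gate(name, activation):
        W, U, b = params[name]
        return [activation(z) for z in affine(W, h_prev, U, x_t, b)]

    f_t = gate("f", sigmoid)        # forget gate, (40)
    i_t = gate("i", sigmoid)        # input gate, (41)
    c_cand = gate("c", math.tanh)   # candidate cell state, (42)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]  # (43)
    o_t = gate("o", sigmoid)        # output gate, (44)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                     # (45)
    return h_t, c_t


def bidirectional_states(xs, fwd_params, bwd_params, hidden_size):
    """Concatenate forward and backward hidden states into H_t for every time step."""
    def run(seq, params):
        h, c, out = [0.0] * hidden_size, [0.0] * hidden_size, []
        for x in seq:
            h, c = lstm_step(x, h, c, params)
            out.append(h)
        return out

    forward = run(xs, fwd_params)
    backward = run(xs[::-1], bwd_params)[::-1]
    return [f + b for f, b in zip(forward, backward)]
```

A linear layer as in (46) would then map each concatenated state $H_t$ to the output $Y_t$; it is omitted here to keep the sketch short.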
The purpose of using a bi-LSTM network is to minimize the MSE between the predicted
channel and the true channel; thus, the MSE loss function is considered. The objective
function of the bi-LSTM network is expressed as
$$L_{\mathrm{bi-LSTM}}(\mathcal{W}, \mathcal{B}) = \frac{1}{NL} \sum_{n=1}^{N} \sum_{i=0}^{L-1} \left\| \hat{h}^n(i) - h^n(i) \right\|_2^2, \tag{49}$$
where hn (i ) is the true channel corresponding to ĥn (i ); W and B are all the weights and
biases of bi-LSTM; N is the total number of training samples; and the superscript n denotes
the n-th training sample. The loss function can be minimized by updating W and B using
gradient descent algorithms. We note that this paper considers the perfect instantaneous
channels to be available for the training stage; the case of imperfect
channel state information is a potential extension of our work in the future.
Remark 1. The deep learning-based channel estimation framework studied in this paper is based
on the assumption that the perfect CSI is available during the training stage. Such information can
be very accurately estimated by the orthogonal pilot signals with a sufficiently large power budget.
Even though these conditions for the pilot signals increase the cost for the training stage, the neural
networks can learn the channel profile properly. The effects of imperfect channels on the training of
neural networks, along with the resulting performance reduction in the testing stage, are of
practical interest and are left for future work.
channel estimations. The number of arithmetic operations with the dominant costs is used
as the metric to obtain the computational complexity order [7].
For the FDNN-based channel estimation, from (26), we can see that if the model has
H hidden layers, the total number of arithmetic operations has a computational complexity
in the order of
$$C_{\mathrm{FDNN}} = O\left( I n_1 + n_H K + \sum_{i=1}^{H-1} n_i n_{i+1} \right), \tag{50}$$
where I, K, and $n_i$ denote the input size, output size, and the number of neurons in the i-th
hidden layer, respectively. Therefore, for one OFDM symbol, the input and output sizes are
chosen as $I = K = 2 N_T N_R$, and we have $N_{\mathrm{FFT}}$ samples. By using (50), the FDNN model
has a complexity that can be shown as
$$C_{\mathrm{FDNN}} = O\left( N_{\mathrm{FFT}} \left( 2 N_T N_R n_1 + 2 N_T N_R n_H + \sum_{i=1}^{H-1} n_i n_{i+1} \right) \right). \tag{51}$$
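The operation counts in (50) and (51) can be evaluated with a short helper. The function names are hypothetical, and the hidden layer sizes in the example (100 and 50 neurons) are illustrative placeholders rather than the configuration used in the simulations.

```python
def fdnn_operations(input_size, output_size, hidden_sizes):
    """Dominant multiplication count in (50): I*n_1 + n_H*K + sum_i n_i*n_{i+1}."""
    if not hidden_sizes:
        raise ValueError("at least one hidden layer is required")
    inner = sum(a * b for a, b in zip(hidden_sizes, hidden_sizes[1:]))
    return input_size * hidden_sizes[0] + hidden_sizes[-1] * output_size + inner


def fdnn_operations_per_ofdm_symbol(n_t, n_r, n_fft, hidden_sizes):
    """Count in (51): I = K = 2*N_T*N_R, repeated for N_FFT samples."""
    size = 2 * n_t * n_r
    return n_fft * fdnn_operations(size, size, hidden_sizes)


# 4 x 4 MIMO with an FFT size of 256; the hidden sizes are illustrative only.
print(fdnn_operations_per_ofdm_symbol(4, 4, 256, [100, 50]))
```

The count grows linearly with the FFT size and with the number of antenna pairs, while the hidden layer widths enter through pairwise products.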
4. Simulation Results
In this section, we evaluate the performance of the proposed deep learning-based
channel estimations over the 5G channel profile and compare it with the traditional
methods; i.e., LS and LMMSE. We also provide an explanation for each obtained result. First,
the settings for the simulation are described, and then the simulation results for three
different aspects are presented and analyzed.
proportions for the training set, validation set, and test set as the FDNN. The parameters
for training those models are shown in Table 5.
Parameters                        Values
MIMO                              4 × 4
FFT size                          256
Subcarrier spacing                15 kHz
Cyclic prefix                     24
Type of modulation                16-QAM
Channel PDP                       TDL-C
Maximum Doppler frequency         36 Hz, 200 Hz
Noise model                       Gaussian noise
Sample frequency                  3.84 MHz

Parameter                         Value
Number of input feature layers    32
Number of LSTM layers             2
Hidden layer size                 100
Sequence length                   256
Activation function               Tanh and Sigmoid

Parameters                        Values
Optimizer                         Adam
Maximum number of epochs          100
Mini-batch size                   32
Training error                    10^−5
Gradient descent accuracy         10^−7
Learning rate                     0.001
Maximum validation failures       6
To investigate the performance of all the considered channel estimations used in the
MIMO-OFDM system through the 5G channel model, two different scenarios corresponding
to the velocity of mobiles were considered. In the first scenario, the receiver moved at
a low speed such that the maximum Doppler frequency was 36 Hz. The pilot symbols
were inserted along with data in both frequency and time domains. In the frequency
domain, we referred to the type 1 configuration of DM-RS as in [36]. In this configuration,
six subcarriers were defined for the DM-RS signal for each physical resource block that
contained 12 subcarriers. Thus, the pilot spacing in the frequency domain was $D_f = 2$ for
both scenarios. In the time domain, the 5G system supported up to 4 pilot symbols in 1
slot that included 14 OFDM symbols. Therefore, in the first scenario, since the channel
slowly changed over time, the pilot spacing in the time domain was $D_t = 14$. In the
second scenario, the system exhibited high-speed mobility, which resulted in the maximum
Doppler frequency of 200 Hz. In this scenario, the setup $D_t = 7$ was used to cope with the
rapid change of the channels over time.
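To relate the two scenarios to physical quantities, the sketch below converts each maximum Doppler frequency into a receiver speed via $v = f_d c / f_c$ and counts the pilots implied by the spacings $D_f$ and $D_t$. It assumes the 4 GHz carrier frequency stated for the channel model also applies to these scenarios; the function and constant names are hypothetical.

```python
SPEED_OF_LIGHT = 3.0e8      # m/s, approximation used in the text
CARRIER_HZ = 4.0e9          # carrier frequency stated for the channel model
SYMBOLS_PER_SLOT = 14       # OFDM symbols in one slot
SUBCARRIERS_PER_PRB = 12    # subcarriers in one physical resource block


def speed_from_doppler(f_d_hz, carrier_hz=CARRIER_HZ):
    """Invert f_d = v * f_c / c to obtain the speed in m/s."""
    return f_d_hz * SPEED_OF_LIGHT / carrier_hz


def pilot_symbols_per_slot(d_t):
    """Number of pilot-bearing OFDM symbols per slot for a time spacing D_t."""
    return SYMBOLS_PER_SLOT // d_t


def pilot_subcarriers_per_prb(d_f):
    """Number of pilot subcarriers per resource block for a frequency spacing D_f."""
    return SUBCARRIERS_PER_PRB // d_f


for f_d, d_t in [(36.0, 14), (200.0, 7)]:
    v = speed_from_doppler(f_d)
    print(f"f_d = {f_d} Hz: v = {v:.1f} m/s ({3.6 * v:.1f} km/h), "
          f"{pilot_symbols_per_slot(d_t)} pilot symbol(s) per slot")
```

Under these assumptions the first scenario corresponds to roughly walking speed and the second to vehicular speed, with the pilot density in time doubled in the latter case.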
Figures 9 and 10 show the MSE of the different channel estimators in the first and second
scenarios, respectively. The transmitted data were modulated with 16-QAM (quadrature amplitude
modulation) in the simulation. As shown in Figures 9 and 10, the MSE of every channel
estimation method declined gradually as the SNR increased. In both scenarios, LS estimation
yielded the worst MSE performance because it does not take the statistical channel information
into account when performing the channel estimation. In contrast, LMMSE estimation exploits
the mean and covariance matrices of the channel, which resulted in better MSE performance
than its LS counterpart. Our proposed deep learning estimators yielded the best MSE
performance compared to the two conventional methods. In detail, the FDNN model showed the
largest MSE among the three deep learning models. This is because the FDNN model has
the simplest structure; thus, it could not learn the structure of the channel as well as the
others. The CNN model, on the other hand, not only could learn deeper features than the
FDNN model but also provided robustness in denoising noisy data. Therefore, the CNN model
yielded better performance than the FDNN model. However, neither the FDNN nor the CNN
model could exploit the relation between channels in the same way as the bi-LSTM model,
which explains the large MSE improvement achieved by the bi-LSTM model. To further clarify
this, the MSE gaps (dB) between the deep learning-based channel estimation methods and the
LMMSE estimation are shown in Figures 11 and 12. From the two figures, it can be seen that
the gaps decrease as the SNR level increases. Thus, the advantage of the deep learning-assisted
methods is largest in the low SNR region. Comparing the two scenarios, the performance
differences are not significant because the denser pilots (Dt = 7) in the second scenario
compensate for the faster channel variation.
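The two quantities discussed above, the MSE of an estimator and the MSE gap in dB between two estimators, can be illustrated with a minimal sketch. It assumes a toy model that is much simpler than the simulated MIMO-OFDM system: a single Rayleigh-fading coefficient h ~ CN(0, 1) observed through one unit-power pilot, y = h + n. The function names are hypothetical. In this toy model the LS estimate is y itself and the LMMSE estimate is y / (1 + noise variance), so the theoretical MSEs are the noise variance and noise variance / (1 + noise variance), respectively.

```python
import numpy as np


def simulate_mse(snr_db: float, num_samples: int = 200_000, seed: int = 0):
    """Monte Carlo MSE of LS and LMMSE estimates in a toy scalar channel.

    Assumptions (not the paper's setup): h ~ CN(0, 1), unit pilot,
    additive noise n ~ CN(0, noise_var) with noise_var = 10^(-SNR/10).
    """
    rng = np.random.default_rng(seed)
    noise_var = 10.0 ** (-snr_db / 10.0)
    h = (rng.standard_normal(num_samples)
         + 1j * rng.standard_normal(num_samples)) / np.sqrt(2.0)
    n = np.sqrt(noise_var / 2.0) * (rng.standard_normal(num_samples)
                                    + 1j * rng.standard_normal(num_samples))
    y = h + n                        # received pilot observation
    h_ls = y                         # LS: ignores channel statistics
    h_lmmse = y / (1.0 + noise_var)  # LMMSE: uses the known channel variance
    mse_ls = float(np.mean(np.abs(h_ls - h) ** 2))
    mse_lmmse = float(np.mean(np.abs(h_lmmse - h) ** 2))
    return mse_ls, mse_lmmse


def mse_gap_db(mse_method: float, mse_reference: float) -> float:
    """MSE gap in dB; negative values mean `mse_method` is the smaller one."""
    return float(10.0 * np.log10(mse_method / mse_reference))


if __name__ == "__main__":
    for snr_db in (-5, 0, 5, 10, 15, 20):
        mse_ls, mse_lmmse = simulate_mse(snr_db)
        print(snr_db, mse_ls, mse_lmmse, mse_gap_db(mse_lmmse, mse_ls))
```

Even in this toy setting, the gap between the estimator that uses statistical knowledge and the one that does not shrinks as the SNR grows, which mirrors the trend reported for Figures 11 and 12. The same `mse_gap_db` helper could be applied to the MSE of a learned estimator relative to LMMSE.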
Figure 9. The MSE of the channel estimate vs. the SNR level with f_D = 36 Hz.
Figure 10. The MSE of the channel estimate vs. the SNR level with f_D = 200 Hz.
Figure 11. The MSE gap (dB) between the deep learning-based channel estimation methods and the
LMMSE estimation with f_D = 36 Hz.
Figure 12. The MSE gap (dB) between the deep learning-based channel estimation methods and the
LMMSE estimation with f_D = 200 Hz.
We also provide the BER performance of the different channel estimation methods for the two
scenarios in Figures 13 and 14, respectively. The trend of the BER performance for the
examined estimators is similar to that of the MSE performance. However, in both
scenarios, the BER performance of the FDNN model is slightly worse than that of the LMMSE
method at SNR = 20 dB. This can be explained by the fact that the loss function has been
defined to minimize the channel estimation errors instead of the BER metric.
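Since the networks are trained on the channel estimation error, the BER is only measured afterwards, once the estimate is used for equalization and detection. The following minimal sketch shows how such a BER measurement could look for Gray-coded 16-QAM. The single-tap Rayleigh channel, the one-tap zero-forcing equalizer, the additive Gaussian model for the estimation error, and all function names are illustrative assumptions rather than the receiver used in the simulation.

```python
import numpy as np

# Gray-coded 4-PAM per axis: bit pairs 00, 01, 11, 10 map to levels -3, -1, +1, +3.
_LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])
_GRAY_BITS = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
_NORM = np.sqrt(10.0)  # average energy of the unnormalized 16-QAM constellation


def _cn(rng, size, var):
    """Circularly symmetric complex Gaussian samples with variance `var`."""
    return np.sqrt(var / 2.0) * (rng.standard_normal(size)
                                 + 1j * rng.standard_normal(size))


def modulate(bits):
    """Map a bit array (length divisible by 4) to unit-energy 16-QAM symbols."""
    b = np.asarray(bits).reshape(-1, 4)
    i_idx = 2 * b[:, 0] + (b[:, 0] ^ b[:, 1])  # Gray pair -> level index
    q_idx = 2 * b[:, 2] + (b[:, 2] ^ b[:, 3])
    return (_LEVELS[i_idx] + 1j * _LEVELS[q_idx]) / _NORM


def demodulate(symbols):
    """Hard-decision demapping back to bits (nearest level on each axis)."""
    def axis_bits(x):
        idx = np.clip(np.round((x * _NORM + 3.0) / 2.0), 0, 3).astype(int)
        return _GRAY_BITS[idx]
    return np.hstack([axis_bits(symbols.real), axis_bits(symbols.imag)]).reshape(-1)


def ber_with_estimate(snr_db, est_error_var, num_symbols=50_000, seed=0):
    """BER when equalizing with a noisy channel estimate (hypothetical model)."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=4 * num_symbols)
    x = modulate(bits)
    h = _cn(rng, num_symbols, 1.0)                 # flat Rayleigh fading
    y = h * x + _cn(rng, num_symbols, 10.0 ** (-snr_db / 10.0))
    h_hat = h + _cn(rng, num_symbols, est_error_var)  # assumed estimation error
    x_hat = y / h_hat                              # one-tap zero-forcing equalizer
    return float(np.mean(demodulate(x_hat) != bits))
```

For example, `ber_with_estimate(20.0, 0.0)` gives the BER with a perfect estimate, while a larger `est_error_var` shows how estimation errors translate into bit errors. The mapping from MSE to BER is nonlinear, which is consistent with the observation that a smaller estimation error does not always yield a lower BER.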
Figure 13. The BER of the channel estimate vs. the SNR level with f_D = 36 Hz.
Figure 14. The BER of the channel estimate vs. the SNR level with f_D = 200 Hz.
sensitive to the Doppler frequency compared to the FDNN and CNN models. The reason is
that the bi-LSTM model exploits the time-varying properties of channels; thus, the Doppler
frequency has more serious effects on the bi-LSTM model. However, the three proposed
models are still robust to the changes of the Doppler frequency and therefore more efficient
than the conventional methods.
Figure 15. The impact of the pilot density on the deep learning-based channel estimations.
Figure 16. The impact of the Doppler frequency on the deep learning-based channel estimations.
Figure 17. The impact of a Doppler frequency mismatch on the deep learning-based channel estimations.
5. Conclusions
In this paper, we have presented the use of different DNN structures, including a
fully connected DNN, a CNN, and a bi-LSTM, to assist the channel estimation process in a
MIMO-OFDM system under different scenarios of fading multi-path channels based
on the TDL-C model defined for 5G networks. The proposed DNN-based channel estimation
framework was trained with the channel estimates from least squares estimation as inputs
and the corresponding perfect channels as targets to obtain the network parameters, i.e.,
the weights and biases. Using the QAM modulation scheme, the performance of the proposed
estimators was compared with that of the conventional LS and LMMSE estimators in terms of
the channel estimation error and the bit error ratio as a function of the SNR level. As the
channel properties were learned effectively, we observed improvements of the proposed deep
learning-aided estimations in terms of reducing the channel estimation error and bit error
ratio. Among the proposed deep learning-based channel estimation approaches, bi-LSTM
showed the greatest reduction in channel estimation error as a consequence of its ability
to exploit the time and frequency correlation among the channels. Furthermore, the
proposed deep learning-based channel estimation approaches exhibited great robustness with
different pilot densities as well as with changes of the Doppler frequency.
Author Contributions: H.A.L. proposed the concept and methodology, conducted software pro-
gramming, and drafted the initial manuscript. T.V.C. provided feedback and revised the manuscript.
T.H.N. revised the manuscript. H.C. provided the research funding and administered the project.
V.D.N. co-proposed the concept, supervised the research, and revised the manuscript. All authors
have read and agreed to the published version of the manuscript.
Funding: This research was supported in part by the Korean government, under the ICT Creative
Consilience program (IITP-2021-2020-0-01821) supervised by the IITP and Mid-Career Research
program (NRF-2020R1A2C2008447) through NRF, and in part by the Vietnam National Foundation
for Science and Technology Development (NAFOSTED) under grant number 102.01-2019.07.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data for the simulation in this paper including noisy channel and
theory channel can be found in this repository:
https://drive.google.com/drive/folders/1KWCS9Yc3jh-IEkXs7rbjR8GW8e4uOzj4?usp=sharing (accessed on 15 July 2021).
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
5G Fifth generation
3GPP Third generation partnership project
BER Bit error ratio
CNN Convolutional neural network
CP Cyclic prefix
DNN Deep neural network
FDNN Fully connected deep neural network
FFT Fast Fourier transform
ISI Inter-symbol interference
LMMSE Linear minimum mean square error
LS Least squares
LSTM Long short-term memory
GRU Gated recurrent unit
MIMO Multiple-input multiple-output
MSE Mean square error
OFDM Orthogonal frequency-division multiplexing
QAM Quadrature amplitude modulation
SNR Signal to noise ratio
TDL-C Tapped delay line type C model
References
1. Andrews, J.G.; Buzzi, S.; Choi, W.; Hanly, S.V.; Lozano, A.; Soong, A.C.K.; Zhang, J.C. What Will 5G Be? IEEE J. Sel. Areas
Commun. 2014, 32, 1065–1082. [CrossRef]
2. Van Chien, T.; Ngo, H.Q.; Chatzinotas, S.; Di Renzo, M.; Ottersten, B. Reconfigurable Intelligent Surface-Assisted Cell-Free
Massive MIMO Systems Over Spatially-Correlated Channels. arXiv 2021, arXiv:2104.08648.
3. Wang, X.; Kong, L.; Kong, F.; Qiu, F.; Xia, M.; Arnon, S.; Chen, G. Millimeter wave communication: A comprehensive survey.
IEEE Commun. Surv. Tutor. 2018, 20, 1616–1653. [CrossRef]
4. Smart 2020: Enabling the Low Carbon Economy in the Information Age; Technical Report; The Climate Group and Global e-
Sustainability Initiative (GeSI): Brussels, Belgium, 2008.
5. Ma, X.; Yang, L.; Giannakis, G. Optimal training for MIMO frequency-selective fading channels. IEEE Trans. Wirel. Commun. 2005,
4, 453–466.
6. Le Ha, A.; Van Chien, T.; Nguyen, T.H.; Choi, W. Deep Learning-Aided 5G Channel Estimation. In Proceedings of the 2021 15th
International Conference on Ubiquitous Information Management and Communication (IMCOM), Seoul, Korea, 4–6 January
2021; pp. 1–7.
7. Björnson, E.; Hoydis, J.; Sanguinetti, L. Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency. Found. Trends®
Signal Process. 2017, 11, 154–655. [CrossRef]
8. Kay, S. Fundamentals of Statistical Signal Processing: Estimation Theory; Prentice Hall: Hoboken, NJ, USA, 1993.
9. Van Chien, T.; Björnson, E.; Larsson, E.G. Joint pilot design and uplink power allocation in multi-cell Massive MIMO systems.
IEEE Trans. Wirel. Commun. 2018, 17, 2000–2015. [CrossRef]
10. Björnson, E.; Hoydis, J.; Sanguinetti, L. Massive MIMO has unlimited capacity. IEEE Trans. Wirel. Commun. 2018, 17, 574–590.
[CrossRef]
11. Van Chien, T.; Ngo, H.Q.; Chatzinotas, S.; Ottersten, B.; Debbah, M. Uplink Power Control in Massive MIMO with Double
Scattering Channels. arXiv 2021, arXiv:2103.04129.
12. Wu, S.; Wang, C.X.; Haas, H.; Alwakeel, M.M.; Ai, B. A non-stationary wideband channel model for massive MIMO communica-
tion systems. IEEE Trans. Wirel. Commun. 2014, 14, 1434–1446. [CrossRef]
13. Peacock, M.; Collings, I.; Honig, M. Unified Large-System Analysis of MMSE and Adaptive Least Squares Receivers for a Class of
Random Matrix Channels. IEEE Trans. Inf. Theory 2006, 52, 3567–3600. [CrossRef]
14. Eisen, M.; Zhang, C.; Chamon, L.F.; Lee, D.D.; Ribeiro, A. Learning optimal resource allocations in wireless systems. IEEE Trans.
Signal Process. 2019, 67, 2775–2790. [CrossRef]
15. Van Chien, T.; Canh, T.N.; Björnson, E.; Larsson, E.G. Power Control in Cellular Massive MIMO with Varying User Activity: A
Deep Learning Solution. IEEE Trans. Wirel. Commun. 2019, 19, 5732–5748. [CrossRef]
16. O’Shea, T.; Hoydis, J. An introduction to deep learning for the physical layer. IEEE Trans. Cogn. Commun. Netw. 2017, 3, 563–575.
[CrossRef]
17. Neumann, D.; Wiese, T.; Utschick, W. Learning the MMSE channel estimator. IEEE Trans. Signal Process. 2018, 66, 2905–2917.
[CrossRef]
18. Zappone, A.; Di Renzo, M.; Debbah, M. Wireless networks design in the era of deep learning: Model-based, AI-based, or both?
arXiv 2019, arXiv:1902.02647.
19. Jiang, R.; Wang, X.; Cao, S.; Zhao, J.; Li, X. Deep Neural Networks for Channel Estimation in Underwater Acoustic OFDM
Systems. IEEE Access 2019, 7, 23579–23594. [CrossRef]
20. Abdul Karim Gizzini, M.C.; Ahmad Nimr, G.F. Deep Learning Based Channel Estimation Schemes for IEEE 802.11p Standard.
IEEE Access 2020, 8, 113751–113765. [CrossRef]
21. Kang, J.M.; Chun, C.J.; Kim, I.M. Deep-learning-based channel estimation for wireless energy transfer. IEEE Commun. Lett. 2018,
22, 2310–2313. [CrossRef]
22. Truong, K.T.; Heath, R.W. Effects of channel aging in massive MIMO systems. J. Commun. Netw. 2013, 63, 338–351. [CrossRef]
23. Guo, S.; Ya, Z.; Zhang, K.; Zuo, W.; Zhang, L. Toward Convolutional Blind Denoising of Real Photographs. In Proceedings of the
IEEE CVPR, Long Beach, CA, USA, 15–20 June 2019; pp. 1712–1722.
24. Kim, J.; Lee, J.K.; Lee, K.M. Accurate Image Super-Resolution Using Very Deep Convolutional Networks. In Proceedings of the
IEEE CVPR, Las Vegas, NV, USA, 27–30 June 2016; pp. 1646–1654.
25. Lee, W.; Kim, M.; Cho, D.H. Deep Power Control: Transmit Power Control Scheme Based on Convolutional Neural Network.
IEEE Commun. Lett. 2018, 22, 1276–1279. [CrossRef]
26. Jin, Y.; Zhang, J.; Ai, B.; Zhang, X. Channel Estimation for mmWave Massive MIMO With Convolutional Blind Denoising
Network. IEEE Commun. Lett. 2020, 24, 95–98. [CrossRef]
27. Yuan, J.; Ngo, H.Q.; Matthaiou, M. Machine Learning-Based Channel Prediction in Massive MIMO With Channel Aging. IEEE
Trans. Wirel. Commun. 2020, 19, 2960–2973. [CrossRef]
28. Dong, P.; Zhang, H.; Li, G.Y.; Gaspar, I.S.; NaderiAlizadeh, N. Deep cnn-based channel estimation for mmwave massive mimo
systems. IEEE J. Sel. Top. Signal Process. 2019, 13, 989–1000. [CrossRef]
29. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012.
30. Cho, K.; van Merrienboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations
using RNN encoder-decoder for statistical machine translation. In Proceedings of the Empirical Methods Natural Lang. Process
(EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
31. Kang, J.M.; Chun, C.J.; Kim, I.M.; Kim, D.I. Deep RNN-Based Channel Tracking for Wireless Energy Transfer System. IEEE Syst.
J. 2020, 14, 4340–4343. [CrossRef]
32. Bai, Q.; Wang, J.; Zhang, Y.; Song, J. Deep Learning-Based Channel Estimation Algorithm Over Time Selective Fading Channels.
IEEE Trans. Cogn. Commun. Netw. 2020, 6, 125–134. [CrossRef]
33. Study on Channel Model for Frequencies from 0.5 to 100 GHz (Release 15). Technical Report. 3GPP TR 38.901. 2018. Available
online: https://www.3gpp.org/DynaReport/38901.htm (accessed on 14 July 2021)
34. Cho, Y.S.; Kim, J.; Yang, W.Y.; Kang, C.G. MIMO-OFDM Wireless Communications with MATLAB; John Wiley & Sons: Hoboken, NJ,
USA, 2010.
35. Gerald, M.; Franz, H. Fundamentals of Time-Varying Communication Channels; Academic Press: Cambridge, MA, USA, 2011.
36. 3GPP TS 38.211. NR; Physical Channels and Modulation. 2017. Available online: https://www.3gpp.org/DynaReport/38211.htm
(accessed on 14 July 2021)
37. Dahlman, E.; Parkvall, S.; Skold, J. 5G NR: The Next Generation Wireless Access Technology; Academic Press: Cambridge, MA,
USA, 2018.
38. Mei, K.; Liu, J.; Zhang, X.; Wei, J. Machine Learning Based Channel Estimation: A Computational Approach for Universal
Channel Conditions. arXiv 2019, arXiv:1911.03886.
39. Kai, Z.; Wangmeng, Z.; Yunjin, C.; Deyu, M.; Lei, Z. Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image
Denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
40. Kai, Z.; Wangmeng, Z.; Lei, Z. Beyond a Gaussian Denoiser: FFDNet: Toward a fast and flexible solution for CNN-based image
denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622.
41. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016.
42. Matthiesen, B.; Zappone, A.; Besser, K.L.; Jorswieck, E.; Debbah, M. A Globally Optimal Energy-Efficient Power Control
Framework and Its Efficient Implementation in Wireless Interference Networks. IEEE Trans. Signal Process. 2020, 68, 3887–3902.
[CrossRef]
43. He, K.; Sun, J. Convolutional neural networks at constrained time cost. In Proceedings of the Conference on Computer Vision
and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
44. Sak, H.; Senior, A.; Beaufays, F. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary
Speech Recognition. arXiv 2014, arXiv:1402.1128.