
Deep Learning in Network-Level Performance Prediction Using Cross-Layer Information

Qi Cao, Man-On Pun, Senior Member, IEEE, and Yi Chen

IEEE Transactions on Network Science and Engineering, Vol. 9, No. 4, July-August 2022, pp. 2364-2377.

Abstract—Wireless communication networks are conventionally designed in model-based approaches through utilizing performance metrics such as the spectral efficiency and bit error rate. However, from the perspective of wireless service operators, network-level performance metrics such as the 5%-tile user data rate and network capacity are far more important. Unfortunately, it is difficult to mathematically compute such network-level performance metrics in a model-based approach. To cope with this challenge, this work proposes a data-driven machine learning approach to predict these network-level performance metrics by utilizing customized deep neural networks (DNN). More specifically, the proposed approach capitalizes on cross-layer information from both the physical (PHY) layer and the medium access control (MAC) layer to train customized DNNs, which was considered impossible for the conventional model-based approach. Furthermore, a robust training algorithm called weighted co-teaching (WCT) is devised to overcome the noise existing in the network data due to the stochastic nature of wireless networks. Extensive simulation results show that the proposed approach can accurately predict two network-level performance metrics, namely the user average throughput (UAT) and the acknowledgment (ACK)/negative acknowledgment (NACK) feedback.

Index Terms—Cross-layer information, machine learning, network-level performance.

Manuscript received June 7, 2021; revised February 17, 2022; accepted March 26, 2022. Date of publication March 29, 2022; date of current version June 27, 2022. This work was supported by the National Key Research and Development Program of China under Grant 2020YFB1807700. Recommended for acceptance by Dr. Guoliang Xing. (Corresponding author: Man-On Pun.)
Qi Cao is with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China (e-mail: caoqi@cuhk.edu.cn).
Man-On Pun is with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China, with Pengcheng Laboratory, Shenzhen, Guangdong 518055, China, and also with the Shenzhen Research Institute of Big Data, Shenzhen, Guangdong 518172, China (e-mail: simonpun@cuhk.edu.cn).
Yi Chen is with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, China, and also with the Shenzhen Research Institute of Big Data, Shenzhen, Guangdong 518172, China (e-mail: yichen@cuhk.edu.cn).
Digital Object Identifier 10.1109/TNSE.2022.3163274

I. INTRODUCTION

CONVENTIONALLY, wireless communication networks are designed based on mathematical models that are established with expert experience. Such models usually focus on one single network layer, for example, the PHY or MAC layer, as it is considered impossible to develop one model to unify information from multiple layers. For instance, it is rather challenging to mathematically characterize both the average user throughput and the packet ACK/NACK rate in one unified mathematical model, as these two performance metrics measure two vastly different aspects of the network performance. As a result, most existing models can only measure the so-called link-level performance. However, it is much more desirable to investigate the network-level performance by taking into account available information from all network layers.

More specifically, it is critical for network designers to understand network-level performance metrics such as the network capacity, the average user data rate and the 5%-tile user data rate (the data rate of the worst 5% of users). The main challenges in modeling these network-level performance metrics stem from the following difficulties. First, the wireless communication network is highly complex, with many components and protocols, which renders the whole system analytically intractable. Second, it is difficult to accurately characterize the channel and user behavior using channel transfer functions, user distribution and motion models. Finally, network events such as user arrival and traffic load are mostly stochastic. For these reasons, two existing approaches have been developed to evaluate network performance in the literature. The first approach is to over-simplify the system's mathematical model to approximate the network-level performance. However, the performance of such approximations is far from satisfactory. Alternatively, the other existing approach is to develop network simulators to predict or model the network-level performance. Such an approach has been widely adopted in the wireless communications industry. Despite the large discrepancy between simulation and field test results, network simulators are still the more preferable choice for studying the network-level performance of a large network. However, the development of network simulators is prohibitively expensive and labor-intensive.

In the meantime, powerful machine learning techniques have been recently developed and successfully applied in many engineering areas such as image and linguistic processing. Built upon the ever-increasing computing power and the availability of Big Data, machine learning techniques are characterized by their data-driven approach, which is particularly suitable for the data-rich wireless communication networks. However, to the best of our knowledge, there are only a few existing works on utilizing data from multiple network layers to understand network behaviors and subsequently optimize network design. In the following sections, we first review these related works before summarizing our main contributions.

A. Related Works and Main Contributions

There were early studies exploiting machine learning in radio resource management (RRM) for wireless networks [1], [2]. For instance, power allocation in multi-user interference channels is a classic NP-hard problem due to the combinatorial nature of the problem. With the goal being to maximize the weighted sum rate (WSR), traditional convex optimization theory can reach solutions that are close to the global optimum using an iterative algorithm, namely the weighted minimum mean-squared error (WMMSE) algorithm. However, the WMMSE algorithm is of high complexity and thus time-consuming [3]. In [4], a five-layer fully connected neural network is built to learn from the resulting solutions of WMMSE via supervised learning. On this basis, the work in [5] adopts the negative WSR as the loss function to train an ensembling deep neural network to solve the same power allocation problem, showing better performance in the high-SNR regime (>10 dB). However, there is a big impediment hindering the practical implementation of such DNNs, which is the dynamic number of users. It has been shown from a theoretical perspective that the graph neural network (GNN) is a powerful solver for combinatorial problems, as it is adaptively scalable according to the number of entities [6]. Thus, by using a GNN to solve the same power allocation problem, the WSR is increased by more than 2% with respect to WMMSE, which always finds a local optimum.

The above power allocation methods require accurate channel state information (CSI). However, in many current communication networks, accurate CSI of each user equipment (UE) may not be available, especially in frequency division duplex (FDD) systems. A more common practice is that each UE feeds back its channel quality indicator (CQI) to the base station, and the base station determines the communication scheme based on the CQI [7]. In [8], a machine learning-based solution is studied that uses only accessible communication overhead data, such as the CQI, on the transmit side. Based on a two-cell model, the study applies reinforcement learning to allocate limited transmit power to 10 UEs working in the same frequency band. The work shows that reinforcement learning is superior to the traditional algorithm in terms of the 5%-tile and median UE data rates. Furthermore, the work [9] shows that it is sufficient to effectively adjust the modulation and coding scheme (MCS) selection dynamically when the base station (BS) only has the UE's CQI feedback.

Apart from power allocation, a remarkable application of deep learning in the PHY layer is the end-to-end learning-based wireless communication system. The basic idea behind it is that a communication system is similar to a neural network in the sense that both systems have inputs and outputs. By replacing the encoding/modulation and decoding/demodulation modules with DNNs (also known as the autoencoder and autodecoder, respectively), the whole system can be automatically optimized by unsupervised learning. Interestingly, it has been observed from [10]–[15] that the trained autoencoder works like a conventional channel coder when the BS has redundant bits to represent the messages. In contrast, the trained autoencoder behaves like modulation when the BS lacks sufficient bits to describe the messages. However, training the neural network requires that the channel model be able to represent all non-linear properties of the system while remaining differentiable, which is difficult to achieve in real-life systems. To cope with this problem, [16] proposes to use another CNN to approximate the gradient via supervised learning. Finally, the actor-critic reinforcement learning algorithm has been applied to handle user scheduling and content caching at the same time [17], [18].

Apart from the problems in the PHY layer, [19] studies an RL-based resource block (RB) allocation scheduler, which selects the momentarily best scheduler for each transmission time interval (TTI). As the conventional schedulers always focus on some particular key performance indicator (KPI), the RL-based scheduler can flexibly choose the best scheduling rule among the conventional schedulers to achieve customized goals. Alternatively, the work [20] uses an RL-based framework to adjust the parameters of the proportional fairness (PF) scheduler, which can also allocate RBs better than conventional schedulers. Using a similar methodology, [21] improves the quality of service (QoS) for an unmanned aerial vehicle (UAV)-based immersive live system. In addition, the authors in [22] studied a dense small-cell network and proposed to capitalize on deep Q-learning (DQN) to reduce the end-to-end delay.

There are also studies concerning other aspects of RRM. For instance, actor-critic reinforcement learning is utilized to solve the user allocation problem aiming at more energy-efficient strategies [23], while the work [24] considers the user allocation problem from the handover point of view. Also, the authors in [25] effectively reduce the energy consumption in base station sleeping control with a data-driven method. Furthermore, in [26]–[28], data-driven signal recognition and modulation classification problems were investigated, showing impressive performance when compared with model-driven methods.

B. Contributions

Most studies discussed above focus on utilizing information from only one network layer, but they neglect to verify whether a network is predictable or not, especially when it comes to the multi-layer architecture. In contrast with the above studies, this work considers a DNN structure to predict the network-level performance by exploiting information from both the PHY and MAC layers. In [29], we have made some initial attempts to explore the feasibility of such a CNN structure and achieved some preliminary results. In this work, we will rigorously define the average UE throughput specifically designed for our proposed framework. Furthermore, we will extend our investigations to the prediction of the UE average throughput as well as the ACK/NACK feedback. Finally, we will provide an in-depth elaboration on the applications of such predictions for network parameter fine-tuning. The main contributions of this work are summarized as follows:


• To our best knowledge, this work is the first successful attempt to demonstrate that it is feasible to accurately predict network-level performance using DNNs by exploiting both PHY and MAC information derived from various network counters of vastly different natures and complex network mechanisms such as out loop link adaptation (OLLA) and proportional fairness (PF) user scheduling. Specifically, we design two DNN structures to predict two important network-level performance metrics, namely the UE average throughput (UAT) and the ACK/NACK outcome of a transmission, respectively;
• The performance prediction is highly data-dependent, the measured real-world data always has very high randomness, and the performance is affected by complicated factors such as scheduling algorithms, user behaviors and so on. However, accurate labels are essential for training DNNs, and the stochastic nature of the wireless communication system causes noisy data. We formulate the noisy data cleaning task as a bi-level optimization problem and propose a robust weighted co-teaching algorithm to circumvent the problem;
• Predicting network-level performance is ultimately about improving the QoS/QoE for all users. Leveraging the predictive capability of our trained DNNs, we can depict the MCS landscape for any user, and thus guide the decision-making process during MCS selection. This application is utilized as an example to demonstrate the feasibility of better utilizing the network resources to improve users' QoS by a data-driven methodology.

Note that the aim of this work is to validate the feasibility of network-level performance prediction. However, this work can be extended to other applications. For instance, the link-level model is usually over-simplified for the benefit of expediting the system-level simulation, which is known as the link-to-system mapping (L2SM). The mapping is mainly aimed at providing the outcome of a transport block (TB) transmission, i.e., whether the TB is successfully received or not [30]. In the literature, there are many reported results studying L2SM from an information-theoretic point of view [31]–[35]. Unfortunately, these studies fail to consider cross-layer information as their cross-layer models become analytically intractable. In this work, we devise a new approach to replace the L2SM module for network simulation.

In the sequel, we will first introduce the wireless network simulator settings in Section II, while the network data preparation and the network-level performance prediction tasks are elaborated in Section III. After that, two customized DNN structures are developed in Section IV before two training algorithms are proposed in Section V. Finally, extensive simulation results are shown in Section VI, followed by the conclusion given in Section VII.

II. WIRELESS COMMUNICATION NETWORK BASICS

We begin with reviewing some basic concepts of the typical communication system that we are dealing with extensively in this work. Specifically, we consider a single-cell wireless communication network with a BS operating in the 2 GHz frequency band in the LTE FDD mode. K resource block groups (RBG) are set to serve downlink UEs. Each RBG consists of several RBs. We consider the downlink transmission scenario in which both the BS and the UEs are equipped with two antennas. In contrast to most works in the literature that studied the full-buffer transmission, we consider the bursty traffic mode in which each UE has a finite amount of traffic request. Next, we will elaborate on four basic transmission mechanisms that are adopted in our network simulator, namely the OLLA for MCS, the transmission block formation, the proportional fairness user scheduling scheme, and the Hybrid Automatic Repeat reQuest and Retransmission (HARQ). It will be clear that it is non-trivial to model all these mechanisms mathematically using the model-based approach.

A. Out Loop Link Adaptation (OLLA)

The LTE protocol allows the UE to suggest an appropriate MCS to be used in the next transmission, which is aimed at achieving a pre-defined block error rate (BLER). To propose such a suggestion, the UE actually selects its desirable MCS by sending back a CQI value as a quantized reference. Typically, each CQI, representing a signal-to-noise ratio (SNR) interval, is periodically measured and reported. Thus, MCSs are indeed selected by mapping the received instantaneous SNR into its interval. The challenge about the MCS selection is that it cannot be either too aggressive (i.e., too high) or too conservative (i.e., too low). A higher selected MCS leads to a larger transport block (TB) size while incurring a higher BLER. In the industry, the MCS achieving a BLER of 10% is commonly adopted to maximize the expected TB size while maintaining a high successful transmission rate.

However, this mapping rule is not sufficient to robustly compensate for the discrepancy between the chosen MCS and the optimal MCS for different UEs. Note that every UE may have its own preference. For instance, aged devices might require a relatively lower MCS for the same given channel conditions due to their limited computational power. To cope with this problem, the OLLA algorithm is designed to enable the BS to adaptively update the CQI value q as follows:

\bar{q} = [q + a],   (1)

where [\cdot] is the rounding operator, and \bar{q} is the offset CQI used for MCS selection. Furthermore, a is the adjustment coefficient dynamically updated every time an ACK/NACK flag is fed back. Specifically, let K denote the ACK/NACK flag with

K = \begin{cases} 1, & \text{an ACK received;} \\ 0, & \text{a NACK received.} \end{cases}   (2)

The adjustment coefficient is updated by OLLA as

a = \begin{cases} a + s_A, & K = 1; \\ a + s_N, & K = 0, \end{cases}   (3)

where the update rates s_A > 0 and s_N < 0 are usually customized to achieve a specific BLER level θ. For instance, we can set s_A = 0.01 and s_N = −0.09 to have θ = 10%.
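The OLLA rule (1)-(3) amounts to a few lines of per-UE state-keeping. The following is a minimal sketch under the example parameters above; the function and variable names are ours, not the simulator's. Note that with s_A = 0.01 and s_N = −0.09, the long-run BLER settles near s_A / (s_A + |s_N|) = 10%.

```python
# Minimal sketch of the OLLA update (1)-(3); names are illustrative.

def olla_step(a: float, ack: int, s_A: float = 0.01, s_N: float = -0.09) -> float:
    """Eq. (3): raise a on ACK (K = 1), lower it on NACK (K = 0)."""
    return a + (s_A if ack == 1 else s_N)

def offset_cqi(q: int, a: float) -> int:
    """Eq. (1): offset CQI used for MCS selection, [.] being rounding."""
    return round(q + a)

a = 0.0
for ack in [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]:   # one NACK among nine ACKs
    a = olla_step(a, ack)
print(offset_cqi(10, a))   # reported CQI 10 corrected by the learned offset
```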


B. Transmission Block (TB) Formation

On the BS side, the CQI reported by UEs is one-to-one mapped to an MCS order with a unique spectral efficiency (SE); see the Appendix for details. We denote by M(\bar{q}) the mapping function from the offset CQI \bar{q} to its SE. Then, the estimated data rate of the n-th UE in the k-th RBG can be obtained by

R_{n,k} = |\mathcal{G}_n(k)| \cdot M\left( \frac{1}{|\mathcal{G}_n(k)|} \sum_{\ell \in \mathcal{G}_n(k)} \bar{q}_{n,\ell} \right),   (4)

where \mathcal{G}_n(k) is the set of all RBs in the k-th RBG measured by the n-th UE, while |\cdot| stands for the cardinality of the enclosed set. Furthermore, \ell and \bar{q}_{n,\ell} are the RB index and the corresponding offset CQI level, respectively. The BS adopts a default MCS when the CQI is not given, which occurs at the beginning of a transmission. According to the estimated data rates, the BS will then allocate each RBG to an appropriate UE, while each UE can have more than one RBG. After the allocation is achieved, the BS will re-calculate an MCS for each UE and, subsequently, determine the size of its transmission block T_n in the current TTI:

T_n = |\mathcal{G}_n| \cdot M\left( \frac{1}{|\mathcal{G}_n|} \sum_{\ell \in \mathcal{G}_n} \bar{q}_{n,\ell} \right),   (5)

where \mathcal{G}_n is the set of all RBs allocated to the n-th UE by the RBG allocation process.
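Equations (4) and (5) share the same structure: average the offset CQIs over a set of RBs, map the average to a spectral efficiency, and scale by the set size. A small sketch follows, under an illustrative CQI-to-SE table; the paper's actual mapping is tabulated in its Appendix.

```python
# Sketch of the rate estimate (4) and TB size (5). M(.) maps a rounded
# offset CQI to spectral efficiency; the linear table below is a placeholder.
from typing import List

CQI_TO_SE = {q: 0.15 * q for q in range(16)}   # illustrative only

def estimated_size(offset_cqis: List[int]) -> float:
    """|G| * M(mean offset CQI over the RBs in G), as in (4)/(5)."""
    g = len(offset_cqis)
    return g * CQI_TO_SE[round(sum(offset_cqis) / g)]

# Per-RBG rate estimate R_{n,k} for a UE measuring three RBs of one RBG:
print(estimated_size([7, 8, 9]))   # 3 * M(8) = 3.6 with this toy table
```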
C. Proportional Fairness (PF) User Scheduling

To balance the tradeoff between system throughput and fairness among users, an RBG allocation algorithm called proportional fairness was proposed in [36]. In the algorithm, the BS records a PF value for each UE-RBG pair, and each RBG is allocated to the UE with the largest PF value, defined as follows. The PF value of the UE-RBG pair (n, k) in the i-th transmission time interval (TTI), denoted by \beta_{n,k}[i], is defined as

\beta_{n,k}[i] = \frac{R_{n,k}[i]}{\bar{T}_n[i]},   (6)

where \bar{T}_n[i] is the moving average of the historical throughput, given by

\bar{T}_n[i] = (1 - \gamma)\,\bar{T}_n[i-1] + \gamma\,T_n[i-1],   (7)

where \gamma is a small moving-average coefficient, and T_n[i-1] is the TB size of the n-th UE in the (i-1)-th TTI given by (5). For presentational simplicity, we omit the TTI index in the sequel. Each RBG will be allocated to the UE with the highest PF value in its list; specifically,

n^*(k) = \arg\max_n \beta_{n,k}.   (8)

To avoid allocating more RBGs to a UE than it needs, each time an RBG is allocated to a UE, the system checks if the UE has obtained enough RBGs to convey all remaining data in its buffer. If so, the UE will be removed from the scheduling list. This usually happens when the UE's buffer is about to be emptied. On the contrary, if a UE fails to acquire any RB, then its effective TB will become zero in the current TTI. As a consequence, its moving average throughput will decrease, which will increase its priority in future RBG allocation.
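The allocation rule (6)-(8) condenses into a couple of vectorized operations. A minimal sketch, with simplified stand-ins for the simulator's internal state:

```python
# Vectorized sketch of the PF rule (6)-(8).
import numpy as np

def pf_allocate(R: np.ndarray, T_bar: np.ndarray) -> np.ndarray:
    """For each RBG k, return n*(k) = argmax_n R[n, k] / T_bar[n]."""
    beta = R / T_bar[:, None]        # PF values beta[n, k], eq. (6)
    return beta.argmax(axis=0)       # winning UE index per RBG, eq. (8)

def update_t_bar(T_bar: np.ndarray, T: np.ndarray, gamma: float = 0.01):
    """Moving-average throughput, eq. (7)."""
    return (1.0 - gamma) * T_bar + gamma * T

R = np.array([[3.0, 1.0],            # rate estimates: 2 UEs x 2 RBGs
              [2.0, 2.5]])
T_bar = np.array([10.0, 2.0])        # the second UE has been starved
print(pf_allocate(R, T_bar))         # -> [1 1]: the starved UE wins both RBGs
```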
D. Hybrid Automatic Repeat Request and Retransmission (HARQ)

Next, we briefly review the HARQ process in the downlink transmission. When the TB for a UE is ready to be transmitted from the BS, the data will be shifted from the buffer to the HARQ buffer. The data will stay in the HARQ buffer until it is completely and successfully transmitted, or dropped. Usually, eight HARQ processes are prepared for each UE. When a UE has a new transmission task, its HARQ process with the smallest possible index is chosen. Eight TTIs after the data transmission is completed, the BS will receive an ACK/NACK flag from the corresponding UE. In the case of an ACK, the HARQ process will terminate, as the TB has been successfully transmitted. In contrast, a NACK flag will trigger a retransmission. Since the RBGs used to conduct the initial transmission are delegated to conduct the retransmission, these RBGs will be temporarily unavailable for the next user scheduling. The MCS chosen for the retransmission must remain the same to ensure that the same TB can be reloaded to the delegated RBGs again. After five consecutive failed retransmissions of the same HARQ process, the TB will be dropped, which incurs the so-called packet loss.

Fig. 1 illustrates an example in which UE1 started a transmission using HARQ process 2 in TTI 2. After eight TTIs, the BS received a NACK flag, meaning the transmission failed. Then, the BS initiated a retransmission in TTI 10 using the same HARQ process and RBGs.

Fig. 1. Illustration of RBG assignment together with OLLA and HARQ.
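The per-process HARQ logic described above reduces to a small state machine. A toy trace, assuming the five-retransmission drop rule and ignoring the 8-TTI feedback delay and the buffers:

```python
# Toy trace of one HARQ process: ACK frees it, NACK retransmits with the
# same MCS and RBGs, and the TB is dropped after five consecutive failed
# retransmissions.
MAX_RETX = 5

def harq_process(feedback) -> str:
    """feedback: ACK(1)/NACK(0) flags, one per (re)transmission attempt."""
    for attempt, ack in enumerate(feedback):
        if ack == 1:
            return f"delivered after {attempt} retransmission(s)"
        if attempt == MAX_RETX:
            return "TB dropped (packet loss)"
    return "awaiting feedback"

print(harq_process([0, 1]))                  # delivered after 1 retransmission(s)
print(harq_process([0, 0, 0, 0, 0, 0]))      # TB dropped (packet loss)
```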
III. OBJECTIVES AND DATA PREPARATION

As aforementioned, the objective of this paper is to pioneer network-level performance prediction in a highly complex wireless communication system. In this section, we introduce the network-performance prediction tasks and elaborate on the data preparation. It is worth mentioning that a proprietary network simulator has been employed to generate the raw data used in this work. However, we believe that the proposed techniques are generally applicable to field network data as well as to simulated data obtained from off-the-shelf network simulators such as NS-3 or OPNET.

A. Feature Collection

The state of a UE can be characterized by various features. Table I lists some key features and their definitions used in this work. Particularly, a feature may have more than one counter, e.g., RB_CQI has 50 counters. In this work, we consider in total 268 counters to describe the state of a UE. All these counters, in practice, are available at the BS. The first nine features in Table I are classified as the PHY information (also known as Layer 1). In contrast, the last 11 features are all MAC information (also known as Layer 2). As discussed in Section I, it is considered technically impossible to fuse information collected from these two layers using the model-based approach. In this work, we take advantage of DNNs to exploit the cross-layer information simultaneously without explicitly modeling the information.

TABLE I
COLLECTED NETWORK COUNTERS


B. Tasks and Dataset Construction

Before defining the network-performance prediction tasks, we first introduce the following terminology used in this work:
Definition I: An active UE of some TTI is a UE with a non-empty buffer in the TTI;
Definition II: A scheduled UE is an active UE who is allocated at least one RBG;
Definition III: A target UE is a scheduled UE whose performance is to be predicted;
Definition IV: A parallel UE is an active UE who is not the target UE;
Definition V: The network snapshot of some TTI consists of the features of all active UEs in the TTI.

Now we summarize the tasks as follows:
Task I: Given a network snapshot, the first task is to predict the UAT for a target UE in the next interval τ. Clearly, the data rate of a target UE is determined by all the complex transmission mechanisms explained in Section II as well as the states of all active UEs. In particular, since every UE competes for the limited radio resources of the network, the UAT of the target UE heavily depends on the state of the parallel UEs. As a result, UAT prediction has to be performed at the network level, in lieu of the link level. Finally, the 5%-tile UAT that measures the network fairness can be derived if all UEs' UATs are found.
Task II: Given a scheduled UE, the task is to predict its ACK/NACK. At first glance, this task may appear to be related to the link-level performance, as the ACK/NACK of the UE is independent of the state of the parallel UEs, assuming all UEs are allocated to non-overlapping RBGs. However, the ACK/NACK prediction should greatly benefit from the information on BLER and iBLER. Furthermore, if the multi-cell scenario is considered, then the inter-cell interference surely has a major impact on the ACK/NACK outcome of a given TB. Thus, it makes more sense to predict the ACK/NACK result for a UE at the network level.

A network snapshot consists of all active UEs, and we collect all the concerned counters of a UE in a vector x = [c_1, c_2, \ldots, c_M]^T, where M is the total number of counters. Thus, the network snapshot is formatted as an N × M matrix denoted by X = [x_1, x_2, \ldots, x_N]^T, with each row corresponding to an active UE and each column being the same counter measured from different UEs. In particular, N is set to be large enough to cover the maximum number of active UEs in a network snapshot in most cases.


For the case in which the number of active UEs in a TTI is less than N, virtual UEs are inserted into the snapshot to keep the total number of active users fixed at N. Furthermore, the counters of a virtual UE are set to some default values that make the virtual UE easily distinguishable from the real UEs. In addition, the target UE is always placed in the first row, namely x_1, while the parallel UEs are placed below the target UE in a randomized order.

We calculate the estimated UAT for the target UE indexed by n in TTI t_0 by

y_n[t_0] = \frac{\sum_{i=t_0}^{t_0+\tau} T_n[i]\,K_n[i]}{\min\{\tau, \Delta t\}},   (9)

where K_n[i] is the ACK/NACK flag of the n-th UE in TTI i as defined in (2). Furthermore, Δt stands for the duration from t_0 to the moment when the n-th UE successfully receives all its data, and τ is a predefined time period. If the UE successfully receives all its data within τ, then the actual transmission interval should be Δt (see Case 2 in Fig. 2); otherwise, the UAT is defined with the actual data successfully transmitted over the time interval τ (see Case 1 in Fig. 2).

Fig. 2. Illustration of data rate calculation.

The definition in (9) is motivated by the following observations. A UE with a large data buffer to be transmitted may never be able to receive all its data before the simulation time expires. In the worst case, a UE who suffers from poor channel conditions may never be able to successfully receive any packet (i.e., NACKs frequently occur). Thus, the UAT should not be simply defined as the total data buffer size divided by the simulation time period. In contrast, (9) defines the UAT at a much finer resolution τ, using the exact amount of successfully transmitted data divided by the actual time elapsed to complete such successful transmissions. For Task I, we can generate multiple sets of training samples from one simulation run by adjusting τ, and the label UAT is calculated by (9).
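A direct transcription of the label rule (9), assuming the per-TTI TB sizes and ACK/NACK flags are available from the simulator logs:

```python
# Label rule (9) for one target UE; T and K hold per-TTI TB sizes and
# ACK/NACK flags starting at t0, and delta_t is set only when the UE
# empties its buffer within tau (Case 2 in Fig. 2).
from typing import List, Optional

def uat_label(T: List[float], K: List[int], tau: int,
              delta_t: Optional[int] = None) -> float:
    delivered = sum(t * k for t, k in zip(T[:tau + 1], K[:tau + 1]))
    return delivered / min(tau, delta_t) if delta_t else delivered / tau

print(uat_label([100, 100, 100], [1, 0, 1], tau=2))           # Case 1: 200/2
print(uat_label([100, 100, 0], [1, 1, 0], tau=2, delta_t=1))  # Case 2: 200/1
```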

For Task II, training samples are constructed from every scheduled UE in each TTI, and the corresponding ACK/NACK flag per transmission is the training label. For recent wireless networks employing multi-antenna technology, a UE can receive up to two TBs in the same TTI. Since the ACK/NACK flag is a binary counter, there are four possible outcomes for two TBs. Using a 2-tuple representation with the first element being the outcome for the first TB and the second element for the second TB, the four possible outcomes, namely {(N, N), (A, N), (N, A), (A, A)} with N and A being NACK and ACK respectively, can be one-hot encoded. For the case in which only one TB is used for a UE, the default return for the second TB is NACK.
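The resulting Task II label can be encoded in a few lines; a sketch assuming the class order {(N, N), (A, N), (N, A), (A, A)}:

```python
# One-hot Task II label; an unused second TB defaults to NACK ("N").
import numpy as np

CLASSES = [("N", "N"), ("A", "N"), ("N", "A"), ("A", "A")]

def encode_outcome(tb1: str, tb2: str = "N") -> np.ndarray:
    onehot = np.zeros(len(CLASSES), dtype=np.float32)
    onehot[CLASSES.index((tb1, tb2))] = 1.0
    return onehot

print(encode_outcome("A"))        # single-TB ACK -> class (A, N): [0 1 0 0]
print(encode_outcome("N", "A"))   # -> class (N, A): [0 0 1 0]
```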
C. Data Preprocessing

Before the samples in the dataset are fed into the DNN, they are pre-processed to improve the DNN convergence behavior.

1) Normalization: The counters collected in a network snapshot are of different natures. For instance, the ACK/NACK flag is binary, whereas the RSRP is a floating-point number and the CQI value is an integer. Therefore, we propose to normalize the values according to the counter type. The maximum and minimum values of each counter type are first found by inspecting a small portion of the sample dataset. After that, all counter values are normalized by their respective maximum and minimum values to the interval [0, 1].

2) Virtual UE Padding: As mentioned, we assume N active UEs in the system. If the actual number of active UEs in the current TTI is smaller than N, then virtual UEs are inserted into the network snapshot to keep the dimensionality of the DNN input constant. We propose to fill the virtual UEs' counters with −1 to make them distinguishable from the regular UEs.

3) UE Shuffling: The BS usually communicates with multiple UEs in the same TTI. Thus, if we switch the current target UE with another scheduled UE, we will generate a new training sample. Alternatively, we can keep the target UE intact but randomly shuffle the positions of the parallel UEs in the snapshot, which can produce more samples of the same target UE. In short, the UE shuffling process enables more efficient usage of the network simulation data.
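The three preprocessing steps can be sketched as follows; the [0, 1] normalization, the −1 padding value and the target-UE-in-row-0 convention are taken from the description above, while the shapes are illustrative:

```python
# Sketch of the three preprocessing steps on one snapshot.
import numpy as np

def preprocess(snapshot, c_min, c_max, n_max, rng):
    """snapshot: (n_active, M) counters with the target UE in row 0;
    c_min/c_max: per-counter extremes found on a sample of the dataset."""
    x = (snapshot - c_min) / (c_max - c_min)          # 1) normalization
    pad = -np.ones((n_max - len(x), x.shape[1]))      # 2) virtual UE padding
    x = np.vstack([x, pad])
    x[1:] = rng.permutation(x[1:], axis=0)            # 3) UE shuffling
    return x

rng = np.random.default_rng(0)
snap = rng.uniform(0.0, 10.0, size=(3, 5))            # 3 active UEs, 5 counters
out = preprocess(snap, snap.min(axis=0), snap.max(axis=0), n_max=6, rng=rng)
print(out.shape)                                      # (6, 5)
```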
IV. NEURAL NETWORK CONFIGURATION

Fig. 3 shows the structure of the DNN for Task I, referred to as UATNet, which consists of four groups of convolutional layers and one group of fully connected layers. Note that we first use 1 × 5 filters to solely extract the features from individual UEs, in lieu of the square filters commonly employed in image processing applications. The fully connected layers at the end of the DNN are designed to combine the inter-user features. To avoid gradient vanishing or exploding, a batch normalization layer is set before the activation layer with a momentum of 0.99, which is not shown in the figure.

The ACKNet designed for Task II, shown in Fig. 4, has a similar structure to UATNet, except that UATNet takes matrix inputs while ACKNet takes vector inputs. We use one-hot encoding to represent the four outcomes of the transmission of two TBs.


Fig. 3. Structure of the proposed UATNet for the target UE's UAT prediction.

Fig. 4. Structure of the proposed ACKNet for the target UE's ACK/NACK prediction.
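For concreteness, the following Keras sketch shows one plausible reading of the UATNet description: four convolutional groups with 1 × 5 filters, batch normalization (momentum 0.99) before each activation, and fully connected layers at the end. The filter counts, pooling and dense width are our assumptions; the paper does not specify them.

```python
# A hedged Keras sketch of a UATNet-like model; hyperparameters are guesses.
import tensorflow as tf
from tensorflow.keras import layers

def build_uatnet(n_ues: int = 32, n_counters: int = 268) -> tf.keras.Model:
    inp = tf.keras.Input(shape=(n_ues, n_counters, 1))    # one network snapshot
    x = inp
    for filters in (16, 32, 64, 128):                     # four conv groups
        x = layers.Conv2D(filters, kernel_size=(1, 5), padding="same")(x)
        x = layers.BatchNormalization(momentum=0.99)(x)   # before the activation
        x = layers.ReLU()(x)
        x = layers.MaxPooling2D(pool_size=(1, 2))(x)      # pooling is our guess
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)           # inter-user combination
    out = layers.Dense(1)(x)                              # predicted UAT
    return tf.keras.Model(inp, out)

# An ACKNet-like variant would take a single UE's counter vector as input
# and end in Dense(4, activation="softmax") over the four (TB1, TB2) classes.
```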
For Task II, the problem with the training data is that the
V. PROPOSED TRAINING ALGORITHMS

With the DNNs developed in Section IV, we study the optimization of the network parameters that produces accurate predictions. However, there are some practical challenges in finding the optimum. For instance, the training labels are all extracted from the network simulator, which mimics the network behavior of a wireless communication system including the random events. Thus, the labels are not necessarily the ground truth, as they are distorted by the random events. If we view such random events as noise, the training labels are rather noisy, which deteriorates the efficacy of our training. In the following, we will elaborate on such challenges and our solutions to overcome them.

A. Noisy and Imbalanced Labels

Noisy labels in supervised learning can mislead the training. To tackle these noisy labels, we first analyze the sources of the noise:
• The randomness in UE arrival following a Poisson distribution;
• The randomness in the channel conditions and buffer size of a new UE;
• The randomness in TB reception (randomness is added to ACK/NACK to model the stochastic nature of the feedback channel);
• The channel variations over time for each UE.

Note that real-life wireless communication systems exhibit even more randomness, such as link disconnections, UE handovers, inter-cell interference and so on. As a result of the randomness discussed above, the network simulator usually produces different UAT results for the same UE when we repeat the same simulation with identical network settings, though the network-level performance over all UEs is considered more stable for a sufficiently long simulation time. To highlight this randomness, we refer to the discrepancies between the labels and the ground truth as noise. In other words, the training labels are noisy. In contrast, we refer to the discrepancies between the predictions and the labels as errors, which can be used to optimize the DNN parameters. These noisy labels are particularly detrimental for Task I, as the UAT is a floating-point number. To cope with this problem, we will have to define robust performance metrics to evaluate the prediction accuracy of our trained UATNet, which is discussed in Section V-B.

For Task II, the problem with the training data is that the four classes occur with very different probabilities, as shown in Table III. In other words, we have to deal with the imbalanced data of the four classes, which may cause the ACKNet to prefer some classes to the others. To overcome this problem, we assign to each class a weight that is inversely proportional to its sample size.
and the ground truth as noises. In other words, the training
labels are noisy. In contrast, we refer to the discrepancies MAE and especially MSE are affected by those samples with
between the predictions and the labels as errors, which can be big absolute errors.


Algorithm 1: Weighted Co-Teaching (WCT)
1: Initialize: Fetch the training dataset D_tr = {X_i, y_i}, initialize two identical DNNs with parameters θ_A^(0) and θ_B^(0), and set the iteration index n = 0.
2: Train both DNNs with a batch of training samples to get θ_A^(1) and θ_B^(1) before updating n ← n + 1.
3: while n < N do
4:   Shuffle the parallel UEs of each X_i ∈ D_tr.
5:   Divide D_tr into M equal-size groups in random order.
6:   for j = 1 : M do
7:     Get the j-th group M_j from D_tr.
8:     Use DNN θ_A^(n) to make predictions on M_j, and get {ŷ_i = f(X_i; θ_A^(n)) | ∀X_i ∈ M_j}.
9:     Compute the weights w_A = {max(1 − |ŷ_i − y_i|/y_i, 0)}.
10:    Train θ_B^(n+1) ← θ_B^(n) with the weighted training samples (M_j, w_A).
11:    Use DNN θ_B^(n+1) to make predictions, and get {ŷ_i = f(X_i; θ_B^(n+1)) | ∀X_i ∈ M_j}.
12:    Compute the weights w_B = {max(1 − |ŷ_i − y_i|/y_i, 0)}.
13:    Train θ_A^(n+1) ← θ_A^(n) with the weighted training samples (M_j, w_B).
14:  end for
15:  n ← n + 1.
16: end while

Fig. 5. Algorithms against noisy labels: Curriculum Learning (left) and Co-teaching (right) for the classification problem, and the proposed Weighted Co-teaching (WCT) for the regression problem.

In order to mitigate the negative effects caused by noisy data, it is necessary to treat each training sample with a different weight so that the prediction model can focus more on the representative data samples. In other words, we should have a mechanism to identify the "cleaner" data points and put more attention on them.

In the following, we formulate the data hyper-cleaning task as a bi-level optimization problem, which is given below:

\min_{w \in \mathbb{R}^{d \times 1}} \ell(w) := \sum_{i \in \mathcal{D}_{test}} L(y_i, \hat{y}_i) = \sum_{i \in \mathcal{D}_{test}} L\big(y_i, f(X_i \mid \theta^*(w))\big)
\quad \text{s.t.} \quad \theta^*(w) = \arg\min_{\theta} \sum_{i \in \mathcal{D}_{tr}} w_i L\big(y_i, f(X_i \mid \theta)\big),   (14)

where \mathcal{D}_{test} is the test set and w = {w_i} are the sample weights. We construct this optimization problem based on the studies [37] and [38], where the authors claim that data should be weighted according to their importance for model training, which could provide better generalization for the neural network model.

To begin the algorithm design, we note that one main intuition provided by the bi-level optimization formulation (14) is that there will be two entities involved in the training: one determines the weights based on the current model (i.e., the upper-level problem), while the other trains the model based on the fixed set of weights (i.e., the lower-level problem). A rigorous optimization-based method would optimize the weights by taking the gradient ∇ℓ(w), but such a gradient is difficult to obtain since it involves ∇_w θ*(w), for which no closed-form solution exists.

There are various studies on finding the sample weights. In particular, it has been reported in the literature that an over-trained DNN ends up over-fitting all training samples [39], [40]. Besides, an interesting experiment conducted in [41] shows that, in the presence of noisy labels, it is easier for DNNs to find the relationship between the features and the labels in the early stage of training, before it overfits the data.

Based on this observation, two interesting studies succeeded in handling the noisy label issue in the classification problem. The first study [42] proposed to use a curriculum learning algorithm to capture the patterns and prevent the DNN from tracing the noise in the samples. In particular, a StudentNet that makes the final predictions is trained under the supervision of a pre-trained or pre-defined MentorNet (Fig. 5). To achieve so, the MentorNet must be capable of telling the accurate labels from the noisy ones. The second study [43] proposed a co-teaching algorithm in which two twin DNNs are used, and they choose training samples for each other if the predictions meet the labels.

The insight derived from [42], [43] suggests that the samples that have smaller training errors in the early stage have higher value to be learned from, as their labels are more likely to be close to the ground truth. Inspired by this insight, we propose the weighted co-teaching (WCT) algorithm shown in Algorithm 1. In WCT, we design two UATNets and allow them to exchange their training samples.


Rather than simply passing training samples to each other, each UATNet weights the samples according to its weighting rule before sending them to the other UATNet. Specifically, the weighting rule is designed to make the impact of the samples consistent with the accuracy of their predictions.
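One sweep of Algorithm 1 can be sketched with two Keras models exchanging per-sample weights; `clean_weights` implements the weighting rule of lines 9 and 12, while the model and group handling is simplified and the helper names are ours:

```python
# Condensed sketch of one WCT epoch for two identically configured UATNets;
# `groups` is the training set split into M equal-size groups of (X, y) arrays.
import numpy as np

def clean_weights(y_hat, y):
    """Weighting rule of Algorithm 1: w_i = max(1 - |y_hat_i - y_i|/y_i, 0)."""
    return np.maximum(1.0 - np.abs(y_hat - y) / y, 0.0)

def wct_epoch(net_a, net_b, groups):
    for X, y in groups:
        w_a = clean_weights(net_a.predict(X, verbose=0).ravel(), y)
        net_b.train_on_batch(X, y, sample_weight=w_a)   # B trusts A's clean view
        w_b = clean_weights(net_b.predict(X, verbose=0).ravel(), y)
        net_a.train_on_batch(X, y, sample_weight=w_b)   # and vice versa
```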
Apart from metrics like MAPE and MSE, we also consider the following mean relative error percentage (MREP) to measure the accuracy of our predictions, with S being the size of the test set:

\mathrm{MREP} = \frac{1}{S} \sum_{i=1}^{S} \frac{|\hat{y}_i - y_i|}{\hat{y}_i + y_i}.   (15)

MREP scales the errors to values between 0 and 1. Furthermore, MREP only represents the relative distance between the label and the prediction without assuming that the labels are the ground truth, which differentiates MREP from MAPE.
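A direct implementation of (15); note the symmetric denominator that distinguishes it from MAPE:

```python
import numpy as np

def mrep(y, y_hat):
    """Eq. (15): mean relative error percentage over a test set of size S."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return np.mean(np.abs(y_hat - y) / (y_hat + y))

print(mrep([100.0, 200.0], [120.0, 180.0]))   # ~0.072, always within [0, 1]
```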
The complexity of the training algorithm comes from the forward and backpropagation and the number of iterations. The propagation computation mainly relies on the DNN configuration; here we denote the complexity of forward/backpropagation by C_F and C_B, mostly due to matrix multiplications. For a given batch of size S_c, we need to run forward propagation S_c times but backpropagation once. There are actually two stopping criteria for the training algorithm: i) reaching a fixed number of iterations, and ii) stopping when the validation loss has not decreased for the last N_stop iterations. For the first stopping criterion, we calculate the complexity as O(N(S_t C_F + S_c C_B)), where S_t is the cardinality of the training set. For the second one, it is equivalent to setting a small threshold ε and stopping the training when the parameter change is below ε. In such a situation, the number of iterations should be at the log(1/ε) level, and N_stop ∝ 1/ε, so we compute the complexity as O(log N_stop (S_t C_F + S_c C_B)).
VI. SIMULATION RESULTS

In this section, we evaluate the performance of our designed DNNs and training algorithms for both Task I and Task II. In addition, we use MCS selection as an example to demonstrate the potential applications of the proposed DNNs.

The simulated network setup follows the descriptions in Section II. In addition, the simulator randomly initializes the buffer size between 0.5 Kbytes and 3 Mbytes for each UE upon its arrival into the network. In addition, the noise power density is set at −174 dBm/Hz, while the transmit power for each RB is fixed at 18 dBm. The data are collected from running network simulations 13 times, each lasting 20 seconds. We use the data generated from 10 runs to build our training set of around 400,000 samples. Then, we randomly extract 6,000 samples from each remaining run to form our test set.

A. Task I: UAT Prediction

In this subsection, we focus on the results of the UAT predictions, where τ is set to be one second (1000 TTIs). First, we show the influence of loss functions. In particular, we train three UATNets using MSE, MAE and MAPE as the loss function, respectively. Fig. 6 shows the cumulative distribution function (CDF) of the REP performance of the three UATNets, based on the same test set. As shown, MAPE generally has smaller REP values, which indicates that it is more resilient to the noise in our data. The conjecture is that MSE and MAE amplify the large noise, which entails performance degradation. In the remaining experiments of Task I, we will utilize MAPE as our training loss function.

Fig. 6. Loss function comparison.

The training loss of the proposed WCT is depicted in Fig. 7. As the two DNNs are initialized and interact, there are two curves. We uniformly divide the whole training set into ten groups, and let the two DNNs compute the weights of the next training group for each other. The fluctuation of the curves arises from the change of training groups and weights. In our experiment, both UATNets tend to converge after about 15 epochs.

In Fig. 8, sixty randomly selected samples from the test set and their labels are shown, where the solid red line is the prediction made by the proposed UATNet and the dotted grey line is the label extracted from the raw simulated data. Inspection of Fig. 8 suggests that our predictions match well with the labels. It is observed that the DNN is able to accurately predict the UAT in the presence of many complicated network mechanisms such as OLLA and HARQ that are difficult to model mathematically. Furthermore, the CDFs of all labels in the test set and of the predictions are presented in Fig. 9, which also supports the high statistical accuracy of the predictions.

Note that even though the dotted line shows the simulation results, it does not stand for the ground truth due to the noisy label problem. That being said, we provide different metrics to measure how close our predictions are to the labels, and compare the performance of different training algorithms.

Table II compares three training algorithms, namely the proposed WCT, the standard basic training (BT) that uses one UATNet, and the Self-Weighting (SW) shown in Algorithm 2, which uses the trainer itself to generate the sample weights.


TABLE II
NUMERICAL METRICS

Algorithm 2: Self-Weighting (SW)
1: Initialize: Fetch the dataset D_tr = {X_i, y_i}, initialize a DNN with parameters θ^(0), and set the iteration index n = 0.
2: Train the DNN with a batch of training samples to get θ^(1), then n ← n + 1.
3: while n < N do
4:   Shuffle the parallel UEs of each X_i ∈ D_tr.
5:   Get a batch of training samples M_n from D_tr.
6:   Use DNN θ^(n) to make predictions on M_n, and get {ŷ_i = f(X_i; θ^(n)) | ∀X_i ∈ M_n}.
7:   Compute the weights w = {max(1 − |ŷ_i − y_i|/y_i, 0)}_{i ∈ M_n}.
8:   Train θ^(n+1) ← θ^(n) with the weighted training samples (M_n, w).
9:   n ← n + 1.
10: end while

Fig. 7. UATNet training loss using the proposed WCT.

Fig. 8. UAT predictions of 60 test samples.

Fig. 9. CDF of UAT predictions and labels.

The results are collected over 50 experiments with random UATNet initializations, wherein, in each experiment, we test the performance derived from the three training algorithms in terms of different performance metrics. The metric "CORR" stands for the Pearson correlation coefficient between the predictions and the labels. The table records the mean values of the metrics, except REP_std, which is the standard deviation of the 50 REP values. A lower REP_std means that the performance of the resulting UATNet is more stable. As indicated by the table, the proposed WCT algorithm outperforms the other two algorithms in terms of MSE, CORR, REP, and REP_std.

B. Task II: ACK/NACK Prediction

Next, we present the results of the ACK/NACK prediction task. The outcomes of ACK/NACK are highly related to the MCS feature. In order to cover the feature space of the MCS, we disable the OLLA function of the simulator and randomly choose an MCS for each UE to build the dataset of Task II.

Recall that each sample can have at maximum two TBs, so four classes are possible, namely {(N, N), (A, N), (N, A), (A, A)}. If only one TB is used, then the default return for the unused TB is NACK. In most cases, a UE's channel condition only allows it to transmit one TB, and a random MCS selection mostly leads to a NACK. As a result, the number of samples of each class in our training set is different, which causes the common imbalanced-label problem in classification. To cope with this problem, the data of each class are weighted before being fed into the ACKNet, where the weight of each class is inversely proportional to its size.

Fig. 10 shows that the training loss and validation loss decrease and converge after about 15 epochs, where the loss function is the standard categorical cross entropy. Furthermore, the resulting accuracy for both training and validation is above 94% after 40 epochs, as shown in Fig. 11. Note that the vibrations in the loss curve occur when the two DNNs exchange training samples and update the weight of each sample.

TABLE III
CONFUSION MATRIX OF ACK/NACK PREDICTIONS

Fig. 10. ACKNet loss during the training process.

Fig. 11. ACKNet accuracy during the training process.

The resulting confusion matrix of our predictions on the test set is shown in Table III, where 57.94% of the test samples are in the (N, N) class. In each box of Table III, the number on top shows the total number of samples in the class, while the number below shows the percentage of predictions that fall into the class. For instance, there are 9829 samples whose labels and predictions are both (N, N), occupying 94.24% of all the samples with (N, N) labels. The overall accuracy across the diagonal entries is 95.23%.
C. Application: MCS Selection

In this section, we demonstrate a potential application of Task I: we can use a well-trained UATNet to predict the data rate for any given MCS, so as to facilitate the MCS selection. We disable the OLLA algorithm and let each UE hold a random MCS till the end of its transmission. Based on this setting, a new dataset can be acquired and a UATNet can be trained via WCT. Then, we randomly pick a sample from the test set, vary the target UE's MCS from its minimum value 1 to its maximum value 29, and adjust the TB size accordingly, forming 29 modified copies of the sample. Using the WCT-trained UATNet to predict the UATs upon the modified snapshots gives us the MCS landscapes shown in Fig. 12. The red star on the graphs is the original sample given by the network simulator. Clearly, in these four examples, the stars are mostly on the curves of the predicted MCS landscapes.

Fig. 12. UATNet-predicted MCS landscape.
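The landscape probe described above can be sketched as follows; `mcs_index`, the column holding the target UE's MCS counter, is a hypothetical name, and the TB-size adjustment mentioned in the text is omitted for brevity:

```python
# Clone one snapshot 29 times, write MCS orders 1..29 into the target
# UE's row, and predict the UAT of each copy with the trained UATNet.
import numpy as np

def mcs_landscape(uatnet, snapshot: np.ndarray, mcs_index: int) -> np.ndarray:
    copies = np.repeat(snapshot[None, ...], 29, axis=0)
    copies[:, 0, mcs_index] = np.arange(1, 30)     # row 0 holds the target UE
    return uatnet.predict(copies, verbose=0).ravel()

# landscape = mcs_landscape(uatnet, snapshot, mcs_index)
# landscape.argmax() + 1 is then the UAT-optimal MCS for this target UE.
```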

Authorized licensed use limited to: University of the West Indies (UWI). Downloaded on July 03,2023 at 20:32:44 UTC from IEEE Xplore. Restrictions apply.
CAO et al.: DEEP LEARNING IN NETWORK-LEVEL PERFORMANCE PREDICTION USING CROSS-LAYER INFORMATION 2375

In addition, we find that the MCS landscapes drawn by our UATNet are in a "bell shape". This shape is reasonable, as an excessively large MCS incurs frequent NACKs while a conservatively small MCS results in under-utilization of the allocated RBGs. Both cases incur UAT performance degradation. It is worth emphasizing that Fig. 12 is the first illustration of the MCS landscapes reported in the literature. Empowered with these MCS landscapes, we are capable of designing the optimal MCS for UEs.

Besides UATNet, ACKNet can also predict the UAT from another perspective, as it can predict the probability of the ACK/NACK outcome in terms of different MCS orders. Let T_n(m) denote the effective TB size of the n-th UE under a particular MCS of order m. It can be estimated by:

T_n(m) = P(K = 1 \mid \mathrm{MCS} = m) \cdot T_n,   (16)

where P(K = 1 | MCS = m) is the conditional probability of an ACK when an MCS of order m is employed, and T_n is the TB size given in (5). In Fig. 13, we use ACKNet to predict P(K = 1 | MCS = m) for m from 1 to 29 and substitute the results into (16) to obtain the data rates for four UEs.

Fig. 13. ACKNet-predicted MCS landscape.

The red dashed line in Fig. 13 indicates the MCS chosen by the OLLA algorithm in the simulator, which represents the optimal MCS corresponding to the highest expected data rate (with BLER equal to 10%). Fig. 13 confirms that the optimal MCS derived by ACKNet is reasonably close to the actual optimum found by the network simulator, although this estimated rate is a naive approximation for any RBG allocation, assuming that the network state remains the same in the next one second.

Finally, we put the two MCS landscapes predicted by UATNet and ACKNet for the same snapshot together in Fig. 14. The first observation is that the optimal MCS values chosen by the two DNNs are not identical. Interestingly, the optimal MCS chosen by UATNet is smaller than that chosen by ACKNet. This means that the optimal MCS for maximizing the effective TB size is larger than that for maximizing the UAT. This is because ACKNet only predicts the outcome of a single transmission based on the physical conditions of the UE, while UATNet predicts the longer-term performance considering all the transmission schemes.

Thus, the BS prefers the optimal MCS given by ACKNet, which maximizes the effective TB size for each scheduled UE and leads to higher network-level throughput. In contrast, a UE may prefer the optimal MCS given by UATNet, as that MCS maximizes its individual UAT. Using a smaller MCS, the UE can remain competitive in the next round of PF scheduling according to (6). Clearly, such a UE decision is selfish at the sacrifice of the network performance. Fortunately, UEs in a centralized network cannot choose their own MCS orders.

Fig. 14. MCS landscape cross validation.
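The ACKNet-side landscape (16) follows the same pattern; `ack_prob` stands in for ACKNet inference of P(K = 1 | MCS = m) and `tb_size` for the TB size from (5), both hypothetical helpers:

```python
# Expected effective TB size per MCS, eq. (16), maximized over m.
import numpy as np

def acknet_optimal_mcs(ack_prob, tb_size) -> int:
    rates = np.array([ack_prob(m) * tb_size(m) for m in range(1, 30)])
    return int(rates.argmax()) + 1     # MCS order maximizing (16)

# A rising tb_size(m) against a falling ack_prob(m) reproduces the bell
# shape: the optimum sits where the two trends trade off.
```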

VII. CONCLUSION

In this paper, we have demonstrated the first DNN capable of predicting the network-level performance of a wireless communication system by exploiting information from both the PHY and MAC layers. More specifically, we have proposed two novel DNN structures, UATNet and ACKNet, to predict two network-level performance metrics, namely the user average throughput for a target UE and the ACK/NACK feedback of a TB. In particular, a weighted co-teaching (WCT) algorithm has been developed to alleviate the impact of noisy labels. Extensive results have confirmed that UATNet can accurately predict the resulting UAT, while ACKNet can achieve an impressive accuracy rate of 95%. Finally, we have demonstrated that the newly proposed UATNet and ACKNet can be utilized to find the optimal MCS value by computing the MCS landscapes for a given UE.

Source code and simulation data used in this work are available on GitHub at https://github.com/LSCSC/Network-level-Performance-Prediction.


APPENDIX [9] R. Bruno, A. Masaracchia, and A. Passarella, “Robust adaptive modula-


tion and coding (AMC) selection in LTE systems using reinforcement
learning,” in Proc. IEEE 80th Veh. Technol. Conf., 2014, pp. 1–6.
TABLE IV [10] T. O’Shea and J. Hoydis, “An introduction to deep learning for the phys-
COLLECTED NETWORK COUNTERS ical layer,” IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 4, pp. 563–


Qi Cao received the bachelor's degree in electronic communication engineering from the University of Liverpool, Liverpool, U.K., in 2013, the M.S. degree in communications and signal processing from Imperial College London, London, U.K., in 2014, and the Ph.D. degree from the School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China, in 2018. He joined The Chinese University of Hong Kong, Shenzhen, China, as a Postdoctoral Research Associate in 2019. His research interests include MIMO wireless communications, machine learning, and artificial intelligence (AI)-driven network optimization.

Man-On Pun (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 2006. He was a Postdoctoral Research Associate with Princeton University, Princeton, NJ, USA, from 2006 to 2008. He is currently an Associate Professor with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen (CUHKSZ), China. Prior to joining CUHKSZ in 2015, he held research positions with Huawei (USA), Mitsubishi Electric Research Labs (MERL) in Boston, and Sony in Tokyo, Japan. His research interests include the AI Internet of Things (AIoT) and applications of machine learning in communications and satellite remote sensing. Prof. Pun was the recipient of best paper awards from IEEE VTC'06 Fall, IEEE ICC'08, and IEEE Infocom'09. He was an Associate Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS from 2010 to 2014. He is the founding chair of the IEEE Joint SPS-ComSoc Chapter, Shenzhen.

Yi Chen received the B.S. degree in communication engineering from the Beijing University of Posts and Telecommunications, Beijing, China, in 2007, and the Ph.D. degree in information engineering from The Chinese University of Hong Kong, Hong Kong, in 2012. She is currently a Research Assistant Professor with the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China. She is also a Research Scientist with the Shenzhen Research Institute of Big Data. Her research interests include wireless communication, resource allocation, and machine learning.
