AI Technology For NoC Performance Evaluation
Abstract—An on-chip network has become a powerful platform for solving complex and large-scale computation problems in the present decade. However, the performance of bus-based architectures, with an increasing number of IP cores in systems-on-chip (SoCs), does not meet the requirements of lower latency and higher bandwidth for many applications. A network-on-chip (NoC) has become a prevalent solution to overcome these limitations. Performance analysis of NoCs is essential for their architectural design. NoC performance is traditionally investigated with simulators, even though they become slow as the architectural size grows. This work proposes a machine learning-based framework that evaluates NoC performance quickly. The proposed framework uses the linear regression method to predict different performance metrics by learning from the training dataset speedily and accurately. Thorough experiments are conducted on a set of mesh NoCs with varying architectural parameters. Highlights of the experiments include network latency, hop count, and maximum switch and channel power consumption of 30-80 cycles, 2-11, 25 µW, and 240 µW, respectively. Further, the proposed framework achieves accuracy up to 94% and speedup of up to 2228×.

Index Terms—Networks-on-chip, machine learning, linear regression, dataset, performance evaluation.

I. INTRODUCTION
Nowadays, multiprocessor SoCs (MPSoCs) cannot meet the high performance and scalability requirements of complicated on-chip communications due to the communication bottleneck. An NoC fabric has become a prevalent solution for interconnecting hundreds to thousands of cores in many-core systems [1], [2]. The primary way of evaluating an NoC architecture is to measure its performance. The performance is traditionally evaluated via simulations [3]. Researchers use various cycle-accurate NoC simulators such as BookSim, Nirgam, and Noxim [4]. However, the time these simulators take to produce results grows as the NoC size increases. Thus, there is a need to develop a faster performance evaluation scheme for NoCs.
Conventional performance measurement approaches rely on simulations at lower abstraction levels to run the architectures for accurate results. But running at lower abstraction levels takes a long time. On the other hand, running simulations at higher abstraction levels is time efficient but cannot deliver accurate results [5]. These traditional approaches can be avoided using alternative technology such as machine learning (ML)-based methods. Recent research along this line is discussed in [4], [5], [6], [7], [8], [9], [10], [11], [12]. These schemes are comparable to one another; however, they are not adequate to the expected level. For example, their prediction error and speedup range over 9-45% and 2-1300×, respectively, and they incur high evaluation times. This motivates the design of a practical framework that improves these parameters and quickly estimates various performance metrics in NoCs of larger size.
This brief proposes an ML-based performance evaluation scheme for NoCs. The scheme uses a linear regression algorithm trained on data collected from the BookSim simulator. Several rounds of experiments are conducted at varying system configurations to predict multiple performance parameters for a set of mesh NoCs. The results predicted by the proposed scheme highlight that average network latency and hop count are about 30-80 cycles and 2-11, respectively. Other metrics such as the total area, maximum switch, channel, and total power consumption are predicted as 0.08-5 µm², 25 µW, 240 µW, and 300 µW, respectively. The advantage of the proposed approach is that it achieves a minimum speedup of 260× and a maximum speedup of 2228× compared to the BookSim simulator. Also, the prediction error lies in the range of 6-8%, resulting in up to 94% accuracy. Compared to previous works, the proposed prediction framework provides up to 44% more accuracy and renders up to 2200× more speedup. Note that these improvements grow with the NoC size.

The rest of this brief is organized as follows. Section II presents the proposed scheme. Section III provides the experimental results. Section IV concludes this brief.
II. PROPOSED WORK

Artificial intelligence (AI) technology such as machine learning has surpassed human effort and become a significant milestone for many applications, including VLSI systems and tasks like performance evaluation in NoCs. The proposed performance evaluation framework uses a machine learning technique based on the linear regression algorithm. The framework is expected to predict multiple performance metrics in a large NoC and increase the computing flexibility of the NoC platform.

A. Linear Regression Framework for NoCs

Linear Regression (LR) is a supervised machine learning algorithm that predicts continuous or numeric values. It is advantageous because it is easy to interpret, implement, and train efficiently. It handles overfitting well using dimensionality reduction techniques, regularization, and cross-validation [13].
Fig. 1. Proposed LR Framework.

The LR model has two parameters: the weight (regression coefficient) m and the bias c. Regression coefficients are estimates of the unknown dataset parameters; in other words, the weight is the scale factor applied to each input value, while the bias is a factor that offsets all predictions. The prepared data is divided into "attributes" and "labels". Attributes are the independent variables, while labels are the dependent variables whose values are predicted: the algorithm models a target prediction value for the dependent variables (labels) based on the attributes. We check which variables in our dataset have a linear relationship between the labels and the attributes, and accordingly select the attributes required to build the model. By training the model on the selected pairs of attributes and labels, the algorithm learns the combination of m and c that makes the most accurate predictions with the smallest error. Training is complete when the model returns an acceptable error value. The model finds the regression line that best fits the data points. The evaluation framework quantifies this fit through the mean square error (MSE) cost function, which measures the error (difference) between the actual results and the predicted results.
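As a concrete illustration of the attribute/label split and the learned parameters m and c, the following sketch fits an LR model on a hypothetical NoC dataset. The file name noc_dataset.csv and the column names are illustrative assumptions, not artifacts of the original framework.

# Minimal sketch: fit linear regression to NoC data (Python/scikit-learn).
# File and column names are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression

data = pd.read_csv("noc_dataset.csv")       # dataset produced as in Algorithm 1

# Attributes: independent architectural variables.
X = data[["topology_size", "virtual_channels", "buffer_size", "injection_rate"]]
# Label: the dependent performance metric to predict.
y = data["avg_network_latency"]

model = LinearRegression()
model.fit(X, y)                             # learns weights m and bias c

print("weights m:", model.coef_)            # one coefficient per attribute
print("bias c:", model.intercept_)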
B. Designing Prediction Framework

Figure 1 describes the steps for designing the proposed prediction framework. The framework consists of two phases, training and testing, briefly described below.

• Training Phase: The training phase is the first phase of the prediction model. In this phase, the training dataset for an NoC is collected by running a simulator for the network. The simulations evaluate multiple performance metrics, which are fed first for training and then for testing. Algorithm 1, given after these phase descriptions, reveals the procedure of training dataset generation and collection. This work employs the BookSim 2.0 simulator to generate the training dataset for the set of NoCs taken. Various performance metrics are measured via simulations, and the proposed scheme predicts the same. These metrics are average packet latency (APL), average network latency (ANL), average hop count (AHC), channel wire power consumption (CWPC), switch power consumption (SPC), total power consumption (TPC), and total area (TA). The collected dataset is then ready to train the model using the linear regression algorithm. In other words, the model learns the parameters from the training dataset.

• Testing Phase: The second phase is the testing phase. The regression model trained in the first phase is used to validate the results predicted for the networks under consideration. Validation in this phase is mainly performed by comparing the simulation results and the predicted results generated for the networks.
Algorithm 1 Dataset_Generation
1: procedure DatasetGenerator()
2:   Set V ← setVirtualChannel(), B ← setBufferSize(), R ← setRoutingAlgorithm(), T ← setTopologyType(), F ← setTrafficPattern(), I ← setInjectionRate(), S ← setTopologySize(), P ← setPacketSize(), and A ← setSamplePeriod()
3:   C ← loadConfiguration(V, B, R, T, F, I, S, P, A)
4:   for each (s ∈ S | S = {2×2, ..., 15×15}) do
5:     for each (t ∈ T | T = {uniform, tornado}) do
6:       for each (v ∈ V | V = {2, 3, 4, 5}) do
7:         for each (b ∈ B | B = {6, 8}) do
8:           for each (i ∈ I | I = {0.001, ..., 0.009}) do
9:             beginSimulation(command, C)
10:            result ← read(APL, ANL, AHC, CWPC, SPC, TPC, TA)
11:            dataset ← appendResult(result)
12:          end for
13:        end for
14:      end for
15:    end for
16:  end for
17: end procedure
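Algorithm 1 can be realized as a configuration sweep around the simulator. The sketch below is one possible Python realization under several assumptions: the BookSim option names follow common BookSim 2.0 example configurations (they should be checked against the local build), and parse_metrics is a hypothetical helper that would extract the remaining metrics (ANL, AHC, CWPC, SPC, TPC, TA) from the simulator's text output.

# Sketch of Algorithm 1: sweep configurations and collect BookSim results.
# Option names and output labels are assumptions; verify against the local build.
import itertools
import re
import subprocess

def parse_metrics(text: str) -> dict:
    # Hypothetical parser; extend with patterns for the other metrics.
    m = re.search(r"Overall average latency\s*=\s*([\d.eE+-]+)", text)
    return {"avg_packet_latency": float(m.group(1)) if m else None}

sizes    = range(2, 16)                        # 2x2 ... 15x15 meshes
traffics = ["uniform", "tornado"]
vcs      = [2, 3, 4, 5]
buffers  = [6, 8]
rates    = [i / 1000 for i in range(1, 10)]    # 0.001 ... 0.009

dataset = []
for k, t, v, b, r in itertools.product(sizes, traffics, vcs, buffers, rates):
    config = "\n".join([
        "topology = mesh;", f"k = {k};", "n = 2;",
        "routing_function = dim_order;",
        f"traffic = {t};", f"num_vcs = {v};",
        f"vc_buf_size = {b};", f"injection_rate = {r};",
    ])
    with open("sweep.cfg", "w") as f:
        f.write(config)
    out = subprocess.run(["./booksim", "sweep.cfg"],
                         capture_output=True, text=True).stdout
    row = parse_metrics(out)
    row.update(size=k, traffic=t, vcs=v, buffer=b, rate=r)
    dataset.append(row)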
The simulation results for an NoC may differ from the results predicted by the framework. This difference is measured by the MSE, defined in Equation (1), where N is the number of data points, y_t is the actual value, and y_t^* is the predicted output.

\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} (y_t - y_t^*)^2    (1)

Gradient descent updates the LR parameters by minimizing this cost function, reducing the MSE step by step along the cost function's gradient. Pseudocode for the proposed LR framework is stated in Algorithm 2, where one finds how the MSE gets calculated.
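Before turning to Algorithm 2, here is a minimal numerical sketch of this gradient-descent update for a single-attribute model y = m·x + c; the learning rate and iteration count are arbitrary choices, not values from this brief.

# Gradient descent minimizing MSE = (1/N) * sum((y - (m*x + c))^2).
import numpy as np

def fit_lr(x: np.ndarray, y: np.ndarray, lr: float = 0.01, epochs: int = 1000):
    m, c = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        y_pred = m * x + c
        dm = (-2.0 / n) * np.sum(x * (y - y_pred))   # dMSE/dm
        dc = (-2.0 / n) * np.sum(y - y_pred)         # dMSE/dc
        m -= lr * dm                                  # update weight
        c -= lr * dc                                  # update bias
    mse = np.mean((y - (m * x + c)) ** 2)
    return m, c, mse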
Algorithm 2 Linear_Regression_Framework
1: procedure LinearRegressionFramework()
2:   loadFile(dataset)
3:   Partition dataset into train and test data
4:   Determine attributes and labels
5:   ▷ Training and testing
6:   regressor.fit(train data)
7:   regressor.predict(test data)
8:   ▷ Framework evaluation
9:   calculateMSE(test data, prediction data)
10:  timingComparison(simulation time, framework time)
11: end procedure
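Algorithm 2 maps naturally onto a standard regression pipeline. The sketch below is one possible realization, reusing the assumed column names from the earlier examples; the BookSim simulation time used for timingComparison must be measured separately.

# One possible realization of Algorithm 2 (column names assumed).
import time
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

data = pd.read_csv("noc_dataset.csv")                  # loadFile(dataset)
X = data.drop(columns=["avg_network_latency"])         # attributes
y = data["avg_network_latency"]                        # label
X_train, X_test, y_train, y_test = train_test_split(   # partition dataset
    X, y, test_size=0.2, random_state=0)

regressor = LinearRegression()
start = time.perf_counter()
regressor.fit(X_train, y_train)                        # training
prediction = regressor.predict(X_test)                 # testing
framework_time = time.perf_counter() - start

print("MSE:", mean_squared_error(y_test, prediction))  # calculateMSE
# timingComparison: speedup over a separately measured BookSim run, e.g.
# print("speedup:", simulation_time / framework_time)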
III. EXPERIMENTAL RESULTS

This section evaluates the performance of NoC architectures using the proposed learning-based framework. A set of 2D mesh NoCs with sizes ranging from 2 × 2 to 15 × 15 is evaluated. Table I summarizes the BookSim simulation configuration setup.

TABLE I
BookSim Simulation Configuration Setup
Fig. 3. Predicted results by the proposed scheme at (a-d) Traffic = Uniform, VC = 3, PIR = 0.0025, and Buffer = 6; (e-f) Traffic = Tornado, VC = 4, PIR = 0.0015, and Buffer = 8.

Fig. 4. Predicted performance metrics at Traffic = Uniform, VC = 5, Buffer = 8 on the 11 × 11 mesh NoC.
TABLE II
Comparison on the Evaluation Time

The proposed LR framework achieves comparatively high accuracy and speedup with respect to the previous works discussed above. Further, our approach takes significantly less time to predict the performance metrics. Thus, one can say that our scheme provides about 44% more accuracy and 2200× more speedup than the previous works.
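The quoted figures follow directly from the definitions used throughout this brief: speedup is the ratio of simulation time to framework prediction time, and accuracy is 100% minus the prediction error. A trivial helper makes this explicit (the values in the comment are illustrative, not raw data from this brief).

# Speedup and accuracy as defined above.
def speedup(simulation_time_s: float, framework_time_s: float) -> float:
    return simulation_time_s / framework_time_s

def accuracy(prediction_error_pct: float) -> float:
    return 100.0 - prediction_error_pct

# e.g., speedup(2228.0, 1.0) -> 2228.0x; accuracy(6.0) -> 94.0%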