Review - Machine Learning Techniques in Analog - RF Integrated Circuit Design, Synthesis, Layout, and Test

Version of Record: https://www.sciencedirect.com/science/article/pii/S0167926020302947
Email address: engin.afacan@lip6.fr (Engin Afacan)
Preprint submitted to Integration, The VLSI journal October 11, 2020
© 2020 published by Elsevier. This manuscript is made available under the Elsevier user license
https://www.elsevier.com/open-access/userlicense/1.0/
Review: Machine Learning Techniques in Analog/RF Integrated Circuit Design,
Synthesis, Layout, and Test
Engin Afacan1,∗, Nuno Lourenço2, Ricardo Martins2, Günhan Dündar3

1 Laboratoire d’Informatique de Paris 6, Sorbonne University, Paris, France
2 Instituto de Telecomunicações, Lisbon, Portugal
3 Department of Electrical and Electronics Engineering, Boğaziçi University, Istanbul, Turkey
Abstract
Rapid developments in semiconductor technology have substantially increased the computational capability of computers.
As a result of this and recent developments in theory, machine learning (ML) techniques have become attractive in many
new applications. This trend has also inspired researchers working on integrated circuit (IC) design and optimization. ML-
based design approaches have gained importance to challenge/aid conventional design methods since they can be employed
at different design levels, from modeling to test, to learn any nonlinear input-output relationship of any analog and radio
frequency (RF) device or circuit; thus, providing fast and accurate responses to the task that they have learned. Furthermore,
employment of ML techniques in analog/RF electronic design automation (EDA) tools boosts the performance of such tools.
In this paper, we summarize the recent research and present a comprehensive review on ML techniques for analog/RF circuit
modeling, design, synthesis, layout, and test.
Keywords: Artificial Neural Network, Analog and Radio Frequency, Deep Learning, Machine Learning, Artificial Intelligence,
Integrated Circuits, Synthesis, Optimization.
∗ Corresponding author

1. Introduction

Analog and RF devices and circuits are fundamental components of a very broad range of electronic equipment. In addition to consumer electronics markets, the IC industry is, more than ever, pressed by the enormous demand for medical, healthcare, automotive, and security electronics, for example. Analog/RF components are already present in more than 50% of the total IC shipments yearly; thus, their design, test, and validation are fundamental tasks for meeting stringent time-to-market constraints and production costs. Computer-aided design (CAD) tools are quintessential in the design of analog ICs. In consumer electronics, high production volumes offset the design effort of the analog/RF circuits. However, the lack of EDA support challenges the design of the custom ICs needed to produce state-of-the-art customized equipment and creates barriers to innovation. The adoption of automation mechanisms can significantly reduce their development time while simultaneously improving their performance. Nevertheless, design automation in the analog IC design flow is far from being the norm, despite the enormous efforts made by the EDA community over the past few decades. Analog IC design is in sharp contrast to the digital IC design flow, where plenty of EDA tools are available and established. The nonlinear behavior of analog ICs, the increasing complexity of today's applications, and the challenges of deep nanometer integration technologies only further increase the difficulties faced in analog/RF IC design and test, placing additional pressure on analog/RF IC designers and EDA development teams. ML has been the subject of intensive research, and it is reshaping society in many different ways. ML also opens new perspectives on how computationally intelligent EDA tools for analog and RF IC design can help IC designers to be more productive.

Fig. 1 illustrates the general flows of the conventional and ML-based design methodologies. When following the conventional flow, the designer repeats the flow for each different set of targeted specifications, even for the same problem. The designer's own experience, knowledge, and instincts are of the utmost importance, but still, the lack of formalization substantially limits knowledge dissemination and reuse. On the other hand, the ML-based design flow expeditiously produces solutions. The caveat is how to obtain such successful models. This paper addresses the efforts made by the EDA research community and how the traditional and computational intelligence tools can take advantage of the advances in ML.

Figure 1: Conventional and ML-based design flows.

This paper is organized as follows. In Section 2, the ML foundations are briefly overviewed, including models with different types of supervision. Section 3 presents the existing techniques for modeling of analog/RF ICs based on ML techniques. In Section 4, the focus is on ML-based synthesis, whereas in Section 5, we outline the most recent ML techniques for layout generation. Studies on fault testing and diagnosis that exploit ML techniques are discussed in Section 6. Finally, in Section 7, the conclusions and future research directions are drawn.

2. Background

Machine (or statistical) learning has its foundations in artificial intelligence, but while the latter aims at building expert systems, the former focuses on the statistical properties of data [1]. Bayes' essay on Probability Theory [2] laid the theoretical foundations for statistical learning and is the basis for some early ML techniques, such as Naive Bayes or Markov Chains. In 1951, the first neural network machine was proposed, but it was only after Frank Rosenblatt's perceptron [3] and backpropagation [4], in 1958 and 1986, respectively, that artificial neural networks (ANNs) began to receive more attention. In the meantime, many other advances have been achieved, and today many techniques to design ML systems for solving classification and regression tasks are available. In a classification problem, the objective is to categorize the data. For example, an email spam filter aims to assign incoming emails to the "spam" or "no-spam" categories. In regression, by contrast, the system tries to describe one or more continuous-valued dependent variables as functions of the observations in the data. Critical to all ML systems is their ability to generalize well to new data and avoid overfitting to the training data. Overfitting occurs when an ML system starts to learn the noise in the training data instead of learning the underlying mechanisms that generated the data [1, 5, 6].

Another critical characteristic of ML systems is the amount and type of supervision. In supervised learning, the data used to train the system must include the desired solution, called a label. The label can be categorical (in classification problems) or continuous valued (in regression problems). Some important supervised learning algorithms are linear discriminant analysis, linear regression, logistic regression, polynomial regression, decision trees, support vector machines (SVMs), and ANNs, among others. In unsupervised learning, the data is unlabeled and algorithms group data points based on their features. Clustering, visualization, dimensionality reduction, and anomaly detection are examples of unsupervised learning. Common unsupervised learning algorithms are k-means and principal component analysis (PCA), and their variants. In Fig. 2a, logistic regression (which, despite the name, is a classifier) illustrates supervised classification; in Fig. 2b, polynomial regression is used to model Y as a function of polynomials of X; and in Fig. 2c, k-means is used for clustering.

Figure 2: (a) Logistic regression for the classification of two classes; (b) Polynomial regression that describes Y as a function of X: the solid line shows a good regressor, the dashed line shows an overfitted regressor; (c) Grouping data into 3 different clusters using k-means.

There is also semi-supervised learning, where the data that is used to train the system is partially labeled, and the system is trained with combinations of supervised and unsupervised learning algorithms. For example, deep belief networks (DBNs) build upon restricted Boltzmann machines (RBMs) or autoencoders trained in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques [7]. A different approach is taken in reinforcement learning (RL). In an RL system, an agent observes and interacts with the environment by selecting and executing actions. The agent is trained to learn a policy that maximizes the expected outcome of the actions over time [8]. These systems can teach robots to learn motor skills [9] or play complex board games [10]. It is also essential to distinguish the application from the algorithm, as the same underlying ML algorithms can be applied in several or all of these approaches. ANNs, for example, can be used in all the approaches mentioned above. ANNs, in the form of convolutional neural networks (CNNs), are incredibly efficient image classifiers in a supervised learning setting. On the other hand, autoencoder networks can be trained without supervision to learn latent
space, and deep reinforcement learning has shown impressive results in beating human experts on several games [11].

ML is widespread and horizontally suited for many applications, including EDA. While most algorithms can perform identically on curated large datasets [12], data can be difficult and expensive to acquire, and small to medium-sized datasets are usual. Selecting the most suitable method for the target application is an important design choice. The available options are many, and, in the next sub-sections, some methods found in EDA are briefly described.

2.1. Clustering

Clustering algorithms are unsupervised learning algorithms that group unlabeled data into K predefined clusters, using some distance metric, $d(x_i, x_j)$, between the data points. The objective of a clustering algorithm is to find the mapping $C^*(x) = k$, $k \in \{1, 2, \ldots, K\}$, that minimizes (1).

$$W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(x_i)=k} \sum_{C(x_j)=k} d(x_i, x_j). \tag{1}$$

The number of possible mappings between the input data points and the clusters grows very sharply with the number of data points and the number of clusters, quickly becoming intractable. Therefore, clustering is usually solved using iterative greedy descent methods, such as K-means. K-means starts by assigning centers (randomly or using some spreading criteria) to the clusters, then iterates the two following steps until no further improvement is possible:

• for each center, identify the training points that are closer to that center than to the other centers;

• update each cluster's center to become the mean of the data points identified as belonging to it.

Clustering methods can be effective solutions to reduce the amount of data to be processed without losing too much information. In [13] clustering is used to reduce up to 10% the data required to train an SVM classifier for analog IC fault diagnostics, whereas in [14], fuzzy c-means groups the elements of the population during analog IC sizing optimization to apply time-expensive Monte Carlo simulations only to a handful of meaningful tentative solutions. While clustering can result in significant savings, determining the number of clusters without losing information can be difficult. Also, clustering is sensitive to the distance metric and to the scaling between features.

2.2. Principal Component Analysis

Another unsupervised learning algorithm is PCA. Like clustering, PCA can be used to reduce the amount of data without losing information. PCA is a linear operation that transforms the feature space into a latent space maximizing the variance. Formally, taking the data's covariance, $S$, defined in (2), the variance of the $i$-th coordinate in the projected space, $u_i^T x_n$, is given by $u_i^T S u_i$.

$$S = \frac{1}{N} \sum_{n=1}^{N} (x_n - \bar{x})(x_n - \bar{x})^T. \tag{2}$$

Hence, maximizing the variance of the first projection while constraining $\|u_1\|$ to prevent it from going to infinity, the solution to the Lagrangian is $S u_1 = \lambda_1 u_1$, meaning that $u_1$ is an eigenvector of $S$, and the corresponding variance is maximal when the corresponding eigenvalue is the largest. The additional principal components are the eigenvectors corresponding to the next-largest eigenvalues. By keeping only the components with more variance, data is represented with fewer features. [15] used PCA to reduce the number of design variables in the optimization of an amplifier and a voltage-controlled oscillator (VCO). PCA is a linear operator and does not handle nonlinearity in the data; however, the kernel trick [16] can be used to extend it to nonlinear relations in the data.

2.3. Linear Discriminant Analysis

Linear discriminant analysis (LDA) is, like PCA, a linear method to reduce the dimensionality. However, instead of maximizing variability, it aims at maximizing the separation between classes. Fisher's LDA is commonly used and finds the linear combination $L = a^T X$ that maximizes the between-class covariance relative to the within-class covariance, as defined in (3).

$$\max_{a} \; a^T S_B a \quad \text{subject to} \quad a^T S_W a = 1. \tag{3}$$

The between-class variance of $L$ is $a^T S_B a$, whereas the within-class variance of $L$ is $a^T S_W a$, with $S_B$ and $S_W$ being the covariance matrix of the class centroid matrix and the within-class covariance matrix, respectively. The solution to the generalized eigenvalue problem in (3) results in the $i$-th discriminant variable being given by $L_i = (S_B^{-1/2} v_i)^T X$, where $v_i$ is the eigenvector of $S_B^{1/2} S_W^{-1} S_B^{1/2}$ with the $i$-th largest eigenvalue. Also, like PCA, LDA can be extended using the kernel trick to learn nonlinear mappings. In [17], kernel PCA is extended to consider separability and used to pre-process features for an ANN-based fault diagnosis method. Also for fault diagnosis, [18] uses kernel LDA to reduce dimension before training a naïve Bayes classifier.

2.4. Decision Tree

Decision trees (DTs) formalize a decision-making process in a directed acyclic graph. There are 2 types of nodes in DTs: the decision nodes and the terminal nodes. The former represent decision criteria, while the latter represent the outcome of the sequence of decisions. DTs offer a clear insight into data, and it is easy to extrapolate conclusions from them. DTs are often trained with the CART (classification and regression tree) algorithm, which splits the training set into two subsets using a single feature $k$ and a threshold $t_k$, chosen to minimize the cost function given by:

$$J(k, t_k) = \frac{m_{left}}{m} G_{left} + \frac{m_{right}}{m} G_{right}, \tag{4}$$
where $G_{left/right}$ is the amount of impurity of the left/right subset, $m_{left/right}$ is the number of instances in the left/right subset, and $m = m_{left} + m_{right}$. Nevertheless, complex datasets lead to over-complex trees that end up overfitting. Random Forests improve generalization by creating ensembles of DTs where features and data points are randomly sampled with replacement. [19] uses DTs to automate the selection of a circuit topology given the target specifications, as illustrated in Fig. 3. [20] uses a random forest to identify possible rare events during Monte Carlo simulation.

Figure 3: DT for the selection of a circuit topology given the target specifications. Adapted from [19].

2.5. Naive Bayes

The Naive Bayes classifier algorithm affords fast, highly scalable training and scoring. The Naive Bayes classifier works by assigning the class $\hat{G}$ to a new data point represented by features $x$ from $\mathbb{R}^m$, according to (5), where $\mathcal{G}$ is the set of classes, i.e., it chooses the class with the maximum a posteriori probability.

$$\hat{G} = \arg\max_{G \in \mathcal{G}} \; p(G \mid x) \tag{5}$$

The classifier uses the Bayes rule to compute the posterior, assuming the features' independence, as indicated in (6).

$$p(G \mid x) = \frac{p(G) \prod_{i=1}^{m} p(x_i \mid G)}{\sum_{g \in \mathcal{G}} p(g)\, p(x \mid g)} \tag{6}$$

Since the denominator is a constant for a given feature vector $x$, the naive Bayes classification decision rule can be formalized only with the prior and the likelihood, as indicated in (7).

$$\hat{G} = \arg\max_{G \in \mathcal{G}} \left( p(G) \prod_{i=1}^{m} p(x_i \mid G) \right) \tag{7}$$

These classifiers are relatively easy to understand and build. They are easily trained and do not require large datasets to produce effective results. Despite the assumption of independence of the features, which is not valid for most real-life situations, naive Bayes is a practical approach in many applications. [18] uses a naive Bayes classifier for fault diagnosis.

2.6. Support Vector Machines

SVM is a supervised learning algorithm for data separation, fitting the boundary $h(x)^T \beta + \beta_0 = 0$ that maximizes the margin, $2/\|\beta\|$, between classes, as shown in Fig. 4. $h(x)$ is a transformation of the feature space that enlarges the decision space to improve the performance of the linear classifier, and typically translates to nonlinear boundaries in the original space. In the case of non-separable classes, as shown in Fig. 4, the margin is maximized subject to a total budget $\sum \xi_i \le \text{constant}$, as defined in (8).

Figure 4: Concept of margin for non-separable classes. The points on the wrong side of their margin are identified by $\xi'_j = M \xi_j$ [5].

$$\min \|\beta\| \quad \text{subject to} \quad \begin{cases} y_i \left( h(x_i)^T \beta + \beta_0 \right) \ge 1 - \xi_i \;\; \forall i, \\ \xi_i \ge 0, \;\; \sum \xi_i \le \text{constant}. \end{cases} \tag{8}$$

Here, $\xi_i$ represents how far a point is on the wrong side of the margin; it is zero for points on the proper side of the margin, and $y_i \in \{-1, 1\}$ is the class identifier. The solution to this problem is obtained by maximizing the dual Lagrangian, expressed in (9), subject to $0 \le \alpha_i \le C$ and $\sum \alpha_i y_i = 0$. The corresponding decision boundary is given by (10).

$$L_D = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{i'=1}^{N} \alpha_i \alpha_{i'} y_i y_{i'} \langle h(x_i), h(x_{i'}) \rangle. \tag{9}$$

$$f(x) = \sum_{i=1}^{N} \alpha_i y_i \langle h(x), h(x_i) \rangle + \beta_0. \tag{10}$$

Here, $\alpha_i$ are the Lagrange multipliers, and $\langle h(x_i), h(x_{i'}) \rangle$ is the inner product in the transformed feature space, or kernel function $K(x, x')$. Only those observations $i$ for which the constraints in (8) are exactly met have nonzero coefficients $\alpha_i$. Therefore, the boundary is a linear combination of some data points at the edge of the class, also called the support vectors. SVM quickly identifies the best linear separator if the data is linearly separable; for nonlinear patterns, the kernel trick allows the SVM to do the separation in very high-dimensional (even infinite-dimensional) spaces. However, gaining insight from the parameters is very hard, making hyperparameter tuning and selection of the correct kernel difficult challenges. SVMs also underperform if the dimension of the data exceeds the number of points. In [21], SVMs that identify infeasible regions of the solution space avoid unnecessary circuit simulation during sizing optimization.
2.7. Artificial Neural Networks and Deep Learning

Deep learning has become quite popular in the last few years in image processing, speech recognition, and other areas where a high volume of data is available. Its basic element is the perceptron, a single layer of linear threshold units (LTUs), each of which computes a weighted sum of its inputs, $z$, and then applies a nonlinear activation function:

$$h_w(x) = \mathrm{activation}(z) = \mathrm{activation}(W^T x) = \mathrm{activation}(w_1 x_1 + w_2 x_2 + \cdots + w_n x_n), \tag{11}$$

where $x$ is the vector of input values and $w$ is the vector of weights of the linear threshold unit. This single layer of LTUs, or perceptron, makes a prediction for each instance of $x$, and thus, its training can be done by reinforcing the connection weights that contribute to the correct prediction, according to:

$$w_{i,j}^{\text{next step}} = w_{i,j}^{\text{current step}} + \lambda (y_j - \hat{y}_j) x_i, \tag{12}$$

where $w_{i,j}$ is the weight between the $i$-th input and the $j$-th output, $x_i$ the $i$-th input value of the training instance, $y_j$ the target $j$-th output for the current training instance, $\hat{y}_j$ the predicted $j$-th output for the current training instance, and $\lambda$ the learning rate. An ANN is essentially a multi-layer perceptron, i.e., one or more layers of LTUs, which can be efficiently trained using the backpropagation training algorithm, developed in 1986 by D. E. Rumelhart [4]. ANNs can build effective end-to-end ML systems, and they are replacing entire processing pipelines in applications such as computer vision and natural language processing. The ANN is an extremely flexible construct. ANNs can also incorporate application-related knowledge both in the model structure and in the cost function. These multi-faceted tools allow the implementation of different tasks in the same network. The price paid when using ANNs is the sheer number of hyperparameters that can be tweaked. They range from the network's structure and activation functions to the optimizer that finds the best combination of hyperparameters. Unlike SVMs, whose solutions are the optimum of a convex function, the optimization of ANN weights often leads to local optima of the cost function. Therefore, initialization is also an essential part of the training. Still, ANNs are widely used in EDA for modeling [22], synthesis [23], layout generation [24], and fault testing [25].

3. Modeling of Analog/RF Circuits and Systems with Machine Learning Techniques

Conventional analog IC design is particularly time-consuming due to the complicated non-linear relationship between the design parameters and device/circuit/system specifications. Typically, hand calculations may facilitate this design process, which considerably narrows the design space and provides a good starting point for the designer. Nevertheless, design time still depends on the experience of the designer, who performs a large number of iterative simulations to achieve the targeted design specifications considering accurate physical properties of the circuit and devices. Furthermore, the approximation errors during modeling and the difficulty of the circuit analysis due to the countless trade-offs dramatically increase the design time. The idea behind using ML techniques in analog/RF circuit design is to generate functional models of devices/circuits/systems that accurately mimic their functional behaviors and to exploit them in different contexts. Recently, ANNs have become a viable alternative to numerical modeling methods, analytical methods, and empirical models. These models can immediately generate the solution for a pre-trained problem; hence, the designer can bypass numerous expensive simulations. Over the years, ML-based modeling has been utilized at different levels (from a single device to a complicated system) and for different applications (analog, RF, and heterogeneous). For the reader's convenience, the key properties of the reviewed studies are summarized in Table 1.

3.1. ML in Analog Circuit Modeling

SVMs and ANN-based approaches are commonly employed to obtain the functional models of analog circuits. SVMs are usually preferred in analog circuit modeling since they do not easily get stuck at local minima, nor do they suffer from the curse of dimensionality, when the data points are determined considering the dimensions. In [26], the authors propose the use of SVMs to model analog circuits. As a kernel, the authors choose Gaussian Radial Basis Functions. The regression method utilized is ε-SV regression. This modeling is applied to a Source Coupled FET Logic (SCFL) buffer, a resistive mixer, and a GaAs ring oscillator. The generated models are validated through SPICE simulations. SVMs are also the preferred method for modeling in [21]; however, the aim is not to create a full mapping from the input space to the output space, but to identify infeasible regions and prune them. A committee of SVM classifiers is utilized to exclude a large portion of the entire design space, and only the feasibility region and its neighbors are sampled. The feasibility design space is defined by the so-called geometry constraints, which include not only device sizing constraints, but also constraints on voltage and current source values, functional constraints which are in terms of node voltages and branch currents, and performance constraints. An active learning approach is employed to train the classifier, where very few samples are taken from the large infeasible space, and most of them are concentrated around the boundaries. This is achieved by checking sample candidates against a committee of classifiers and discarding those candidates rejected by all. The classifier is tested on two examples, an operational transconductance amplifier (OTA) and a mixer.

ANN-based modeling approaches have become more pronounced in recent years. ANNs can also be used to improve the accuracy of the behavioral models of transistor-level designs, where some specifications, such as power consumption and area overhead, are not taken into account during the behavioral simulations of the systems. [27] presents a novel methodology for ANN-aided inclusion of power consumption
information of circuits to their purely functional models of AMS blocks. Due to the nature of the problem, an improved version of the Multilayer Perceptron (MLP) approach, called the time delay neural network (TDNN) and shown in Fig. 5, is utilized in this study. In this approach, the inputs pass through a delay cell and are given as the inputs of the network in order to capture the temporal changes. The flow of the proposed approach is as follows. First, the behavioral model (Verilog-AMS) of the circuit is constructed. Meanwhile, transistor-level simulations are performed to extract signal traces for power calculation. Then, the TDNN is trained and the power consumption model is obtained. Once the model is obtained, it is translated into a behavioral model compatible with the circuit simulators. Finally, the first behavioral model is integrated with the power model. As a case study, a low power relaxation oscillator is designed and simulated both at transistor level and with the augmented functional model. According to the reported results, the simulation time decreases to 12 s from 168 s while the estimation error in energy is only 2.7%.

Table 1: Summary of modeling of analog/RF devices and systems with ML techniques.
Reference | Application-Device | Method(s) | Contributions
[26] | Analog Circuits-GaAs transistor | SVMs (ε-SV regression) | Robust and accurate modeling of GaAs transistors and circuits
[21] | Analog Circuits-CMOS | SVMs | Efficient active learning scheme for feasible design space selection
[27] | AMS circuits-CMOS | ANN (TDNN) | Robust modeling of power consumption for AMS circuits
[22] | Analog-n/d | ANN (Backpropagation) | A generic modeling of power consumption for heterogeneous systems
[28] | RF-microwave components and MESFET | ANN (several) | Review of ANN-based CAD for microwave designs
[29] | RF-microwave components, HMT and MESFETs | ANN (several) | Review of model development and nonlinear modeling of microwave devices
[30] | RF-CPW components | ANN (EM based) | Efficient modeling of CPW components for accurate performance estimations
[31] | RF-UC-PBG rectangular waveguide | ANN (RBF-MLP) | Efficient modeling of RF devices for nonlinear microwave applications
[32] | RF-MESFET | ANN (WNN-MLP) | Faster design of large signal hard-nonlinear power transistors and circuits

Figure 5: TDNN delay neural network model [27].

A different application of ANN-based modeling is presented in [22], where the power consumption of analog circuits is modeled and then estimated via an empirical-based ANN rather than achieving performances through the input parameters. The idea behind this study is to estimate the mathematical description of the power consumption as a function of varied input parameters of any analog circuit using neural networks. The proposed approach is generic and even suitable for heterogeneous systems. Moreover, one can perform online power consumption estimations via the proposed strategy. First, analog circuit power measurements are performed via a measurement set-up including a PC for generating different input patterns and saving the power data. Second, the obtained data is used to train the ANN to obtain a continuous mathematical function of the power consumption. The neural network includes three layers: one input, one hidden, and one output layer. The activation functions for the hidden layer and the output layer are sigmoid and linear, respectively. Backpropagation-based training (Levenberg-Marquardt) is employed. Once the power model is obtained, it is combined with a data flow-based generic functional model of the circuit. Hence, both circuit performances and the instantaneous power consumption are obtained, which makes it possible to estimate circuit performance without performing any empirical measurements. By combining this framework with digital power consumption estimation techniques, the power consumption of heterogeneous systems can be predicted. A wireless sensor system is provided as the case study, where the main focus is to estimate the power consumption of analog parts (a temperature sensor, an amplifier, an analog to
7
digital converter, and a wireless transceiver.) According to the
results, the maximum and the average estimation errors are
3.06% and 1.53%, respectively.
3.2. ML in RF Circuit Modeling
Neural networks have been used for RF and microwave modeling and design, where ANN-based passive/active component/circuit models are then employed at higher design levels. Thus, an accurate response of the whole system can be obtained in less time than with expensive conventional approaches. In [28], the use of ANNs for RF/microwave modeling and design is discussed from theory to practice. The authors state that neural networks are attractive alternatives to conventional methods such as numerical modeling methods, which could be computationally expensive, analytical methods, which could be difficult to obtain for new devices, or empirical modeling solutions, whose range and accuracy may be limited. They provide examples where neural networks are used to model signal propagation delays of a VLSI interconnect network in printed circuit boards (PCBs), coplanar waveguide (CPW) discontinuities, and MESFETs, all from previous works in the literature. Finally, they illustrate the use of CPW models to optimize microwave circuits. The same authors present a detailed study on modeling issues and ANN-based nonlinear modeling techniques in [29], including small/large signal modeling of transistors and dynamic recurrent neural network (RNN) modeling of circuits. Practical microwave examples are used to illustrate the reviewed modeling techniques.

Another method of modeling CPW circuit components by ANN is based on electromagnetic (EM) simulations [30]. CPW transmission lines (frequency-dependent Z0 and ε_re), 90° bends, short-circuit stubs, open-circuit stubs, step-in-width discontinuities, and symmetric T-junctions are individually modeled through EM-based ANNs. To train the models, a number of EM simulations that exhibit meaningful input/output relationships are performed; the quality of these data directly affects the model accuracy. A multilayer feedforward ANN consisting of three layers (one input, one hidden, and one output), which utilizes the error-backpropagation learning algorithm, is used. The developed models are then employed to design a CPW folded double-stub filter and a 50-Ω 3-dB power-divider circuit, without performing expensive EM simulations. The proposed framework is also applicable to other components of microwave/RF design.

Figure 6: The proposed framework in [31].

Since EM-based ANN approaches need a relatively long training phase for accurate modeling, the efficiency can be low. [31] presents a solution for modeling of RF devices with a Radial Basis Function (RBF)/MLP modular structure, where the efficient Resilient Backpropagation (Rprop) algorithm is used during the training phase. The authors use the well-known "divide and conquer" strategy; the proposed framework is shown in Fig. 6. The complicated design problem is divided into sub-problems, which are distributed over the neural networks of the modular structure. Their claim is that this type of modular structure can improve the efficiency of EM-based ANNs. The RBF/MLP structure modules are organized to take advantage of the local and global approximation characteristics of the RBF and MLP neural networks: the RBF network is a local approach, while the MLP network is a global approach and acts as the output network, since it improves the generalization capacity of the modular structure. A uniplanar compact photonic bandgap (UC-PBG) rectangular waveguide and a patch antenna with a PBG substrate are used to demonstrate the developed approach. Compared to using RBF or MLP alone, their combination (the modular model) presents a higher generalization capacity, which is independent of the number of hidden neurons.

Figure 7: Volterra-ANN device model [32]. (Inputs: Vds, Vgs, and ω. DC neural networks: NN1 outputs Ids and NN2 outputs Igs. Yij neural networks: NN3 to NN10 output the real and imaginary parts of Y11, Y12, Y21, and Y22.)

In [32], wavelet neural networks are chosen over simple MLP and Gaussian radial basis (GRB) function networks. The first example is a transistor modeling example, where 10 neural networks are used as shown in Fig. 7. Two of them utilize Vds and Vgs to obtain Ids and Igs. The remaining 8 use Vds, Vgs, and ω to yield the real and imaginary values of Yij. The total number of parameters is 25 each for the first two and 76 each for the remaining 8. The 10 neural networks are trained separately, on 350 measurement points for DC characteristics and 7000 measurement points for Y-parameters. The results on test points agree perfectly with lumped equivalent circuits. For the circuit modeling example, 4 neural networks were utilized. The 5 inputs are ω and
Table 2: Summary of ML-based IC circuit synthesis applications.
Reference | Application | Method(s) | Contribution
[33] | Analog Circuit Optimization | KNN | Large-scale data mining with boosted regressors
[34] | Analog Circuit Optimization | ANN + SPEA2 | Efficient optimization via replacing the simulator by ANN based model
[35] | Analog Circuit Optimization | ANN + GA | Fast and accurate layout-aware Op-Amp synthesis
[36], [37] | Analog Circuit Optimization | Bayesian Optimization (GP+LCB+NSGA-II, BNN+LCB+MOEA/D) | Fast and accurate optimization of analog circuits to obtain better PFs
[38] | Performance Space Exploration | Bayesian Regression (GA+SVMs) | Accelerated large-scale design space search via multiple ML approaches
[39] | Performance Space Exploration | Polynomial Regression | Automatic generation of POFs for new design context without simulation
[40] | Performance Space Exploration | ANN based text mining + Sparse regression | A global performance space search on the Internet via knowledge harvesting
[41], [23] | Analog Circuit Synthesis | ANN (GRNN+MLP) | Technology independent sizing of analog building blocks
[42], [43] | Analog Circuit Synthesis | ANN (MLP) | Automatic generation of training dataset for analog circuit sizing
[44] | Analog Circuit Synthesis | ANN | Generation of better FOMs for Op-Amps via ANN based circuit synthesis
[45] | Analog Circuit Synthesis | DL + ReLU | Efficient multiple performance estimation of Op-Amps with DL based models
[46] | Analog Circuit Synthesis | ANN | Examining the effect of ANN hyperparameters on analog circuit synthesis
[47] | RF Circuit Synthesis | GA + ANN (MLP) | Efficient synthesis of RF circuits via GA assisted ANN
[48] | Analog Circuit Synthesis | Polynomial Regression + ANN | Generation of reusable POFs for analog circuit design
[49], [50] | Analog Circuit Synthesis | RL (L2DC) | Efficient sizing of analog circuits (25x faster than hand design)
[51] | Analog Circuit Synthesis | Deep RL | Efficient layout parasitics-aware circuit synthesis (40x faster than GA)
the real and imaginary parts of the input and output voltages, whereas the outputs are the real and imaginary parts of the input and output currents. This type of modeling allows the model to take into account input and output loading. Learning was performed on 2625 measurement points and results on new data were encouraging. The use of more generic neural network-based models could overcome the problems associated with lumped equivalent electrical circuit models, which are the most common models in use. These lumped models offer the advantage of being computationally efficient and accurate, but at the expense of very complex model parameter extraction carried out through numerical fitting and optimization, as well as the requirement for an accurate circuit structure.

4. Machine Learning for IC Circuit Synthesis

Conventionally, circuit synthesis is described as an automatic process that determines the dimensions of the devices such that the resultant circuit meets a given target specification on a given technology node. Considering the type of evaluation, simulation-based approaches are the most prevalent owing to their accuracy. However, SPICE-based circuit synthesis may be expensive in terms of computation time because a large number of simulations (tens or even hundreds of thousands) must be run to achieve the targeted performances. Hence, ML-based synthesis approaches have become popular to overcome this time efficiency problem. The idea behind employing ML in circuit synthesis is to replace the simulations with functional model(s) generated via ML techniques; thus, the excessive number of simulations can be avoided during the synthesis process. A summary of reviewed papers related to ML-based IC synthesis applications is provided in Table 2.

4.1. ML for Optimization-based Circuit Synthesis

The most established method to automate circuit synthesis is optimization-based circuit synthesis, which uses an optimization method to explore the design space. Analog/RF circuit optimization tools certainly shorten the design time; they employ several nature-inspired algorithms (evolutionary, particle swarm, reinforcement learning, etc.) to search the design space and find an optimal solution for a given circuit problem. However, a large design space must be visited via simulations over the iterations and, more strikingly, only a few of the simulated points are used at the end of the optimization process, which means that a large portion of the simulation data is wasted. Integration of ML techniques into the conventional optimization loop is highly promising to mitigate this computational cost: the simulation data are reused to learn the circuit behavior and to develop a model that replaces the circuit simulator once the model is obtained, as shown in Fig. 8. Since the model is generated by using real and filtered (satisfying all constraints and biasing conditions) data, the accuracy of the optimization does not degrade after the replacement of the simulator by the model.

Figure 8: A general flow of the ML-based analog/RF circuit optimization. (Simulation phase: the analog/RF circuit optimizer sends circuit variables to SPICE and receives circuit specs. ML phase: once the model is trained, the ML model (ANN, SVM, RL, etc.) returns the circuit specs instead of SPICE.)
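The two-phase loop of Fig. 8 can be sketched in a few lines. This is only an illustration under stated assumptions, not the flow of any reviewed tool: `toy_simulator` is a hypothetical closed-form stand-in for a SPICE run, plain random search replaces the evolutionary optimizer, and a k-nearest-neighbor average (`KnnSurrogate`) replaces the ANN/SVM model. All names, ranges, and thresholds are invented for the example.

```python
import random


def toy_simulator(width, length):
    """Hypothetical stand-in for one SPICE run; returns (gain_db, power_mw)."""
    gain_db = 40.0 + 10.0 * width - 4.0 * (length - 1.0) ** 2
    power_mw = 0.5 * width + 0.1 * length
    return gain_db, power_mw


def is_feasible(perf, max_power_mw=2.0):
    """Constraint check used to filter the data that trains the model."""
    return perf[1] <= max_power_mw


class KnnSurrogate:
    """k-nearest-neighbor averaging model (assumed stand-in for the ANN/SVM box)."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (design_point, performance) pairs

    def add(self, x, perf):
        self.samples.append((x, perf))

    def predict(self, x):
        def sq_dist(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], x))

        nearest = sorted(self.samples, key=sq_dist)[: self.k]
        n_out = len(nearest[0][1])
        return tuple(sum(s[1][i] for s in nearest) / len(nearest)
                     for i in range(n_out))


def optimize(n_iter=400, min_train_points=150, seed=0):
    """Random search that switches from the simulator to the model once trained."""
    rng = random.Random(seed)
    model = KnnSurrogate()
    sim_calls = 0
    best_x, best_gain = None, float("-inf")
    for _ in range(n_iter):
        x = (rng.uniform(0.5, 4.0), rng.uniform(0.2, 2.0))
        if len(model.samples) >= min_train_points:  # "Trained?" yes: ML phase
            perf = model.predict(x)
        else:                                       # "Trained?" no: simulation phase
            perf = toy_simulator(*x)
            sim_calls += 1
            if is_feasible(perf):                   # only filtered data reach the model
                model.add(x, perf)
        if is_feasible(perf) and perf[0] > best_gain:
            best_x, best_gain = x, perf[0]
    verified = toy_simulator(*best_x)               # re-simulate the chosen point once
    return best_x, verified, sim_calls + 1


if __name__ == "__main__":
    point, perf, calls = optimize()
    print("best point:", point)
    print("simulated performance:", perf)
    print("simulator calls:", calls, "for 400 candidate evaluations")
```

The final re-simulation of the selected point is a design choice of this sketch: since the model is only an approximation, a point that looks feasible to the model may still violate a constraint in simulation.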
In general, circuit optimization tools optimize the circuit design parameters (device dimensions, values of passive components, and bias voltages and currents) considering the design objectives and constraints. Typically, the employed ML model emulates the circuit behavior as a function of the design variables. On the other hand, it is also possible to swap the roles of objectives and design variables during modeling, so that the circuit design variables are modeled as a function of the design objectives. There are several attempts in the literature to integrate ML approaches into the conventional optimization flow.

Exploiting optimization tools for dataset generation is one application of ML-based optimization techniques [33]. The optimizer manipulates many parameters during its course, creating data points suitable for data mining. Then, the authors select a model and fit it through regression. For such a large population of high-dimensional data points, it is difficult to find a suitable functional form for the regressor that can adequately fit the data while remaining simple enough to allow the solution of the fitting parameters. Some authors have suggested radial basis functions and posynomial-approximated signomials to this end. In [33], a committee of regressors is built, each of which fits perfectly in some portions of the design space and with high error in others, rather than struggling to build a single regressor capable of superior fitting across a very large sample space. To combine the results of the regressors, a k-nearest neighbors (KNN) algorithm selects the K best-fitting regressors, and weighted averaging is applied to their outputs, where the weights are determined by the distance from the point to be projected. The regressors themselves are feedforward networks with two hidden layers of 10 neurons each. This approach is illustrated on two examples, an RF low noise amplifier (LNA) circuit and a more complex LNA circuit with about 50 devices. One peculiarity of this study is that the models are developed over a single-objective synthesizer, so that the synthesizer evolves towards a small region in space. Hence, the models developed during the last 20% of the synthesis cluster more closely about the final solution, whereas those developed in the first 20% try to cover the whole design space, but have more error because they have more outlier points.

An ANN is embedded into a multi-objective analog sizing tool in order to increase the efficiency of the optimization process in [34]. The flow of the proposed approach is very similar to the flow illustrated in Fig. 8. Strength Pareto Evolutionary Algorithm-2 (SPEA2), which is a well-known evolutionary algorithm for multi-objective optimization, is employed as the optimization engine, where SPICE is used as the performance estimator. The optimization starts with SPICE simulations and goes on for several tens of generations. Meanwhile, the produced simulation data is used to train an ANN. Thanks to the mechanisms already present in the optimization tool (constraint violation check, operating region elimination, performance selection), a filtration is automatically applied to avoid infeasible solutions; thus, an accurate model can be efficiently developed. Once the ANN model is successfully trained, the optimization moves to a second phase, where the simulator is replaced by the ANN model; thereafter, the performance estimation is performed without running any simulation. Conventional feed-forward neural networks are used to construct the ANN, which has 4 layers: one input, one output, and two hidden layers. To demonstrate the proposed approach, two different types of amplifiers, a single-stage amplifier and a folded cascode amplifier, are optimized with both the conventional and proposed tools. According to the results, the proposed tool can reduce the execution time by up to 64.8%.

Another ANN-based methodology is proposed in [35] for creating fast and efficient models for estimating the performance parameters of CMOS operational amplifier topologies. The flow of the algorithm is very similar to the flow shown in Fig. 8. A uniform sampling of the parameter set was performed to create 3125 different sizings of the Op-Amps to be used as training samples. Seven neural networks were set up for seven performance parameters. These neural networks were of feedforward type with one hidden layer. Then, the neural network models were used in a synthesis flow inside a combinatorial optimizer, namely genetic algorithms. The approach was demonstrated on several Op-Amps and was found to yield reasonably good results.

Bayesian optimization based approaches are commonly used for expensive black-box functions. The approach has two important components: the probabilistic surrogate
modeling and the acquisition function. The surrogate models are used for performance prediction, while the acquisition function is used to explore the space efficiently based on the surrogate model. In [36], a Bayesian multi-objective algorithm is proposed for automatic synthesis of analog/RF circuits, in which Gaussian processes (GP) are used as the online surrogate models for multiple objectives and lower confidence bound (LCB) functions are employed as the acquisition functions to select data points. First, a GP model is trained using the existing simulation data. The GP models are only updated when a new data point is selected from the PF. At that point, the circuit simulation is called to obtain the performances. Then, the LCB functions are constructed and optimized by using a modified version of the NSGA-II algorithm. The optima of the acquisition functions are selected as the next data points to be evaluated. As case studies, a three-stage low power amplifier, a 60 GHz transformer, and a power amplifier are optimized with the proposed algorithm. According to the results, the proposed tool achieves better PFs than the state-of-the-art algorithms with considerably lower simulation effort.

In [37], a similar multi-objective optimization approach that uses Bayesian optimization is proposed. To model multiple performances of interest (PoIs) of any analog circuit, a single Bayesian Neural Network (BNN) is used instead of GP. To train the BNN efficiently, the automatic differentiation variational inference (ADVI) method [52] is employed. The BNN is then combined with a Bayesian optimization framework, in which a modified MOEA/D algorithm is used as the optimization engine. The BNN model is built with a training data set, and the acquisition function, namely LCB, is defined based on the BNN model and minimized through optimization. The proposed approach is initialized with the generation of "pseudo" Pareto points with the BNN. Then, transistor-level simulations are performed for each point to obtain the actual result. The BNN model is recalibrated using the new transistor-level simulation results. The procedure is repeated until convergence is achieved. A charge pump and a three-stage amplifier circuit are used to demonstrate the proposed approach. According to the synthesis results, the proposed BNNBO method can achieve accurate POFs with almost 0.5× reduction in computation cost.

4.2. ML for Design Space Exploration

Design space exploration is another cumbersome problem, in which the whole design space, which is theoretically infinite, should be scanned in order to determine the design boundaries for a given problem. Furthermore, this process must be repeated for new contexts, e.g., supply voltage, technology node, etc., although the problem is the same. Several automatic sizing approaches have been developed to facilitate the search of such infinite design spaces; however, they suffer from the curse of dimensionality due to the excessively increased simulation workload. The use of ML techniques for design space exploration is based on extracting regression models of the circuit that is being optimized. Similar to the other approaches, the data for fitting the model is collected via simulations at the initial phase of the optimization process. Then, a circuit model is developed by fitting the obtained data to a suitable function; thus, performance evaluations can be performed by using the circuit model without SPICE simulations and the design space can be efficiently explored.

A learning-based performance space exploration approach for analog/RF circuits is presented in [38]. The developed methodology has a hierarchical structure, which consists of three major steps: device/circuit model fitting, evaluation, and design boundary determination and adjustment. First, all geometry and biasing variables are explored and simulated to obtain the circuit-level design variables for different technology nodes. Numerous simulations are performed and the obtained data are used to fit the circuit behavior into a model, where Bayesian regression is employed. Second, the evaluation takes place, in which SVMs are employed in parallel with a genetic algorithm to reduce the runtime. Finally, the sample boundaries are dynamically adjusted considering the density of feasible samples. The core of the proposed software is developed in C++, where a MATLAB convex optimization tool and a SPICE simulator are also integrated for searching and performance estimation, respectively. To demonstrate the proposed framework, a folded-cascode operational amplifier and an RF distributed amplifier are optimized and the results are compared with the results of two different circuit synthesis approaches. According to the comparison results, the proposed tool successfully generates solutions for given design specifications within considerably shorter runtimes.

Design space exploration is part of a larger framework, which is used as a design assistant tool for analog intellectual property (DATA-IP) [39]. The idea behind this study is to generate the Pareto-optimal fronts (POFs) for different design contexts (different load, power, etc.) for a particular circuit without performing any simulations. The proposed framework presents a number of different options to users: generation of POFs with the embedded optimization algorithm, using an existing POF to determine the design parameters for a given circuit problem, generation of POFs for new contexts using the model, topology selection, and verification with existing SPICE-based evaluation. The proposed framework uses the Strength Pareto Evolutionary Algorithm-2 (SPEA2) as the multi-objective optimization engine. First, a number of POFs are generated for different loading or power consumption constraints via the optimization algorithm. Then, the obtained POFs are fitted into a model by using polynomial regression. Once a circuit is successfully modeled, the POF of a new circuit context can be readily generated without any optimization run. The framework is also capable of verifying the solution points on the POFs generated through the circuit model. Furthermore, all those options are integrated and a user-friendly interface is developed. A folded cascode amplifier is selected to demonstrate the proposed approach. According to the results, the proposed design framework successfully generates the POFs for a new design context. The analog library can also be extended
by adding new topologies and circuits.

In [40], design space exploration for large-scale analog circuits is examined in a rather unconventional way. The proposed approach is based on harvesting the huge design knowledge from published papers and datasheets on the Internet and encoding the knowledge as PFs rather than using an optimization based framework. Furthermore, the obtained high dimensional PFs for large-scale analog systems also include layout parasitics and process nonidealities, since only silicon verified results are used. The approach has two major functions: harvesting the design knowledge from the Internet and modeling of PFs by using the collected data. For data collection, an ad hoc text mining technique is adopted such that it provides high-quality information from different sources by analyzing the patterns based on statistical learning. The collected data is then preprocessed to fit the POF for a given circuit, since many of the points may not be Pareto optimal. To select the Pareto optimal points from the enormous data, an efficient algorithm is proposed, which is based on filtration of feasible points for each performance metric and determination of Pareto dominated ones. Then, the basis function selection takes place. Since a fixed set of basis functions is not applicable to all cases, an adaptive selection mechanism is proposed, which uses an adapted version of sparse regression with grid discretization. The algorithm basically uses a brute-force approach, iteratively selecting important basis functions from a huge candidate pool. By using the basis functions and the model coefficients, the nonlinear function of each performance metric is constructed and the Pareto front of interest is defined. Demonstrated examples indicate that the proposed tool can accurately model POFs for complex and high-level analog systems.

4.3. ML-based Circuit Synthesis

Conventionally, as the first step of the design process, a designer selects an appropriate topology among a number of different topologies and sizes the circuit of that particular topology. The designer should re-design the circuit for a different technology or for different specifications even if there is no change in the topology. Typically, it is supposed that the technology parameters are the inputs of the circuit as well as the device dimensions. As a result, once a topology is accurately trained via ML, the model can generate solutions for different technologies without running any simulations.

In [41], ANN assisted technology independent design of a current steering PMOS only digital-to-analog converter (DAC) is presented. The motivation behind the study is to obtain design parameters of a pre-trained circuit for newer technologies without any circuit simulation effort. For that purpose, a large database for the current steering DAC is constructed by numerous simulations for different technologies: 1.5 µm, 0.5 µm, and 0.35 µm. Static specification parameters (SSP), namely Differential Nonlinearity (DNL) error, Integral Nonlinearity (INL) error, monotonicity, and gain error, are defined as the inputs of the ANN, while the transistor dimensions are the outputs of the network. A General Regression Neural Network (GRNN) is used as the ANN approach. According to the results, the ANN-based design approach can design the circuit for a newer technology. Furthermore, the proposed methodology can achieve better specifications (improved monotonicity and reduction in DNL, INL, and gain error).

Similarly, ANN assisted technology independent sizing of building blocks (basic current mirrors and differential amplifiers) for analog integrated circuits is studied in [23]. The models are trained using different technologies (1.5 µm, 0.5 µm, 0.35 µm, and 0.25 µm), while the test data was obtained only for 0.18 µm technology to demonstrate the technology independence of the approach. The ANN-based models provide the corresponding circuit design parameters for a new technology without any circuit simulation. GRNN and MLP utilizing the Rprop algorithm are used as the ANNs. The proposed approach is based on developing a relatively large database for different technologies, where properly sized circuits are simulated and the results and the corresponding transistor dimensions are recorded. Basic, cascode, Wilson, and regulated Wilson current mirrors are selected as current mirror examples, while the conventional differential amplifier is selected as the case study circuit. For both circuit topologies, the ANN provides the width of the transistors for the targeted specifications. To make the approach technology independent, as a straightforward method, the minimum channel length is defined as an input parameter in addition to the performances (e.g., reference current for mirror circuits, gain, gain-bandwidth product, slew-rate, etc.). According to the reported results, GRNN estimates the transistor sizes for current mirror circuits with 94% accuracy, while MLP can estimate the sizes for the differential amplifier circuit with 90% accuracy, where a 10% tolerance was allowed for circuit performances.

Generation of a large training dataset is generally a problem for ANN-based circuit optimization. The approach proposed in [42] addresses this problem for a current-to-voltage converter circuit. The approach has two levels: generation of the training and test data, and application of these data to the developed ANN. An MLP is employed as the ANN structure since it can implement arbitrary mappings between the inputs, i.e., current and gate-to-source voltage, and the outputs, channel length and width. To generate the training data, SPICE simulations are performed by varying transistor dimensions and the input current. Then, the circuit is modeled through the developed ANN and the results are validated by SPICE simulations. According to the presented results, the developed models can estimate the output voltage of the converter with 99.69% accuracy. The same methodology is applied for modeling and design of an inverter threshold quantization-based current comparator in [43]. The comparator is decomposed into two stages, a current-to-voltage converter stage and inverter stages, and a particular MLP-based ANN is constructed for each stage. The input current and gate-to-source voltage are the inputs, while transistor width and length are the outputs of the first ANN. Considering the inverter stages, the transistor lengths and input-output voltages are determined as the input of the system while transistor widths are explored. According to the post-layout simulation results, the maximum errors were measured as 0.31% and 0.65% for stages 1 and 2, respectively.
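The inverse-model idea used in these works (sweep the sizes in a simulator, record the results, then train an MLP that maps the targeted performance back to a size) can be sketched as follows. This is a toy illustration under stated assumptions, not the setup of [42] or [43]: `toy_drain_current` is the textbook square-law MOSFET equation used as a hypothetical stand-in for SPICE, `K_PRIME` and `V_TH` are invented values, and the channel length is treated as an input (as in [23]) because the square law only determines the W/L ratio. `TinyMLP` is a hand-written one-hidden-layer network trained with plain stochastic gradient descent rather than Rprop.

```python
import math
import random

K_PRIME = 200e-6  # assumed process transconductance in A/V^2 (illustrative value)
V_TH = 0.5        # assumed threshold voltage in V (illustrative value)


def toy_drain_current(w, l, vgs):
    """Square-law MOSFET equation, a hypothetical stand-in for a SPICE run."""
    return 0.5 * K_PRIME * (w / l) * (vgs - V_TH) ** 2


def encode(ids, l, vgs):
    """Scale the ANN inputs (current, channel length, Vgs) to roughly [-1, 1]."""
    return [math.log10(ids) + 4.0, math.log10(l) + 6.5, vgs - 0.9]


def make_dataset():
    """Sweep sizes and bias, then store (performance -> width) training pairs."""
    data = []
    for w in (1e-6, 2e-6, 4e-6, 8e-6):
        for l in (0.18e-6, 0.35e-6, 0.5e-6):
            for vgs in (0.7, 0.9, 1.1):
                ids = toy_drain_current(w, l, vgs)
                data.append((encode(ids, l, vgs), math.log10(w) + 5.5))
    return data


class TinyMLP:
    """One tanh hidden layer and a linear output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return h, sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def predict(self, x):
        return self.forward(x)[1]

    def loss(self, data):
        """Mean squared error over a dataset."""
        return sum((self.predict(x) - y) ** 2 for x, y in data) / len(data)

    def train(self, data, epochs=300, lr=0.05):
        for _ in range(epochs):
            for x, y in data:
                h, out = self.forward(x)
                err = out - y  # gradient of 0.5 * err^2 with respect to the output
                for j, hj in enumerate(h):
                    # Backpropagate through tanh before w2[j] is updated.
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                    self.b1[j] -= lr * grad_h
                self.b2 -= lr * err


def predict_width(model, ids, l, vgs):
    """Undo the output scaling to return a width in meters."""
    return 10.0 ** (model.predict(encode(ids, l, vgs)) - 5.5)


if __name__ == "__main__":
    data = make_dataset()
    model = TinyMLP(n_in=3, n_hidden=8)
    print("MSE before training:", model.loss(data))
    model.train(data)
    print("MSE after training:", model.loss(data))
    target_ids, l, vgs = 100e-6, 0.35e-6, 0.9
    w = predict_width(model, target_ids, l, vgs)
    print("predicted width (m):", w)
    # Validation step: re-run the "simulator" with the predicted size.
    print("re-simulated current (A):", toy_drain_current(w, l, vgs))
```

Working in logarithmic, rescaled units is a choice of this sketch: currents and dimensions span several orders of magnitude, and unscaled values would make the small network hard to train. The last step mirrors the validation by simulation described in the text, since the predicted size is only as good as the trained model.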
An ANN-based circuit synthesis approach has been applied to more complicated circuits such as a three-stage Op-Amp circuit [44]. 200 samples were generated through SPICE simulations, 150 of which were used to train the model and the remaining 50 were used for testing. The ANN consists of four layers: an input layer, two hidden layers, and an output layer, where dc gain, bandwidth, phase margin, slew-rate, power consumption, etc. are defined as the input parameters, while the transistor dimensions are defined as the output parameters of the network. According to the training results, after 134 epochs, the error decreased to the desired level (less than 1%). Then, the obtained results are validated through SPICE simulations to ensure that they still satisfy the targeted specifications. Even though considerable differences exist between the requested specifications and the simulated performances of the predicted sizing, all targeted design specifications are satisfied. Furthermore, the authors also demonstrate that they achieve high figures of merit (FOMs) for both large and small signal operations.

Another deep learning-based circuit sizing prediction methodology to achieve the targeted specifications of an Op-Amp is presented in [45]. The methodology is based on learning the correlation between circuit specifications and circuit elements and determining the particular sizing that satisfies the targeted circuit performances. The flow starts with determination of the prediction target. Once a circuit is determined, the prediction of circuit element values from performances is obtained via a regression analysis. In the second step, the data for the learning phase is collected through three sub-steps: data classification, data collection, and normalization. 13 specifications, such as gain, bandwidth, power, etc., and transistor widths are selected as the outputs and the inputs of the network, respectively. To generate the learning data, an initial input set consisting of 100 different elements is randomly generated and simulated. The generated solutions are evaluated against a pre-defined FOM and the outperforming solution is selected as the initial point for the next generation, which is also randomly generated in the range of ±30% of the initial solution. This flow is repeated until 13500 data points are obtained. Then, the obtained performance values are normalized to prevent any error due to the differences in the units. In the third step, a feedforward learning scheme is constructed to implement the regression models, in which transistor sizes and performances are the inputs and the outputs of the proposed network. In the fourth step, the network is trained with 13400 data collected in the second step. The remaining 10 sets are used to validate the prediction accuracy. According to the validation results, 13 circuit performances are predicted with an average accuracy of 93.3%.

The effect of ANN hyperparameters (dataset, number of epochs, data augmentation, etc.) on the performance of circuit synthesis approaches is explored in [46]. The ANN models have fully connected layers without weight sharing. To evaluate the ANN model, 80%-90% of the datasets are allocated for training. The remaining data is kept for model validation; meanwhile, a small portion is used to verify the practical application of the model. The number of nodes is kept high at the initial layers and decreased through the further layers to have high performance on the training data, at the cost of overfitting, which is then addressed using L2 weight regularization. Then, a grid search is applied over the hyperparameters (number of layers, number of nodes per layer, non-linearity, and regularization factor) to fine tune the model. During the ANN data sampling for P predictions, an acceptance coefficient of γ=0.15 is used to expand the model validity beyond the dataset limits. The selection of solutions from these P predictions is performed by simulating the predicted circuit sizings and then either using the FOMs or using Pareto dominance sorting. To demonstrate the proposed approach, a single stage amplifier using voltage combiners for gain enhancement is selected. A dataset consisting of 16,600 different design points (before data augmentation) is used. DC gain, bias current (IDD), gain-bandwidth product (GBW), and phase margin (PM) are determined as performance metrics. Three different ANNs were trained in this study: ANN-1 is trained on the original dataset with 5000 epochs, ANN-2 is trained on the augmented (40 times) dataset with 5000 epochs, and ANN-3 is trained on the augmented dataset with 500 epochs, with its weights initialized from ANN-1. According to the experimental results, ANN-1 is able to find solutions for new specifications; however, it suffers from variability and produces worse designs. On the other hand, ANN-2 and ANN-3 generate better designs when sampled inside the training data. However, ANN-2 shows more limitations when trying to explore new specifications. ANN-3, because it has transferred information from ANN-1, is more flexible to new specifications, but still lags when compared to ANN-1.

Figure 9: The block diagram of the ANN array methodology [47]. (Performance vector as input; ANN-1 to ANN-M, each producing one design parameter, parameter-1 to parameter-M.)

Neural networks can also be used in sequence when mapping circuit performances to circuit sizing [47]. The block diagram of the proposed synthesis methodology is shown in Fig. 9. Compared to the previous approaches, the inputs of the ANNs are the performances while a particular ANN is
used to obtain each design parameter at the output. The first neural network takes the set of desired performances as inputs and has only the chosen design parameter as output. A genetic algorithm (GA) controls the learning process of this ANN. That is, the GA selects which architecture to use (MLP or RBF), and determines its size and which design parameter should be output. Once this ANN is ready, its output becomes an input to a second ANN, which has the task of specifying a second design parameter as a function of the performance criteria and of the first design parameter. This process continues until all design parameters are covered. This procedure was applied to a classical cascode LNA circuit. 235 valid LNA designs were randomly generated for training. 10 design parameters were targeted and 6 variables were used. The models were observed to correctly predict the behavior of the LNA to within 5% error.

A framework for reusable POFs for multi-objective optimization is presented in [48]. The proposed approach has two levels: context independent performance estimator (CIPE) and circuit sizing predictor (CSP). The flow chart of the approach is shown in Fig. 10.

Figure 10: The flow chart of the proposed tool in [48].

In the first step, DATA-IP [39], assigned as the performance predictor, provides circuit performances and device sizing for a new design context. Moreover, the CIPE is extended to predict other performance metrics as well as the design objectives. The predicted outputs are then fed into an ANN, which eventually predicts the device sizing to achieve the corresponding design specifications. Since the input of the CSP is given by the CIPE and therefore always follows the optimal performance trade-off, there is no need for filtering the data, resulting in faster training of the model. The efficient training of the model enables the user to use it iteratively with standard automatic optimization-based tools. To demonstrate the proposed approach, a folded cascode amplifier is used. First, the circuit is optimized for different loading conditions (100fF, 250fF, 350fF, and 450fF); hence, a POF set is obtained to train the model at the CIPE level. The fronts for 150fF, 400fF, and 500fF are used to validate the model. The training of the model takes less than 10ms and predicting 200 samples for a new load takes around 1ms at this level. Considering the CSP, an ANN with 20 input variables, one hidden layer with 100 nodes, and 19 outputs for device parameters is employed. Obtaining the 100 sizing solutions for the three loads with the ANN took less than 15ms. According to the validation results, the proposed tool exhibits quite good accuracy compared to the SPICE simulations. The authors claim that these performance trade-offs are obtained by using only 300 circuit simulations, whereas a conventional optimization would need almost 120K simulations to perform the same task.

4.4. Reinforcement Learning-based Circuit Synthesis

RL is used to solve complex problems in several systems/applications. RL techniques are inspired by human learning mechanisms, where an agent, working as a human brain cortex, is assigned to the learning process based on an iterative trial-and-error process. RL is based on learning from assigned positive and negative rewards. The learning loop of the RL approach is shown in Fig. 11. An agent is a function that transforms the current state (St) and reward (Rt) into an action (At); the environment is a function that converts the action (At) taken in the current state into the next state (St+1) and reward (Rt+1). This loop yields a sequence of states, actions and rewards.

Figure 11: The agent-environment interaction in reinforcement learning loop [49].

As discussed in the previous section, generating a training dataset for supervised learning to model circuits is difficult. This is because circuit simulation is slow, which makes the generation of a large-scale dataset time-consuming, and because most circuit designs are proprietary IPs within individual IC companies, which makes it expensive to collect large-scale datasets. As a result, an RL engine, L2DC, that knows nothing about analog design is proposed in [49]. The RL agent first learns to meet hard constraints, and then learns to optimize the targets. Compared with grid search-aided human design, L2DC
can achieve 250× higher sample efficiency with comparable performance. The RL agent generates circuit data by itself and learns from the data to search for the best parameters. The RL agents were trained from scratch without being given any rules about circuits. In each iteration, the agent obtains observations from the environment, produces an action (a set of parameters) for the circuit simulator environment, and then receives a reward to optimize the desired FOMs composed of several performance metrics. By maximizing the reward, the RL agent can optimize the circuit parameters. The system was successfully demonstrated on several Op-Amp examples.

An ML analog circuit sizing framework that uses a deep reinforcement learning approach is presented in [50]. A policy gradient neural network (PGNN) is built to predict the changes of circuit parameter values, which yields the probability distribution over all valid actions. The objective of the agent is to learn sequences of actions that will maximize its expected cumulative rewards. Considering the circuit sizing problem, circuit parameters (device dimensions, voltages, capacitances, etc.) are encoded as the states. Thereby, the actions (increment and decrement) are defined as the change of those circuit variables. To adjust the amount of change, two other parameters are also defined: change rate and change capacity. Once an action is taken, it should be evaluated to check whether it satisfies the design constraints. Actions that violate the design constraints are directly eliminated without running any simulation, resulting in a significant reduction in the execution time. Then, the design objectives are encoded into the rewards in order to manage the learning. To classify the objectives, positive and negative weights are assigned for each objective, where a positive coefficient intends to maximize the objective while a negative one aims to minimize it. Before the simulator, a rough pre-evaluation is performed by a symbolic filter, where the small signal parameters of devices are extrapolated by mapping of circuit variables, which are then turned into performance estimations. If the candidate is verified by the symbolic analysis, it is simulated through SPICE and rewarded by considering the design objectives and constraints. Otherwise, the reward is set to zero. To demonstrate the proposed approach, a folded cascode amplifier is optimized, where the dc gain, the bandwidth, the phase margin and the gain margin are chosen as the design specifications. According to the reported results, the proposed tool is able to optimize the circuit satisfying all the design specifications.

Post-layout circuit parameters are found for a given target specification using deep RL [51]. The approach can be divided into two steps: training and deployment. In the first step of training, the performance trajectories are obtained for a given problem, where the objective specifications are obtained via SPICE simulations and given to the RL. Then, the RL agent observes the state of the environment and operates according to its knowledge at each step. The neural network uses the observed and targeted specifications as well as design parameters to decide the action, i.e., whether to increment, decrement, or retain the same value for each circuit parameter. The environment returns a new state for calculation of the reward. The agent iteratively operates through a trajectory of multiple environment steps, accumulates the rewards at each step, and updates the NN weights until the objective criterion is met or the maximum iteration count is reached. During the rewarding process, hard design constraints and objectives that are being minimized are taken into account. The reward increases as the RL agent's observed performance gets closer to the target specification. The training terminates once all targeted specifications are satisfied. During deployment, the trained agent is used to generate trajectories for new specifications. Moreover, the proposed approach is combined with a layout generator tool to perform the post-layout extracted simulations. Once learning is performed at schematic level, it is directly transferred to a different environment. Here, the layout generator is employed and the parasitic-extracted netlist is given to the trained agent for deployment. To demonstrate the proposed approach, a transimpedance amplifier, a two stage OTA, and a two stage OTA with negative gm load are optimized. The results indicate that the approach is almost 40× more sample efficient than a typical genetic algorithm. Also, the proposed post-layout simulation framework is 9.6× more sample efficient than the state-of-the-art thanks to transfer of learning at the schematic level.

5. Machine Learning in Analog/RF Layout Synthesis

The widespread application of ML to different areas, including analog/RF IC layout automation, opens new perspectives for developing push-button solutions that simultaneously incorporate legacy data or expert design insights in a manner that was not possible in the previous generations of EDA tools. These recent ML applications for layout automation range from placement tools to routing drafters, but also include pre- and post-placement processing. Table 3 summarizes the different ML techniques for layout automation that are overviewed in this section.

5.1. Pre-Placement Processing

An expert IC designer can examine a circuit schematic and instinctively recognize several building blocks (e.g., differential pairs, level-shifters, current mirrors, etc.) formed by various basic primitives (blocks), based solely on prior experience. These primitives and larger building blocks define more complex structures (e.g., operational amplifiers or voltage-controlled oscillators), and ultimately, are built up in the hierarchy to form complete systems (e.g., analog-to-digital converters or RF transceivers). Existing methodologies able to recognize such structures are usually based on graph representations of the netlist [58], and take advantage of its explicitly defined subcircuits. However, while subgraph isomorphism operations are somewhat possible at building-block level, the number of combinations becomes impractical at higher levels, as a countless number of circuit/system variations can be implemented for similar functionalities. ML is opening new possibilities in this recognition, as proposed in ALIGN (Analog Layout, Intelligently Generated from Netlists) [53]. This framework receives as input an unannotated netlist, and identifies hierarchies to recognize the
Table 3: Summary of the ML applications for layout automation.
Reference | Design Step | Model | Training | Contributions
[53] | Building block identification | n/d | n/d | Alternative to subgraph isomorphism
[54] | Placement | ANN with n×W×H neurons | Weights assigned by hill-climbing | ANN used as discrete W×H layout plane
[55] | Placement | ANN with 3 hidden layers | Supervised | Reproduces legacy data patterns
[56] | Placement | ANN with 4 hidden layers | Unsupervised | Trained with sizing data only
[57] | Well definition | GAN | Supervised | Reproduces legacy data patterns
[24] | Routing | ANN used as VAE | Semi-supervised | Acquired knowledge used to assist A* search
building blocks of the design so that they may be appropriately optimized. The primitives at the lowest levels are set, and then, ML handles the ambiguities in the way these primitives are assembled, attempting to mimic the expert IC designer.

5.2. Placement by ANNs

Analog/RF IC layout design is usually split into placement and routing. In the placement task, many requirements must be considered to produce a floorplan solution that is robust against parasitic structures or process variations, e.g., minimizing the layout area while satisfying several topological constraints such as symmetry, proximity, or boundary, among others, without hindering its potential to be routed effectively. Analog/RF IC placement automation has been intensively studied in the last few decades, and the works proposed usually follow a descriptive approach or an optimization-based approach. Descriptive procedural [59] or template-based [60] approaches are applied with a moderate level of success on migration of legacy layouts [61] or layout-aware sizing methodologies [62], where fast generation techniques are required to be executed in-the-loop. Optimization mechanisms are mostly based on simulated annealing kernels that either change the absolute coordinates of the cells on a 2-dimensional plane [63] or perturb a topological representation that encodes the floorplan [64]. While presenting a reduced setup time, their execution can be time-consuming. This trade-off between setup time and computational efficiency marked the previous generations of automatic placement tools, a reality that ML promises to change by pursuing, for the first time, flexible push-button solutions.

An initial approach based on an ANN architecture was proposed in [54], whose goal was to place the cells within a segmented plane of fixed size W×H. A mean-field neural network with n×W×H neurons, where n is the number of cells to be placed, is used. Each neuron is assigned a binary output value, where '1' corresponds to the assignment of that cell to a respective panel of the W×H plane. While an ANN structure is used to represent the problem, the hill-climbing algorithm is still used to solve its gradients as a new set of cells' dimensions are requested. The energy function being minimized simultaneously weights several factors, including interconnection estimates between pairs of cells, overlap, symmetry, proximity and boundary, the latter used to keep cells inside the W×H plane. Recently, ANNs were used to pursue the knowledge mining route on placement automation [55]. A model with 3 hidden layers with 250 to 1000 neurons each was used to learn the design patterns (including the inherent topological constraints) of more than 10,000 different placement solutions with conflicting guidelines among them (validated symmetry and current-flow constraints) of the same circuit topology. The output layer is used to provide the exact placement coordinates of each cell of the circuit for any given sizing in the 2-D plane, as illustrated in Fig. 12. The model is trained by minimizing the mean squared error (MSE) between the predicted floorplan and its corresponding solution from the training set. Unlike previous deterministic knowledge mining approaches [65], this end-to-end approach does not require any kind of tie breaker to be defined manually, with the trained model embedding reusable design patterns that generalize beyond the training data and provide different placement alternatives (e.g., different aspect ratios) for the same circuit sizing at push-button speed.

Figure 12: ANN architecture used to solve the map from the physical and effective Pcells' dimensions to the placement coordinates, where topological constraints are implicit (shaded box) [55]. Topological constraints can be added in the input layer for topological loss function training [56].

In [56], a nonlinear ANN model is also applied but used to
train with a topological loss function on legacy sizing data only, so that it learns how to fulfill the topological constraints. It promotes the application of the acquired "knowledge" instead of penalizing it with high MSE errors as in [55]. Additionally, the work took one step further towards the prediction of floorplan solutions for circuit topologies on which the model has never been trained, by supporting different circuit topology encodings (with different numbers of devices) on the input layer of the same ANN, reusing knowledge among topologies.

5.3. Post-Placement Processing

When designing a floorplan, experienced IC designers often have the locations of the well regions in mind, i.e., areas where the doping is uniformly shared among a group of devices. Although abutment techniques help [60], embedding this information during automatic placement methodologies is not always straightforward. In WellGAN [57], n-type well definition is left for post-placement processing, where a generative adversarial network (GAN) is used to mimic the behavior of experienced designers, by reusing the knowledge embedded in previous manually-crafted layouts. To extract the information from legacy data, the oxide diffusion (OD) layer of layouts is used as input pattern, and an RGB channel encoding is used to differentiate the ODs, i.e., OD inside n-type wells (red) and OD outside n-type wells (green), while wells are assigned to the blue channel. Thus, after training, the model receives as input images with patterns R and G, and outputs images with RGB channels. To convert this information into a floorplan, a post-refinement stage is used to rectilinearize and legalize the wells based on these guiding regions, fulfilling design rules (e.g., minimum spacing, enclosure, width, and area design rules). This approach was incorporated into the MAGICAL framework [66].

5.4. Routing

Routing has a determinant impact on the post-layout performance of analog/RF ICs, especially at deep nanometer integration nodes, where the increasing congestion causes disproportionate growth of the interwire capacitances. Different types of automatic analog/RF IC routers were proposed in the last few years, based on: (1) procedures [67] or template descriptions [68]; (2) heuristics that encode different routing techniques as constraints (e.g., wiring symmetry), and then, path-finding algorithms (e.g., maze search [69]) are applied to draw a wire that connects two different terminals of a net in the presence of obstacles; (3) integer linear programming (ILP) [70], by constructing a priori high quality routes for individual nets, and then, using ILP to commit each net to only one of its candidate routes; and (4) optimization [71], where an evolutionary algorithm performs structural and layer changes in the physical representation of a population of independent routing solutions, allowing all wires of all nets to be optimized simultaneously. Still, due to their high degree of setup configuration and customization, only procedural or template-based approaches are usually capable of reproducing the IC designer preferences. Thus, automatic routing methodologies have not been popular in industrial IC design environments. GeniusRoute [24] attempts to extract routing strategies of legacy layouts and apply the acquired knowledge in guiding a routing algorithm. Similar to WellGAN, in the pre-processing of the training data, placement and routing are represented as 2-D images, where routing-relevant information is extracted. For each data point, the pins of the entire design and pins for the given net are mapped into two separate 64 × 64 channels. These channels are then used in semi-supervised model training, where the ANN used as a variational autoencoder (VAE) is first initialized in an unsupervised fashion, and only afterwards is the decoder trained in a supervised fashion. GeniusRoute then uses a classical A* path-finding algorithm assisted by the model's inference, which generates the routing probability map to guide the search. Traditional rip-up and reroute techniques are still used to ensure that a successful solution is attained. However, the legacy design patterns will be present in the automatically generated routing solutions.

6. ML In Analog IC Fault Testing and Diagnosis

Specification testing and fault diagnosis are of the utmost importance for robust circuits and systems. Analog circuit testability analysis is significantly more complicated than its digital counterpart. The main culprits are the diversity of analog circuits with both linear and nonlinear characteristics and a multitude of performance metrics that create barriers to a standard definition of fault models. Fault diagnosis for electronics-rich analog systems with industrial application is usually accomplished by monitoring the deviation of output signals in voltage or current caused by the inevitable degradation of one or more of its components. The degradation arises not only from inherent circuit mechanisms but also from improper technician operation or environmental changes, for example.

Researchers in the area of analog IC testing have long since turned to ML algorithms for the automation of analog specification testing and fault identification [83]. Table 4 summarizes the different ML techniques for IC fault testing and diagnosis that are overviewed in this section. In [72] a fault-model-based diagnosis for analog ICs was proposed. The method is based on an ML-based defect filter [73] that distinguishes devices failing due to hard faults, i.e., complete malfunction, from those failing due to soft faults, i.e., parametric deviations. Two types of diagnosis are handled based on the decision of the defect filter: an SVM-based multi-class ML classifier is used to identify which catastrophic fault has occurred, and inverse regression functions are used to localize and identify the soft faults. This approach was demonstrated on an RF LNA. In [80], a sparse relevance vector machine [84] with Gaussian and polynomial kernels is used for fault prognostics and remaining useful performance estimation. The approach uses AC voltage values over time as features to estimate the health degree of the circuit. The authors define this health degree as the cosine distance between the measured features and those at nominal value, and its value decreases
Table 4: Summary of the ML applications for Analog IC fault testing, diagnosis and calibration.
Reference | Application | Method(s) | Contributions
[17] | Fault Diagnosis | MLP | Haar wavelet followed by kPCA to reduce dimensionality of features
[72],[73] | Fault Diagnosis | SVM | A defect filter identifies hard and soft faults, and, for the soft faults, inverse regression is used to locate the fault cause
[74] | Fault Diagnosis | ANN | Wavelet and PCA to reduce dimensionality of features
[75] | Fault Diagnosis | ANN | Dictionary and PCA reduce dimensionality of features
[76] | Fault Diagnosis | Fisher DT | LDA to improve class separability while compressing the feature space
[77] | Fault Diagnosis | Naive Bayes | Wavelet followed by kLDA
[78] | Fault Diagnosis | DBN | End-to-end learning simplifies the feature engineering
[79] | Fault Diagnosis | DBN | End-to-end with integrated random sampling for data gathering
[80] | Remaining Useful Performance | Kernel RVM | PSO is used to train an RVM that predicts the trajectories of the circuit's health and the remaining useful performance
[81] | Test Set Compression | ONN | NSGA optimization to select the smallest set of features sufficient to diagnose the CUT, resulting in a cheaper test procedure
[25] | One-Shot Calibration | ANN | Post-fabrication calibration to counter performance deviation due to fabrication in a single calibration step
[82] | Post-Layout Modeling | BMF | Uses cheap pre-silicon simulation data, together with a small dataset of fabricated circuits, for efficient post-silicon modeling
from 1 for non-fault circuits as the circuit's elements degrade. The sparse kernel coefficients are obtained by minimizing the MSE using particle swarm optimization (PSO). Experiments with a Sallen–Key bandpass filter, leapfrog filter, and nonlinear rectifier circuit showed that the methodology was able to accurately estimate the trajectories of the health degrees of the most relevant devices and accurately predict the remaining useful performance of the circuit.
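
As a rough illustration of the health degree used in [80], the sketch below computes the cosine of the angle between a measured feature vector and the nominal one. It assumes that the "cosine distance" in the text means this cosine measure, so that a fault-free circuit scores 1 and the score drops as the features drift; that reading, the function name `health_degree`, and the voltage values are all hypothetical and not taken from [80].

```python
import math


def health_degree(measured, nominal):
    """Cosine of the angle between measured and nominal feature vectors.

    Assumption: this is the measure meant by "cosine distance" in the text,
    since it equals 1 when the measured features match the nominal ones.
    """
    if len(measured) != len(nominal):
        raise ValueError("feature vectors must have the same length")
    dot = sum(m * n for m, n in zip(measured, nominal))
    norm_m = math.sqrt(sum(m * m for m in measured))
    norm_n = math.sqrt(sum(n * n for n in nominal))
    if norm_m == 0.0 or norm_n == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_m * norm_n)


# Hypothetical AC voltage amplitudes (in volts) at five frequency points.
nominal = [0.10, 0.45, 0.98, 0.47, 0.11]
degraded = [0.12, 0.50, 0.80, 0.55, 0.20]

hd_nominal = health_degree(nominal, nominal)    # fault-free reference
hd_degraded = health_degree(degraded, nominal)  # drifted component values
```

Tracking `hd_degraded` over successive measurements would give the kind of health trajectory that a regression model such as the RVM could then extrapolate.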
6.1. Pre-Processing with Dimensionality Reduction

As the dimensionality of the feature space increases, fault diagnosis methods started to use longer data processing pipelines with a structure similar to that shown in Fig. 13. These methods start by collecting the raw data that is then pre-processed and transformed, e.g., wavelet transformations are a common approach to compress the raw data into smaller but significant coefficients [74]. The next step is dimensionality reduction, which is done with PCA or LDA and their kernel extensions, and finally, the classifier is trained.

Figure 13: Common pipeline used on recent fault diagnosis systems. Stages: Data Acquisition (AC, DC, MC Simulation, Measurement); Feature Extraction (Wavelet, dictionary, ...); Feature Selection and Dimensionality Reduction (PCA, kPCA, LDA, KLDA, ...); Train Classifier (DT, SVM, NB, ANN, ...).
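
The pipeline stages described above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the method of any cited work: the raw data are synthetic noisy sine responses, feature extraction keeps only Haar approximation coefficients, dimensionality reduction is a one-component PCA found by power iteration, and a nearest-centroid rule stands in for the DT/SVM/NB/ANN classifiers. All function names and signal parameters are hypothetical.

```python
import math
import random


def haar_approx(signal, levels):
    """Feature extraction: keep the Haar approximation coefficients."""
    coeffs = list(signal)
    for _ in range(levels):
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2.0)
                  for i in range(0, len(coeffs) - 1, 2)]
    return coeffs


def fit_pca_1d(rows, iterations=100):
    """Dimensionality reduction: first principal axis via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    axis = [float(j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        # Multiply by X^T X (proportional to the covariance), then normalize.
        proj = [sum(c[j] * axis[j] for j in range(d)) for c in centered]
        new = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            raise ValueError("data has no variance along the start vector")
        axis = [x / norm for x in new]
    return mean, axis


def project(row, mean, axis):
    """Score of one feature vector along the principal axis."""
    return sum((x - m) * a for x, m, a in zip(row, mean, axis))


def fit_centroids(scores, labels):
    """Classifier training: one centroid per class in the reduced space."""
    groups = {}
    for score, label in zip(scores, labels):
        groups.setdefault(label, []).append(score)
    return {label: sum(v) / len(v) for label, v in groups.items()}


def make_signal(amplitude, rng, n=32, noise=0.02):
    """Data acquisition stand-in: one period of a noisy sine response."""
    return [amplitude * math.sin(2.0 * math.pi * i / n) + rng.gauss(0.0, noise)
            for i in range(n)]


rng = random.Random(0)
train_signals, train_labels = [], []
for _ in range(20):
    train_signals.append(make_signal(1.0, rng))  # nominal gain
    train_labels.append("fault-free")
    train_signals.append(make_signal(0.6, rng))  # degraded gain
    train_labels.append("faulty")

features = [haar_approx(s, levels=2) for s in train_signals]
mean, axis = fit_pca_1d(features)
centroids = fit_centroids([project(f, mean, axis) for f in features], train_labels)


def diagnose(signal):
    """Run a new signal through the same three stages."""
    score = project(haar_approx(signal, levels=2), mean, axis)
    return min(centroids, key=lambda label: abs(score - centroids[label]))


prediction_ok = diagnose(make_signal(1.0, rng))
prediction_bad = diagnose(make_signal(0.6, rng))
```

Each stage can be swapped independently, e.g., a kernel PCA or LDA in place of `fit_pca_1d`, which is what distinguishes the works reviewed in this subsection from one another.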
In [75], the authors define the fault dictionary by exercising each analog structure with different input signals. This fault dictionary is essentially a table that contains all the fault characteristics of the circuit-under-test (CUT), and is used as a lookup table. PCA is then used to perform an orthogonal transformation of the fault space to a lower dimension, and a quantitative measure of distance separation, designated the Bhattacharyya coefficient, is used to select the relevant features. The ANN is trained in this reduced space, and then used to classify the output of the CUT for the applied input signal as faulty or fault-free. A distinguishing factor in this work is the use of the rider optimization algorithm to train the ANN. This algorithm follows the analogy of a group of riders racing towards a target objective. The approach was tested on different analog structures, including a triangular wave generator, a low noise bipolar transistor amplifier, a differentiator, and a solar power converter. The experiments showed promising accuracy levels above 95%. [17] applied the Haar wavelet transform to obtain the coefficients for low- and high-frequency components of the time response. Then, a modified kernel PCA is applied that selects the kernels to maximize class separability following a criterion similar to LDA. The authors use an MLP multi-class classifier with a one-hot encoding at the output. The cost function optimized during training is not specified. The method is applied for a single and double fault model on a Sallen-Key bandpass filter, a biquad high-pass filter, and a nonlinear rectifier circuit, achieving a 100% accuracy rate on the first two with just 5 principal components. In [85] sinusoidal excitation [76] is used to gather voltage amplitude features for the CUT. Then LDA is used to increase class separability, and an oblique Fisher decision tree [86] is induced.
The authors also use fuzzification to soften the hard decision criteria of the DT and increase the performance of the classifier. The method was applied to an active filter and an audio amplifier circuit. In [77], the authors further extend the data processing pipeline that starts from raw time signals that describe the faults. The signals are then subject to cross wavelet transform to obtain time-frequency coefficients that produce time-frequency matrix representations. These matrices may present redundant patterns. A variation of the local binary pattern that considers the 8 Kirsch masks identifies repeating patterns regardless of rotations. Then, the extracted features are selected. First, features are selected using the Hilbert-Schmidt independence criterion. Then kernel LDA produces lower dimensionality features that are used to train a Naive Bayes classifier. Despite the simplicity of the Naive Bayes classifier, the involved pre-processing and feature selection scheme shows promising results on the two analog circuits tested. As a downside, the feature pre-processing is computationally demanding, which limits its use in real-time monitoring systems. The authors also identify some limitations in finding the most discriminative features of nonlinear circuits.

6.2. Shorten the ML pipeline with DBNs

Unlike the previously described fault diagnosis approaches, which measure, analyze, and collapse the high-dimensional raw output signals of the system (obtained as time-domain, frequency-domain, or time-frequency simulation) into a lower-dimensional feature set to help isolate the fault, in [78] DL is applied to identify a hierarchical structure that captures the different levels of semantic representations of the raw output signals. As only the response of the circuit under test is monitored, it does not suffer from the accessibility problems to internal nodes of analog ICs, common to equation-based approaches. A Gaussian–Bernoulli (GB)-DBN classifier is trained in a semi-supervised learning approach. This framework, shown in Fig. 14, is composed of two training phases on raw data: pre-training, performed in an unsupervised fashion and independently on the different layers of a stacked RBM, and then fine-tuning, performed in a supervised fashion, where all the RBM layers are fine-tuned with respect to the classification errors. The objective is to obtain a latent space that enlarges the interclass distance between different fault classes and reduces the intraclass distance within each fault class, improving the classifier's ability to identify them. This approach was experimentally validated on two typical analog filter circuits, learning more discriminative features than traditional feature extraction, and thus producing better diagnostic results with a smaller dataset. [79] adaptively extracts features from Monte Carlo sampling and then uses them to train a general-purpose DBN classifier, while simultaneously embedding dimensionality reduction into it. The sampling is embedded into the methodology flow, discarding most of the human involvement in feature preparation. The complete flow is composed of the following steps: (1) identify the potential fault modes of the CUT, the nominal parameters, and their tolerances; (2) for each potential fault mode, obtain the raw time-series signals via simulation, from time-domain transient and Monte Carlo analysis; (3) build the dataset using the raw time-series signals, where each time-series signal of each Monte Carlo sample corresponds to one input instance; (4) perform data engineering for ML model construction and separate the dataset into training and test sets; (5) use the data to train the DBN, which is accomplished by a layer-by-layer unsupervised pre-training stage and a fine-tuning stage using backpropagation; and (6) use the DBN for fault classification. The experimental results obtained on a Sallen-Key bandpass filter and a four-opamp biquad high-pass filter show higher classification accuracy when compared with existing data-driven methods, with lower dependency on the data. In this method, the direct use of raw time-series signals enables the detection of faults whose effects are only reflected in segments of the time series output signals.

Figure 14: DBN structure, showing hidden RBM that are trained in unsupervised learning.

6.3. Post-Fabrication Automatic Calibration

In other works, beyond fault identification and diagnosis, ML is used for specification testing and calibration of analog circuits against process variation. These methods ensure quality electronic devices and systems while reducing test costs. The latter also increases production yield. [81] uses ML to identify a subset of tests that is sufficient for performance testing. A circuit that fails any test in the subset is immediately discarded. Those that pass all tests in the subset are then classified, using KNN and ontogenic neural networks trained using the tests in the subset. The selection of the subset of tests is made by NSGA-II multi-objective optimization that explores the trade-off between test cost and accuracy. The method was applied on an RF device and showed that testing the RF circuit without RF-specific testing equipment was able to detect most of the failing devices. Including some RF testing allowed the method to detect all failing devices with less than 10% of the cost. In [25], the authors propose a One-Shot calibration mechanism, similar to the one in Fig. 15 that
relies on an ANN predictor trained to estimate the performance given the test measurements and the settings of the tuning knobs. The dataset is built from 67 fixed combinations of tuning knobs for each measured circuit. Statistical sampling mechanisms [87] can also be used to build the training set. Once the system is in place, the CUT is measured; if its performance is not within specifications, the trained regressor is used to search for the best settings of the tuning knobs. The method was applied to an RF power amplifier and was able to recover 96% of devices failing specifications. However, in practice, the number of silicon measurements that can be made to create the training data is limited. At the same time, accurate modeling of the parasitics and losses related to the whole test setup is intricate and prevents the direct use of simulation-based data to train the model. Nonetheless, there is valuable information that can be gathered from simulations. Heeding this, in [82], the authors used Bayesian model fusion (BMF), as indicated in Fig. 16. Simulation-based data defines a prior that reduces the number of post-fabrication measurements needed to build an accurate performance regressor.

Figure 15: One-shot calibration of fabricated circuits. [Block diagram; labels: Test Inputs; Automated Test Equipment; Built-in Test; Circuit; Tuning Knobs; IC; Measurements & Performance; Calibration Code; Trained ML Calibration Model.]

Figure 16: BMF for post-layout modeling. [Flow diagram; labels: Simulation data; Prior knowledge; Few post-silicon measurements; Bayesian inference; Late-stage post-silicon model.]

7. Conclusion and Future Directions

Recently, ML-based techniques have been efficiently utilized in several applications, where their enhanced learning capability makes them well suited to solving complex/nonlinear problems. IC design has also benefited from ML techniques at different design levels, from device modeling to test of manufactured chips.

The attempts in device/circuit/system modeling have aimed to generate accurate models at different levels of abstraction and to replace the simulator with these models, especially in RF applications; hence, the human effort and the design time can be reduced. Furthermore, creating technology-independent models can enable their use in other technologies for a given problem. Considering the ascending popularity of ANNs, researchers may revisit their application to analog/RF IC modeling, where variability and reliability problems have not been fully addressed in this manner. Furthermore, the capability of technology-free modeling of CMOS devices via ANNs will undoubtedly contribute to the appearance of analog/RF IPs in EDA tools.

ML-based modeling of analog/RF devices and circuits has paved the way for ML-based circuit sizing. The main problem with the reported sizing approaches is the trade-off between the accuracy and the efficiency of the synthesis process. Symbolic/analytical model-based approaches are quite efficient but suffer from the poor accuracy of those models, whereas simulation-based approaches are quite precise but computationally expensive. Incorporation of ML modeling with conventional sizing algorithms has become a remedy for this bottleneck, in which the circuit performances are accurately modeled via ML techniques, and the generated models are employed during the evaluation of circuits. Furthermore, putting this mechanism into the optimization loop may improve the efficiency further, where a portion of the initial solutions is used as the filtered data (satisfying all constraints) to train the model; thus, the learning phase is carried out automatically without any external effort. ML-based circuit synthesis also enables fast and accurate search of the vast analog/RF design space for multi-/many-objective optimization tools. Recent progress in this area shows that the proposed approaches are still immature, and several developments are needed, e.g., in the integration process, the selection of the NN, and intermediate model verification. Moreover, the efficiency problems with variability- and reliability-aware circuit synthesis can be solved via ML-based synthesis, in which the generated models may be reused during these analyses. Hierarchical synthesis of analog/RF circuits will be an important part of the future directions. The conventional bottom-up approaches may become much more efficient, since the transitions between different levels of hierarchy can be facilitated through the developed accurate ML-based models.

Automatic layout generation for analog/RF ICs suffers from problems similar to those of sizing approaches. Several ML-based methodologies, mostly on floorplan design, have been proposed. While the previous generation of automatic placement tools was marked by the trade-off between high setup effort (descriptive/template-based approaches) and high computational effort (optimization-based approaches), ML-based methodologies, mostly via ANNs, are attempting to change this reality by pursuing flexible push-button solutions. On top of that, the models can also be based on previous legacy knowledge. Future research directions in this field will most likely be focused on how to reuse the learned design patterns and generalize them well beyond the training data. These generalization capabilities are also expected to be achieved for newer circuit topologies. Still, an open problem is how to generate a dataset robust enough for that purpose. While acquiring robust sizing data for several topologies is still quite feasible, acquiring validated legacy layouts is not straightforward. The approach taken recently was to use other EDA tools to generate synthetic data. Nonetheless, some (error-prone) mechanism or human inspection is still necessary to consider them "legacy-proved". For automatic routing, the application of ML is still taking its first steps, yet it is promising. Before working on the proper deep models, the most urgent matter to be researched is data engineering: how to concisely and accurately represent routing data in a dataset, which data is relevant to be fed to the model, and how far it could be generalized. The problems of acquiring robust legacy data, previously found in placement, are only further aggravated here. While a handful of robust manually routed designs can still be acquired and used, the need for tens of thousands will ultimately rely on previous EDA tools, e.g., template-based, heuristic-based, or ILP routing procedures.

The use of ML for fault diagnosis is well established in the research community. Several classification methods appear in the literature. There is no particular method standing out, but supervised learning is the most common approach. Fault diagnosis ML systems often show a pipeline that includes feature transformation and pre-processing, such as wavelet transform; discriminative feature selection, such as LDA or Kernel LDA; and the classification algorithm, such as Naive Bayes, SVM, or ANNs. A few works explore semi-supervised learning, using DBNs that embed the feature selection in the classifier. However, the latent variables are more obscure, and their physical meaning is harder to interpret. Future development for ML-based fault diagnosis should strive further to address fault localization, aging, and time-dependent performance effects, and identify the right strategy for hyperparameter tuning. The hyperparameter tuning and training strategies have a significant impact on the performance of the classifiers, but there are not many criteria given in the literature. Another major challenge is to decrease the computation requirements, as the current state-of-the-art approaches are too sophisticated for real-time or embedded applications.

ML techniques can also be employed to ease the transitions between design levels, e.g., incorporating layout information into sizing, ultimately leading to widespread use of analog IPs. Besides, the use of ML in analog/RF IC design is still in its infancy and there is room for further developments. The aforementioned future directions are not exhaustive, and many new developments can be expected in the near future.

Acknowledgement

This work is funded by FCT/MCTES through national funds and when applicable co-funded by EU funds under the project UIDB/50008/2020, including internal research projects HAICAS (X-0009-LX-20) and LAY(RF)2 (X-0002-LX-20).

References

[1] K. P. Murphy, Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series). MIT Press, 2012.
[2] T. Bayes and R. Price, “An Essay towards Solving a Problem in the Doctrine of Chances. By the Late Rev. Mr. Bayes, F. R. S. Communicated by Mr. Price, in a Letter to John Canton,” Philosophical Transactions (1683-1775), vol. 53, pp. 370–418, 1763.
[3] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Psychological Review, vol. 65, no. 6, pp. 386–408, 1958.
[4] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[5] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, 2009.
[6] C. M. Bishop, Pattern Recognition and Machine Learning, ser. Information science and statistics. Springer, 2007, vol. 16, no. 4.
[7] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[8] R. S. Sutton and A. G. Barto, Reinforcement Learning, Second Edition, 2018.
[9] J. Peters and S. Schaal, “Reinforcement learning of motor skills with policy gradients,” Neural Networks, vol. 21, no. 4, pp. 682–697, 2008.
[10] D. Silver et al., “Mastering the game of Go without human knowledge,” Nature, vol. 550, no. 7676, pp. 354–359, 2017.
[11] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[12] Z. H. Zhou et al., “Big data opportunities and challenges: Discussions from data analytics perspectives [Discussion Forum],” IEEE Computational Intelligence Magazine, vol. 9, no. 4, pp. 62–74, 2014.
[13] A. Zhang, C. Chen, and B. Jiang, “Analog circuit fault diagnosis based UCISVM,” Neurocomputing, vol. 173, pp. 1752–1760, 2016.
[14] A. Canelas et al., “FUZYE: A Fuzzy C-Means Analog IC Yield Optimization using Evolutionary-based Algorithms,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 0070, no. 3, 2018.
[15] T. Pessoa et al., “Enhanced analog and RF IC sizing methodology using PCA and NSGA-II optimization kernel,” in Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE), 2018, pp. 660–665.
[16] B. Schölkopf and A. Smola, Learning with Kernels. The MIT Press, 2001.
[17] Y. Xiao and Y. He, “A novel approach for analog fault diagnosis based on neural networks and improved kernel PCA,” Neurocomputing, vol. 74, no. 7, pp. 1102–1115, 2011.
[18] W. He, Y. He, B. Li, and C. Zhang, “A Naive-Bayes-Based Fault Diagnosis Approach for Analog Circuit by Using Image-Oriented Feature Extraction and Selection Technique,” IEEE Access, vol. 8, pp. 5065–5079, 2020.
[19] T. McConaghy et al., “Automated extraction of expert knowledge in analog topology selection and sizing,” in IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD, 2008, pp. 392–395.
[20] R. El-Adawi and M. Dessouky, “Monte Carlo general sample classification for rare circuit events using Random Forest,” in 14th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2017.
[21] M. Ding and R. Vemuri, “An active learning scheme using support vector machines for analog circuit feasibility classification,” in 18th International Conference on VLSI Design held jointly with 4th International Conference on Embedded Systems Design. IEEE, 2005, pp. 528–534.
[22] A. Suissa et al., “Empirical method based on neural networks for analog power modeling,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 29, no. 5, pp. 839–844, 2010.
[23] N. Kahraman and T. Yildirim, “Technology independent circuit sizing for fundamental analog circuits using artificial neural networks,” in Ph.D. Research in Microelectronics and Electronics (PRIME). IEEE, 2008, pp. 1–4.
[24] K. Zhu et al., “Geniusroute: A new analog routing paradigm using generative neural network guidance,” in Proceedings of International Conference on Computer Aided Design (ICCAD), 2019.
[25] M. Andraud, H. Stratigopoulos, and E. Simeu, “One-Shot Non-Intrusive Calibration Against Process Variations for Analog/RF Circuits,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 11, pp. 2022–2035, 2016.
[26] V. Ceperic and A. Baric, “Modeling of analog circuits by using support vector regression machines,” in Proceedings of the 2004 11th IEEE International Conference on Electronics, Circuits and Systems (ICECS). IEEE, 2004, pp. 391–394.
[27] M. Grabmann, F. Feldhoff, and G. Gläser, “Power to the model: Generating energy-aware mixed-signal models using machine learning,” in 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2019, pp. 5–8.
[28] Q.-J. Zhang, K. C. Gupta, and V. K. Devabhaktuni, “Artificial neural networks for RF and microwave design - from theory to practice,” IEEE Transactions on Microwave Theory and Techniques, vol. 51, no. 4, pp. 1339–1350, 2003.
[29] V. K. Devabhaktuni et al., “Neural networks for microwave modeling: Model development issues and nonlinear modeling techniques,” International Journal of RF and Microwave Computer-Aided Engineering: Co-sponsored by the Center for Advanced Manufacturing and Packaging of Microwave, Optical, and Digital Electronics (CAMPmode) at the University of Colorado at Boulder, vol. 11, no. 1, pp. 4–21, 2001.
[30] P. M. Watson and K. C. Gupta, “Design and optimization of CPW circuits using EM-ANN models for CPW components,” IEEE Transactions on Microwave Theory and Techniques, vol. 45, no. 12, pp. 2515–2523, 1997.
[31] M. G. Passos, P. d. F. Silva, and H. C. Fernandes, “A RBF/MLP modular neural network for microwave device modeling,” International Journal of Computer Science and Network Security, vol. 6, no. 5A, pp. 81–86, 2006.
[32] Y. Harkouss et al., “The use of artificial neural networks in nonlinear microwave devices and circuits modeling: An application to telecommunication system design (invited article),” International Journal of RF and Microwave Computer-Aided Engineering, vol. 9, no. 3, pp. 198–215, 1999.
[33] H. Liu et al., “Remembrance of circuits past: macromodeling by data mining in large analog design spaces,” in Proceedings of the 39th Annual Design Automation Conference, 2002, pp. 437–442.
[34] G. İslamoğlu et al., “Artificial Neural Network Assisted Analog IC Sizing Tool,” in 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2019, pp. 9–12.
[35] G. Wolfe and R. Vemuri, “Extraction and use of neural network models in automated synthesis of operational amplifiers,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 22, no. 2, pp. 198–212, 2003.
[36] W. Lyu, F. Yang, C. Yan, D. Zhou, and X. Zeng, “Multi-objective bayesian optimization for analog/rf circuit synthesis,” in Proceedings of the 55th Annual Design Automation Conference, 2018, pp. 1–6.
[37] Z. Gao, J. Tao, F. Yang, Y. Su, D. Zhou, and X. Zeng, “Efficient performance trade-off modeling for analog circuit based on bayesian neural network,” in 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 2019, pp. 1–8.
[38] P.-C. Pan, C.-C. Huang, and H.-M. Chen, “Late Breaking Results: An Efficient Learning-based Approach for Performance Exploration on Analog and RF Circuit Synthesis,” in 56th ACM/IEEE Design Automation Conference (DAC), 2019, pp. 1–2.
[39] E. Kaya, E. Afacan, and G. Dundar, “An Analog/RF Circuit Synthesis and Design Assistant Tool for Analog IP: DATA-IP,” in 15th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2018, pp. 1–9.
[40] J. Tao, C. Liao, X. Zeng, and X. Li, “Harvesting design knowledge from the internet: High-dimensional performance tradeoff modeling for large-scale analog circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 1, pp. 23–36, 2015.
[41] R. Vural et al., “Process independent automated sizing methodology for current steering dac,” International Journal of Electronics, vol. 102, no. 10, pp. 1713–1734, 2015.
[42] V. Bhatia et al., “Modelling a simple current to voltage converter using ANN,” in IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES). IEEE, 2016, pp. 1–4.
[43] V. Bhatia, N. Pandey, and A. Bhattacharyya, “Modelling and design of inverter threshold quantization based current comparator using artificial neural networks,” International Journal of Electrical & Computer Engineering (2088-8708), vol. 6, no. 1, 2016.
[44] A. Jafari, S. Sadri, and M. Zekri, “Design optimization of analog integrated circuits by using artificial neural networks,” in 2010 International Conference of Soft Computing and Pattern Recognition. IEEE, 2010, pp. 385–388.
[45] N. Takai and M. Fukuda, “Prediction of element values of OpAmp for required specifications utilizing deep learning,” in International Symposium on Electronics and Smart Devices (ISESD). IEEE, 2017, pp. 300–303.
[46] N. Lourenço et al., “On the exploration of promising analog ic designs via artificial neural networks,” in 15th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2018, pp. 133–136.
[47] E. Dumesnil, F. Nabki, and M. Boukadoum, “RF-LNA circuit synthesis using an array of artificial neural networks with constrained inputs,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2015, pp. 573–576.
[48] N. Lourenço et al., “Using Polynomial Regression and Artificial Neural Networks for Reusable Analog IC Sizing,” in 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2019, pp. 13–16.
[49] H. Wang et al., “Learning to design circuits,” arXiv preprint arXiv:1812.02734, 2018.
[50] Z. Zhao and L. Zhang, “Deep reinforcement learning for analog circuit sizing,” in IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2020, pp. 1–5.
[51] K. Settaluri et al., “AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs.”
[52] A. Kucukelbir, D. Tran, R. Ranganath, A. Gelman, and D. M. Blei, “Automatic differentiation variational inference,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 430–474, 2017.
[53] K. Kunal et al., “Align: Open-source analog layout automation from the ground up,” in Proceedings of the 56th Annual Design Automation Conference (DAC), 2019, pp. 1–4.
[54] R. He and L. Zhang, “Artificial neural network application in analog layout placement design,” in Canadian Conference on Electrical and Computer Engineering. IEEE, 2009, pp. 954–957.
[55] D. Guerra et al., “Artificial Neural Networks as an Alternative for Automatic Analog IC Placement,” in 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE, 2019, pp. 1–4.
[56] A. Gusmao et al., “Semi-Supervised Artificial Neural Networks towards Analog IC Placement Recommender,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2020, pp. 1–5.
[57] B. Xu et al., “Wellgan: Generative-adversarial-network-guided well generation for analog/mixed-signal circuit layout,” in 56th ACM/IEEE Design Automation Conference (DAC). IEEE, 2019, pp. 1–6.
[58] M. Eick et al., “Comprehensive generation of hierarchical placement rules for analog integrated circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 2, pp. 180–193, 2011.
[59] A. Bhaduri et al., “Parasitic-aware synthesis of RF LNA circuits considering quasi-static extraction of inductors and interconnects,” in The 47th Midwest Symposium on Circuits and Systems. MWSCAS’04, vol. 1. IEEE, 2004, pp. I–477.
[60] R. M. Martins, N. C. Lourenço, and N. C. Horta, Generating analog IC layouts with LAYGEN-2. Springer Science & Business Media, 2012.
[61] S. Bhattacharya, N. Jangkrajarng, and C.-J. Shi, “Multilevel symmetry-constraint generation for retargeting large analog layouts,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 6, pp. 945–960, 2006.
[62] R. Martins et al., “Two-step RF IC block synthesis with preoptimized inductors and full layout generation in-the-loop,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 6, pp. 989–1002, 2018.
[63] R. Martins et al., “Current-flow and current-density-aware multi-objective optimization of analog IC placement,” Integration, vol. 55, pp. 295–306, 2016.
[64] A. Patyal et al., “Analog placement with current flow and symmetry constraints using PCP-SP,” in 55th ACM/ESDA/IEEE Design Automation Conference (DAC). IEEE, 2018, pp. 1–6.
[65] P.-H. Wu et al., “A novel analog physical synthesis methodology integrating existent design expertise,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 34, no. 2, pp. 199–212, 2014.
[66] B. Xu et al., “MAGICAL: Toward Fully Automated Analog IC Layout Leveraging Human and Machine Intelligence,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2019, pp. 1–8.
[67] E. Chang et al., “BAG2: A process-portable framework for generator-based AMS circuit design,” in IEEE Custom Integrated Circuits Conference (CICC), 2018, pp. 1–8.
[68] A. Unutulmaz, G. Dündar, and F. V. Fernández, “A template router,” in 20th European Conference on Circuit Theory and Design (ECCTD). IEEE, 2011, pp. 334–337.
[69] E. Yilmaz and G. Dundar, “Analog layout generator for CMOS circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 1, pp. 32–45, 2008.
[70] C.-Y. Wu, H. Graeb, and J. Hu, “A pre-search assisted ILP approach to analog integrated circuit routing,” in 33rd IEEE International Conference on Computer Design (ICCD), 2015, pp. 244–250.
[71] R. Martins, N. Lourenco, and N. Horta, “Routing analog ICs using a multi-objective multi-constraint evolutionary approach,” Analog Integrated Circuits and Signal Processing, vol. 78, no. 1, pp. 123–135, 2014.
[72] K. Huang, H. Stratigopoulos, and S. Mir, “Fault diagnosis of analog circuits based on machine learning,” in Design, Automation Test in Europe Conference Exhibition (DATE 2010), 2010, pp. 1761–1766.
[73] H. Stratigopoulos et al., “Defect filter for alternate rf test,” in 14th IEEE European Test Symposium, 2009, pp. 101–106.
[74] M. Aminian and F. Aminian, “A modular fault-diagnostic system for analog electronic circuits using neural networks with wavelet transform as a preprocessor,” IEEE Transactions on Instrumentation and Measurement, vol. 56, no. 5, pp. 1546–1554, 2007.
[75] D. Binu and B. S. Kariyappa, “RideNN: A New Rider Optimization Algorithm-Based Neural Network for Fault Diagnosis in Analog Circuits,” IEEE Transactions on Instrumentation and Measurement, vol. 68, no. 1, pp. 2–26, 2019.
[76] F. Li and P.-Y. Woo, “Fault detection for linear analog IC-the method of short-circuit admittance parameters,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, no. 1, pp. 105–108, 2002.
[77] W. He et al., “A Naive-Bayes-Based Fault Diagnosis Approach for Analog Circuit by Using Image-Oriented Feature Extraction and Selection Technique,” IEEE Access, vol. 8, pp. 5065–5079, 2020.
[78] Z. Liu et al., “Capturing high-discriminative fault features for electronics-rich analog system via deep learning,” IEEE Transactions on Industrial Informatics, vol. 13, no. 3, pp. 1213–1226, 2017.
[79] G. Zhao et al., “A novel approach for analog circuit fault diagnosis based on Deep Belief Network,” Measurement, vol. 121, pp. 170–178, 2018.
[80] C. Zhang et al., “A multiple heterogeneous kernel RVM approach for analog circuit fault prognostic,” Cluster Computing, vol. 22, no. 2, pp. 3849–3861, 2019.
[81] H. Stratigopoulos et al., “RF Specification Test Compaction Using Learning Machines,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 18, no. 6, pp. 998–1002, 2010.
[82] F. Wang et al., “Bayesian Model Fusion: Large-Scale Performance Modeling of Analog and Mixed-Signal Circuits by Reusing Early-Stage Data,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 8, pp. 1255–1268, 2016.
[83] V. Rajan, J. Yang, S. Chakrabarty, and K. Pattipati, “Machine learning algorithms for fault diagnosis in analog circuits,” in SMC’98 Conference Proceedings. IEEE International Conference on Systems, Man, and Cybernetics, 1998, pp. 1874–1879.
[84] M. E. Tipping, “Sparse bayesian learning and the relevance vector machine,” J. Mach. Learn. Res., vol. 1, pp. 211–244, 2001.
[85] Y. Cui, J. Shi, and Z. Wang, “Analog circuits fault diagnosis using multi-valued Fisher’s fuzzy decision tree (MFFDT),” International Journal of Circuit Theory and Applications, vol. 44, no. 1, pp. 240–260, 2016.
[86] A. López-Chau et al., “Fisher’s decision tree,” Expert Systems with Applications, vol. 40, no. 16, pp. 6283–6291, 2013.
[87] F. Cilici et al., “Efficient generation of data sets for one-shot statistical calibration of RF/mm-wave circuits,” in 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), 2019, pp. 17–20.
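
Illustrative sketch (one-shot calibration, Section 6.3). The flow attributed to [25] in Section 6.3, where the CUT is measured once and a trained regressor is then used to search for the best tuning-knob setting, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the method of [25]: the names `one_shot_calibrate`, `spec_margin`, and `toy_predictor`, the knob ranges, and the specification limits are all hypothetical, and `toy_predictor` is a hand-written linear stand-in for the trained ANN. The exhaustive search over knob codes is also a simplification chosen for clarity.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence, Tuple

Knobs = Tuple[int, ...]
Spec = Tuple[float, float]  # (lower bound, upper bound)


def spec_margin(perf: Dict[str, float], specs: Dict[str, Spec]) -> float:
    """Smallest distance to any spec bound; negative if any spec is violated."""
    return min(
        min(perf[name] - lo, hi - perf[name]) for name, (lo, hi) in specs.items()
    )


def one_shot_calibrate(
    measurements: Sequence[float],
    predictor: Callable[[Sequence[float], Knobs], Dict[str, float]],
    knob_levels: Sequence[Sequence[int]],
    specs: Dict[str, Spec],
) -> Optional[Knobs]:
    """Return the knob setting with the largest predicted spec margin.

    Returns None when no setting is predicted to meet all specifications.
    """
    best, best_margin = None, float("-inf")
    for knobs in product(*knob_levels):  # exhaustive search over knob codes
        margin = spec_margin(predictor(measurements, knobs), specs)
        if margin > best_margin:
            best, best_margin = knobs, margin
    return best if best_margin >= 0.0 else None


def toy_predictor(measurements, knobs):
    """Hypothetical stand-in for the trained regressor (not the ANN of [25])."""
    bias_code, cap_code = knobs
    gain_db = measurements[0] + 0.5 * bias_code - 0.25 * cap_code
    power_mw = measurements[1] + 1.5 * bias_code
    return {"gain_db": gain_db, "power_mw": power_mw}


# Assumed specifications and knob ranges, for illustration only.
specs = {"gain_db": (20.0, 24.0), "power_mw": (0.0, 30.0)}
levels = [range(8), range(4)]

# One set of measurements from the fabricated device, then a single search.
setting = one_shot_calibrate([18.0, 20.0], toy_predictor, levels, specs)
print(setting)
```

With these assumed numbers, the uncalibrated device (both knob codes at 0) is predicted to miss the gain specification, and the search returns the setting that maximizes the worst-case margin across both specifications. In practice the predictor would be trained on measured circuits, and the search strategy would depend on the number of knobs.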