Memory BIST Area Estimator Using ANN
II. ISSUE, STATE OF THE ART & PROPOSED SOLUTION

A. Core issue
For a given memory CUT (number of words, number of bits, MUX factor, etc.), the goal is to calculate instantly, during the system-level design step and before synthesis, the area overhead due to BIST insertion. As a prerequisite, we decided that the delta between the estimated area and the area resulting from synthesis should be less than 10%.
As far as we know, this issue has never been treated in IEEE publications. The only project that succeeded in delivering a solution was conducted within one of the STMICROELECTRONICS divisions; we will not go through its details here, but it was mentioned in the introduction.
The solution we present uses an artificial neural network (ANN) technique. Several ANN are built, and the principle is to make them learn the BIST area variation while varying parameters such as the number of words or bits, across different memories or BIST-specific configurations such as redundancy and mask_bits, and also synthesis constraints such as frequency.
The learning phase produces a set of small files storing the pertinent ANN information (a set of specific weights). BARES assembles all those files, each one integrating the ANN of a specific memory BIST configuration, and once invoked through a simple command, it uses the appropriate ANN weights to calculate the estimated area instantly. In the following, we adopt the convention that integrating a memory within BARES means generating its set of ANN files and linking them to the BARES kernel.
III. PRELIMINARY STUDIES
In order to identify the different parameters that can influence the BIST area estimation, we carried out several preliminary studies. The idea was to concentrate the learning phase only on those parameters that really impact the area when varied, and at the same time to increase the ANN convergence speed. Without going into details, we reached the conclusion that the following memory or BIST-specific characteristics have a real impact on the area when varied:

A. Memory & BIST specific parameters
• CMOS technology
• Words
• Bits
• BIST frequency
• Redundancy
• Mask_bits
• Synthesis timing constraint
The synthesis timing constraints are related to slack problems on the memory busses. In order to avoid them, we fixed the maximum delay on those busses to 1.1 ns, which in our specific case prevents such problems.
For each memory, four ANN are obtained through a learning phase:
• Standard configuration
• Mask_bits
• Redundancy
• Mask_bits and redundancy
During the learning phase of those four ANN, the parameters Words, Bits and BIST frequency vary within limits that we fix according to memory and technology constraints.
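As an illustration only (the actual BARES file naming and interface are not described here), one way to key those four per-memory networks on the redundancy and mask_bits options is sketched below; the function name and the naming scheme are assumptions.

```python
# Illustrative only: one possible keying of the four per-memory ANN
# (standard, mask_bits, redundancy, mask_bits + redundancy).
def select_ann(memory: str, redundancy: bool, mask_bits: bool) -> str:
    """Return the name of the weight set to use for this BIST configuration."""
    variant = {
        (False, False): "standard",
        (False, True):  "mask_bits",
        (True,  False): "redundancy",
        (True,  True):  "mask_bits_redundancy",
    }[(redundancy, mask_bits)]
    return f"{memory}_{variant}"  # hypothetical naming, e.g. "SPHD_redundancy"

# Example: the ANN trained for a single-port memory with redundancy only.
print(select_ann("SPHD", redundancy=True, mask_bits=False))
```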
On the other hand, it was decided to ignore the MUX factor, which does not have any impact on the BIST area except in the specific case of small memories, where the MUX factor can change the BIST configuration itself. The results presented in Figure 1 show that:
• In 97.4% of cases, the area variation due to the MUX factor is lower than 4%.
• In 88.5% of cases, the area variation due to the MUX factor is lower than 2%.
The average variation is 1.06%.

Figure 1: BIST area variation for different memories with several MUX factors compared to MUX factor 4.
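The figures above are plain frequencies over the set of synthesized cases. Assuming the per-case variations are available as a list of percentages, they could be recomputed as in the following sketch; the sample values are placeholders, not the actual measurements.

```python
# Placeholder data: per-case BIST area variation (%) versus the MUX-factor-4 reference.
variations = [0.3, 1.2, 0.8, 2.5, 3.9, 0.4, 1.7]  # illustrative values only

share_below_4 = sum(v < 4.0 for v in variations) / len(variations) * 100
share_below_2 = sum(v < 2.0 for v in variations) / len(variations) * 100
average = sum(variations) / len(variations)

print(f"< 4%: {share_below_4:.1f}% of cases, < 2%: {share_below_2:.1f}% of cases, "
      f"average variation: {average:.2f}%")
```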
B. Targeted memories
BARES targets SRAM memories:
• Single port SRAM
• Dual port SRAM

C. Black boxes or memory synthesis models
The scope of this study was to evaluate the influence on the BIST area of synthesizing the BIST with, or without, a memory synthesis model. The obtained results (Table 1) show that, for the targeted memories, the area variation between using a memory synthesis model and considering the memory as a black box never exceeds 9%, while the average variation is about 2%. The cases where the difference exceeds 5% are very limited. This low variation allows us to use black boxes instead of memory models, which gives us the possibility to build a large database for the learning phase easily and quickly: we simply avoid generating a memory synthesis model for each element of the database (a long internal procedure).
Table 1: BIST area variation with and without use of memory model
(memory parameters: word, bit, mux; BIST frequency in MHz; synthesized BIST area in µm², with and without memory model; variation in %)

word   bit  mux  freq (MHz)  area with model  area without model  variation (%)
 128    4    4      200          3320.24           3403.66            2.51
 256    4    4      200          3404.75           3485.98            2.39
2048    4    4      200          3641.83           3730.74            2.44
 128    8    4      200          3521.11           3584.76            1.81
1024    4    4      200          3827.33           3822.94            0.15
 256   32    4      200          4637.36           4788.83            3.27
2048   32    4      200          4876.64           5024.81            3.04
4096   32    4      200          4949.08           5084.18            2.75
 128   64    4      200          6009.36           6150.97            2.35
 512   64    4      200          6160.82           6315.59            2.51
4096   64    4      200          6392.42           6560.35            2.63
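Assuming the Variation column is the relative difference taken against the area obtained with the memory model (which matches most rows of Table 1), the first row can be reproduced as follows.

```python
# First row of Table 1: 128 words x 4 bits, mux 4, 200 MHz.
area_with_model = 3320.24     # µm², BIST synthesized with the memory synthesis model
area_without_model = 3403.66  # µm², memory treated as a black box

variation = abs(area_without_model - area_with_model) / area_with_model * 100
print(f"variation = {variation:.2f} %")  # ~2.51 %, as reported in Table 1
```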
IV. IMPLEMENTATION

A. The multi-layer feed-forward networks
As mentioned before, the neural network used in our case is the multilayer feed-forward network with a backpropagation learning rule. A feed-forward network has a layered structure: each layer consists of units which receive their inputs from units in the layer below (n-1) and send their outputs to units in the layer above (n+1). There are no connections within a layer. Although backpropagation can be applied to networks with any number of layers, it has been shown in [5], [6], [7] and [8] that a single layer of hidden units suffices to approximate any function with finitely many discontinuities to arbitrary precision, provided the activation functions of the hidden units are non-linear (the universal approximation theorem). In our application, a feed-forward network with three layers of hidden units is used, with a sigmoid activation function for the units.
The backpropagation algorithm [10] is a gradient-descent method minimizing the squared-error cost function. It is the algorithm used for the learning phase.
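As an illustration of the structure just described, and not of the exact topology used in BARES (whose hidden-layer sizes are not given here), a forward pass through a fully connected network with three sigmoid hidden layers can be sketched as follows; the layer sizes and the weight scale are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Forward pass of a feed-forward net: each layer feeds the next, no intra-layer links."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Illustrative topology: 4 inputs (words, bits, mux, frequency), three hidden layers, 1 output (area).
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 8, 1]
weights = [rng.normal(scale=0.01, size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

x = np.array([0.25, 0.50, 0.33, 0.40])  # normalized CUT parameters in ]0, 1[
print(forward(x, weights, biases))       # normalized area estimate
```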
[…] on the learning convergence of each network (i.e. memory type). The neuron connection weights are all initialized to very small random values for all networks.

a) Learning rate
The learning procedure requires that the change in a weight be proportional to the error delta between two output patterns. True gradient descent requires that infinitesimal steps be taken; the constant of proportionality is the learning rate. For practical purposes, we choose a learning rate as large as possible while at the same time keeping the oscillations under control. In our case we chose a learning rate equal to 0.3, a value derived from several experimental tests.
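In code form, such a gradient-descent rule changes each weight by the learning rate times the error gradient for that weight; the fragment below sketches a single update with the 0.3 learning rate quoted above, the rest of the backpropagation machinery being omitted.

```python
LEARNING_RATE = 0.3  # value retained after experimental tests, per the text

def update_weight(w, grad_error):
    """One gradient-descent step: the weight change is proportional to the error gradient."""
    return w - LEARNING_RATE * grad_error

# Example: a weight whose squared-error gradient is 0.12 moves from 0.50 to 0.464.
print(update_weight(0.50, 0.12))
```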
b) Building databases
As mentioned before, the use of ANN for BIST area estimation requires a large amount of data. These databases, containing massive information on a given memory in the shape of CUT configurations associated with their synthesized BIST area, are used for the learning step of an ANN. The databases are tabular files containing the inputs (words, bits, mux and frequency) and the outputs (BIST area) of the ANN. The same databases are used later to test and validate the built networks.
The first step of integrating a memory within BARES is building its database. The decision to use black boxes for memories simplified this step a lot, since we just need to synthesize the BIST, which is generated automatically with our internal STMICROELECTRONICS BIST generator (ugnBIST). The task is done by an automatic database generator, which reads a specification file indicating the targeted memory and the BIST-specific configuration, and then goes through the synthesis flow and reports the areas.
The resulting database is then split into two parts, one used for the learning phase (75% of the initial database) and the second one for test and validation (the remaining 25%). This split is done randomly.
Finally, both databases are formatted to fit the input and output format of the neural network, where all the values must be in the range ]0, 1[.
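A minimal sketch of those last two steps, the random 75/25 split and a min-max scaling of each column into the open interval ]0, 1[, is given below; the small clipping margin and the column layout are assumptions, since the exact formatting used by BARES is not detailed.

```python
import random

def split_database(rows, train_ratio=0.75, seed=0):
    """Randomly split the tabular database into learning and test/validation parts."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]

def normalize_column(values, margin=0.01):
    """Min-max scale values into ]0, 1[ (a small margin keeps them strictly inside)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [margin + (1 - 2 * margin) * (v - lo) / span for v in values]

# Example rows: (words, bits, mux, frequency_MHz, BIST_area_um2)
db = [(128, 4, 4, 200, 3320.24), (256, 4, 4, 200, 3404.75),
      (2048, 4, 4, 200, 3641.83), (128, 8, 4, 200, 3521.11)]
learn, test = split_database(db)
print(len(learn), len(test))
print(normalize_column([r[0] for r in db]))  # words column scaled into ]0, 1[
```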
V. RESULTS
The results shown in this section are the delta errors calculated between the synthesized and the estimated BIST area, as obtained during the validation step (Section IV-3). The histograms in the figures hereafter (Figures 2 to 7) plot the error distribution for the memory BIST, reporting the error percentile range on the X axis and the number of memory CUTs on the Y axis.
Experiments have been performed on a large set of STMICROELECTRONICS SRAM (single or dual port) memories. The worst delta never exceeded 10% (the threshold fixed for this estimator); in fact, the maximum observed delta was only 6%, and in about 90% of cases the delta was under 2%.
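For reference, the delta error plotted in those histograms can be computed and binned per 1% range as sketched below; the example values are illustrative, not the actual experimental data.

```python
from collections import Counter

def delta_error(synthesized_area, estimated_area):
    """Relative delta (%) between synthesized and estimated BIST area."""
    return abs(estimated_area - synthesized_area) / synthesized_area * 100

# Illustrative CUTs: (synthesized area, estimated area) in µm².
cuts = [(3320.2, 3355.1), (3404.8, 3390.0), (4637.4, 4700.2), (6009.4, 6075.0)]
errors = [delta_error(s, e) for s, e in cuts]

# Bin the errors in 1% ranges, as in the histograms of Figures 2 to 7.
histogram = Counter(int(err) for err in errors)
for bin_start in sorted(histogram):
    print(f"[{bin_start}%, {bin_start + 1}%[ : {histogram[bin_start]} CUTs")
```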
Figure 3: Error distribution for single port memory (SPHD) with default BIST options.

The last example shown in Figure 4 is about the SPHS memory BIST used with both the redundancy and the mask_bits options. With these configurations, this BIST is the biggest one amongst the ones shown for the single port memories. The obtained results show that in about 14% of cases we got a delta error greater than 2%; meanwhile, the cases where the delta error did not exceed 1% amounted to only about 60%.
The last example shown in Figure 7 is about the DPHS memory BIST used with both the redundancy and the mask_bits options. The obtained results show that in about 25% of cases we got a delta error greater than 2%; meanwhile, the cases where the delta error did not exceed 1% amounted to only about 51%.
Figure: Automatic neuronal network integration platform. The flow starts from the specification files (memory parameters, BIST parameters, synthesis and ugnBIST version, CMOS Corelib technology files, neuronal network topology) and goes through database generation, synthesis, the split into test and learning patterns, normalization and learning, down to the testing of the resulting network.

VII. CONCLUSIONS

REFERENCES
[1] H. Narazaki and A. L. Ralescu, "An improved synthesis method for multilayered neural networks using qualitative knowledge," IEEE Trans. on Fuzzy Systems, vol. 1, no. 2, May 1993.
[2] P. Ruzicka, "Learning neural networks with respect to weight errors," IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications, vol. 40, no. 5, May 1993.
[3] B. M. Wilamowski, O. Kaynak, S. Iplikci and M. Ö. Efe, "An algorithm for fast convergence in training neural networks."
[4] B. Kröse and P. van der Smagt, "An Introduction to Neural Networks," online document, Univ. of Amsterdam, 8th edition, Nov. 1996. Available: http://neuron.tuke.sk/math.chtf.stuba.sk/pub/vlado/NN_books_texts/Krose_Smagt_neuro-intro.pdf
[5] K. Hornik, M. Stinchcombe and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, 1989, pp. 359-366.
[6] K. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, 1989, pp. 183-192.
[7] G. Cybenko, "Approximation by superposition of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, 1989, pp. 303-314.
[8] E. J. Hartman, J. D. Keeler and J. M. Kowalski, "Layered neural networks with Gaussian hidden units as universal approximations," Neural Computation, vol. 2, 1990, pp. 210-215.
[9] C. Touzet, "Les réseaux de neurones artificiels, introduction au connexionnisme," Jul. 1992. Available: http://www.up.univ-mrs.fr.
[10] S. S. Gorsche, "An efficient memory fault-test technique for ASIC-based memories," IEEE International Conference on Communications (SUPERCOMM/ICC '92), June 1992, pp. 136-141, vol. 1.
[11] Y. Wu and S. Gupta, "Built-in self-test for multi-port RAMs," Proceedings of the Sixth Asian Test Symposium (ATS '97), Nov. 1997, pp. 398-403.
[12] V. Schober, S. Paul and O. Picot, "Memory built-in self-repair using redundant words," Proceedings of the International Test Conference, Oct.-Nov. 2001, pp. 995-1001.
[13] S. Bahl and V. Srivastava, "Self-Programmable Shared BIST for Testing Multiple Memories," 13th IEEE European Test Symposium, May 2008, pp. 91-96.