
MODULE 5

Design Methodology and Tools; Testing, Debugging, and Verification

Design Methodology
Introduction
IC designers have developed and adapted strategies from allied disciplines such as software engineering to
form a cohesive set of principles to increase the likelihood of timely, successful designs.
While the broad principles of design have not changed in decades, the details of design styles and tools have
evolved along with advances in technology and increasing levels of productivity.
An integrated circuit can be described in terms of three domains: (1) the behavioral domain, (2) the structural
domain, and (3) the physical domain.
The behavioral domain specifies what we wish to accomplish with a system. At the highest level, for example, we
might want to build an ultra-low-power radio for a distributed sensor network.
The structural domain specifies the interconnection of components required to achieve the behavior we desire.
By way of example, our sensor radio might require a sensor, a radio transceiver, a processor and memory (with
software), and a power source connected in a particular manner.
The physical domain specifies how to arrange the components in order to connect them, which in turn allows
the required behavior.
Design flows from behavior to structure and ultimately to a physical implementation via a set of manual or
automated transformations. At each transformation, the correctness of the transformation is tested by
comparing the pre- and post-transformation design.
In each of these domains there are a number of design options that can be selected to solve a particular problem.
Classically, these have included the following levels of abstraction for digital chips: the architecture, register transfer (RTL), logic, and circuit levels.
For analog and RF circuits, the block diagram level replaces the logic level.
The relationship between description domains and levels of abstraction is elegantly shown by the Gajski-Kuhn
Y chart in Figure 8.1.
In this diagram, the three radial lines represent the behavioral, structural, and physical domains. Along
each line are enumerated the types of objects in that domain.
In the behavioral domain, we have represented conventional software and hardware description language
categories. As we move out along any of the radial axes, the increasing level of design abstraction is able
to represent greater complexity. Thus, in the behavioral domain, the lowest level of abstraction is an
instruction or statement in software or HDL descriptions, respectively.
Circles represent levels of similar design abstraction: the architectural, RTL, logic, and circuit levels. The
particular abstraction levels and design objects may differ slightly depending on the design method.

Structured Design Strategies


The viability of an IC is in large part affected by the productivity that can be brought to bear on the design. This
in turn depends on the efficiency with which the design can be converted from concept to architecture, to logic
and memory, to circuit, and ultimately to physical layout. A good VLSI design system should provide for consistent
descriptions in all description domains (behavioral, structural, and physical) and at all relevant levels of
abstraction (e.g., architecture, RTL/block, logic, circuit).
These parameters can be summarized in terms of:

Design is a continuous tradeoff to achieve adequate results for all of the above parameters. As such, the tools
and methodologies used for a particular chip will be a function of these parameters. Certain end results have to
be met, but other constraints may depend on economics (e.g., the size of the die affecting yield) or even subjectivity
(e.g., what one designer finds easy, another might find incomprehensible).
Given that the process of designing a system on silicon is complicated, the role of good VLSI-design aids is to
reduce this complexity, increase productivity, and assure the designer of a working product.
A good method of simplifying the approach to a design is the use of constraints and abstractions. By using
constraints, the tool designer has some hope of automating procedures and taking a lot of the "legwork" (effort)
out of a design.
By using abstractions, the designer can collapse details and arrive at a simpler object to handle.
The successful implementation of almost any integrated circuit requires attention to the details of the
engineering design process. Over the years, a number of structured design techniques have been developed to
deal with complex hardware and software projects. The techniques have a great deal of commonality. Rigorous
application of these techniques can drastically alter the amount of effort that has to be expended on a given
project.

A Software Radio—A System Example


To guide you through the process of structured design, we will use as an example a hypothetical "software radio," as
illustrated in Figure 8.2.

This device is used to transmit and receive radio frequency (RF) signals. Information is modulated onto an RF
carrier to transmit data, voice, or video. The RF carrier is demodulated to receive information.
An ideal software radio could receive any frequency and decode or encode any type of information at any data
rate, but given the limitations of current processes there are some bounds. To understand the impact of design
methods on system solutions, we will examine the software radio in more detail. This system will then form the
basis for discussion about structured approaches to design.
Figure 8.3 illustrates a typical transmit path for a generic radio transmitter, which is called an IQ modulator.
An input data stream is encoded into in-phase (I) and quadrature (Q) signals. The I and Q represent the signal
amplitudes of a (voltage) vector that varies instantaneously in time, as shown at the bottom of Figure 8.3. With
appropriate I and Q values, any form of modulated carrier can be synthesized. I is multiplied by an oscillator
(sine) operating at a frequency of F0. The quadrature (Q) signal is multiplied by the cosine of this frequency.
The resultant signals are summed and passed to a digital-to-analog converter (DAC). In the design shown,
this generates what we term an Intermediate Frequency (IF).
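Written out (following the text's convention of multiplying I by the sine and Q by the cosine; the symbols f0, A, and φ are introduced here only for illustration), the modulator output is

IF(t) = I(t)·sin(2π·f0·t) + Q(t)·cos(2π·f0·t) = A(t)·sin(2π·f0·t + φ(t)),  where A = √(I² + Q²) and φ = arctan(Q/I),

so any instantaneous carrier amplitude A and phase φ can be synthesized simply by choosing I and Q.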
Typical IQ constellations are shown in Figure 8.4.

Amplitude Modulation (AM), depicted in Figure 8.4(a), varies only the magnitude of the carrier, in accordance
with the amplitude of the modulating waveform. This is shown as a vector with an arbitrary phase angle
(which we do not care about) that travels from the origin to a point on a circle representing the
maximum value of the carrier.
In the case of an AM radio, the carrier frequency might be 800 kHz (in the AM band) and the modulation
frequencies range from roughly 300 Hz to 6 kHz (voice and music frequencies).
Phase Modulation is shown in Figure 8.4(b). Here, the vector travels around the maximum-carrier-amplitude
circle, varying the phase angle as the modulation changes. This is a constant-amplitude modulation, which
might be used with a carrier frequency of 100 MHz in the FM broadcast band (we are loosely associating
phase modulation with frequency modulation (FM), as they are closely related) and could have modulation
frequencies of 200 Hz to 20 kHz.
Finally, Figure 8.4(c) shows Quadrature Phase Shift Keying (QPSK) modulation, which is typical of data
transmission systems. Two bits of data are encoded onto four phase points as shown in the diagram.
A typical carrier frequency might be 2.4 GHz in the Industrial, Scientific and Medical (ISM) band and the
modulation data rate might be 10 Megabits/second. Generally, for high carrier frequencies, the modulation can
be performed at a moderate frequency and then "mixed" up to a higher frequency by analog multiplication. This is
completed in the analog domain and is illustrated by the blue components on the right side of Figure 8.3. An
analog multiplier (called a mixer in RF terminology) takes an analog Local Oscillator (LO) signal and the Intermediate
Frequency (IF) signal that we have generated and produces the sum and difference frequencies. Analog bandpass
filtering, or a slightly more sophisticated mixer, can be used to select the mixing component (LO+IF or LO−IF)
that we desire.
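As a short worked example of the mixing step (with generic frequencies fLO and fIF used only for illustration), multiplying two sinusoids gives

sin(2π·fLO·t)·sin(2π·fIF·t) = ½[cos(2π·(fLO − fIF)·t) − cos(2π·(fLO + fIF)·t)],

so the mixer output contains both the sum (LO + IF) and difference (LO − IF) frequencies, and the bandpass filter passes only the component we want.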
To complete the software radio, the receive path is shown in Figure 8.5.
It is roughly the reverse of the transmit path. As in the transmit case, higher frequencies can be downconverted
to lower IF frequencies that are suitable for processing by practical ADCs.
The RF signal is mixed with the LO and low-pass filtered to produce the difference frequency. For example, if
a 2.4 GHz LO is mixed with a 2.42 GHz RF signal, the 20 MHz IF signal is restored. An analog-to-digital
converter (ADC) converts the modulated IF carrier into a digital stream of data. This data is mixed (multiplied)
in the digital domain with an oscillator operating at the IF frequency.
After digital low-pass filtering (LPF), the original I and Q signals can be reconstructed and passed to a demodulator.
In summary, we see that multiplication, sine wave generation, and filtering are important for a software radio.
While the modulation and demodulation have not been described in detail, operations can include equalization
(multiplication), time to frequency conversion (fast Fourier transform), correlation, and other specialized coding
operations.

Hierarchy
The use of hierarchy, or "divide and conquer," involves dividing a system into modules, then repeating this
process on each module until the complexity of the submodules is at an appropriately comprehensible level of
detail.
The process parallels the software strategy in which large programs are split into smaller and smaller sections
until simple subroutines with well-defined behaviour and interfaces can be written. In the case of predefined
modules, the design task involves using library code intended for the required function.
The notion of "parallel hierarchy" can be used to aggregate descriptions in each of the behavioral, structural,
and physical domains that represent a design.
Equivalency tools can ensure the consistency of each domain. Because these tools can be applied hierarchically,
one can progress in verification from the bottom to the top of a design, checking each level of hierarchy where
domains are intended to correspond. For instance, a RISC processor core can have an HDL model that describes
the behavior of the processor; a gate netlist that describes the type and interconnection of gates required to
produce the processor; and a placement and routing description that describes how to physically build the
processor in a given process.
Hierarchy encourages the use of virtual components, soft versions of the more conventional packaged components.
Virtual components are placed into a design as pieces of code and come with support documentation such as verification
scripts. They can be supplied by an independent Intellectual Property (IP) provider or can be reused from a
previous product developed in the organization.
REGULARITY
Hierarchy involves dividing a system into a set of submodules. However, hierarchy alone does not solve the
complexity problem. For instance, we could repeatedly divide the hierarchy of a design into different
submodules but still end up with a large number of different submodules. With regularity as a guide, the designer
attempts to divide the hierarchy into a set of similar building blocks.
Regularity can exist at all levels of the design hierarchy. At the circuit level, uniformly sized transistors can be
used, while at the gate level, a finite library of fixed-height, variable-length logic gates can be used.
At the logic level, parameterized RAMs and ROMs can be used in multiple places. At the architectural level,
multiple identical processors can be used to boost performance.
Regularity aids verification efforts by reducing the number of subcomponents to validate and by allowing formal
verification programs to operate more efficiently. Design reuse depends on the principle of regularity to use the
same virtual component in multiple places or products.
Modularity
The tenet of modularity states that modules have well-defined functions and interfaces. If modules are "well-
formed," the interaction with other modules can be well characterized. The notion of "well-formed" may differ
from situation to situation, but a good starting point is the criteria placed on a "well-formed" software
subroutine. First of all, a clearly defined interface is required. In the case of software, this is an argument list
with typed variables. In the IC case, this corresponds to a clearly defined behavioral, structural, and physical interface
that indicates the function as well as the name, electrical, and timing constraints of the ports on the design.
Reasonable load capacitance and drive capability should be required for I/O ports.
Too large a fan-in or too small a drive capability can lead to unexpected timing problems that take effort to
solve, where we are trying to minimize effort. For noise immunity and predictable timing, inputs should only
drive transistor gates, not diffusion terminals. The physical interface specification includes such attributes as
position, connection layer, and wire width.
In common with HDL descriptions, we usually classify ports as inputs, outputs, bidirectional, power, or ground.
In addition, we would note whether a port is analog or digital. Modularity helps the designer clarify and
document an approach to a problem; it also allows a design system to more easily check the attributes of a module
as it is constructed (i.e., that outputs are not shorted to each other). The ability to divide the task into a set of
well-defined modules also aids in System-On-Chip (SOC) designs, where a number of IP sources have to be
interfaced to complete a design.
LOCALITY
By defining well-characterized interfaces for a module, we are effectively stating that other than the specified
external interfaces, the internals of the module are unimportant to other modules. In this way we are performing
a form of "information hiding" that reduces the apparent complexity of the module.
In the software and HDL world, this is paralleled by reducing global variables to a minimum. Increasingly,
locality means temporal locality, or adherence to a clock or timing protocol. One of the central themes of
temporal locality is to reference all signals to a clock. Thus, input signals are specified with required setup and
hold times relative to the clock, and outputs have delays related to the edges of the clock.

TESTING, DEBUGGING, AND VERIFICATION


Introduction
While in real estate the refrain is "Location! Location! Location!" the comparable advice in IC design should
be "Testing! Testing! Testing!"
For many chips, testing accounts for more effort than design.
Tests fall into three main categories. The first set of tests verifies that the chip performs its intended function.
These tests are run before tapeout to verify the functionality of the circuit and are called functionality tests or
logic verification.
The second set of tests is run on the first batch of chips that return from fabrication. These tests confirm that
the chip operates as it was intended and help debug any discrepancies. They can be much more extensive than
the logic verification tests because the chip can be tested at full speed in a system.
The third set of tests verifies that every transistor, gate, and storage element in the chip functions correctly. These
tests are conducted on each manufactured chip before shipping to the customer to verify that the silicon is
completely intact. These will be called manufacturing tests. In some cases, the same tests can be used for all
three steps, but often it is easier to use one set of tests to chase down logic bugs and another, separate set
optimized to catch manufacturing defects.
Because of the complexity of the manufacturing process, not all die on a wafer function correctly. Dust particles
and small imperfections in starting material or photomasking can result in bridged connections or missing
features. These imperfections result in what is termed a fault. The goal of a manufacturing test procedure is to
determine which die are good and should be shipped to customers.
Testing a die (chip) can occur at the wafer level, the packaged-chip level, the board level, the system level, or in the field.

By detecting a malfunctioning chip early, the manufacturing cost can be kept low. For instance, the
approximate cost to a company of detecting a fault at the various levels is:

Obviously, if faults can be detected at the wafer level, the cost of manufacturing is lower.
It is interesting to note that most failures of first-time silicon result from problems with the functionality of the
design; that is, the chip does exactly what the simulator said it would do, but for some reason this functionality
is not what the rest of the system expects.
LOGIC VERIFICATION
Verification tests are usually the first ones a designer might construct as part of the design process. Figure 9.1
shows that we may want to prove that the RTL is equivalent to the design specification at a higher
behavioral or specification level of abstraction.

The behavioral specification might be a verbal description; a plain-language textual specification; a description
in some high-level computer language such as C, FORTRAN, Pascal, or LISP; a program in a system-modeling
language such as SystemC; a hardware description language such as VHDL or Verilog; or simply a table of
inputs and required outputs.
Functional equivalence involves running a simulator at some level on the two descriptions of the chip and
ensuring that the outputs are equivalent at some convenient checkpoints in time for all inputs applied. This is
most conveniently done in an HDL by employing a test bench, i.e., a wrapper that surrounds a module and provides
for stimulus and automated checking.
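As a minimal sketch of such a test bench in Verilog (the DUT name filter8, its ports, the vector count, and the file names stim.hex and expected.hex are assumptions for illustration, not taken from these notes), stimulus produced by a golden model is applied on each cycle and the output is compared against the expected file:

// Minimal self-checking test bench sketch for a hypothetical DUT "filter8".
// Stimulus and expected responses are assumed to come from files written by a
// golden model (e.g., a C or MATLAB reference).
module filter8_tb;
  reg         clk = 0;
  reg         rst = 1;
  reg  [7:0]  din;
  wire [7:0]  dout;
  reg  [7:0]  stim     [0:1023];   // stimulus vectors
  reg  [7:0]  expected [0:1023];   // expected outputs from the golden model
  integer     i, errors;

  filter8 dut (.clk(clk), .rst(rst), .din(din), .dout(dout));  // device under test

  always #5 clk = ~clk;            // 10 ns clock period

  initial begin
    errors = 0;
    $readmemh("stim.hex",     stim);
    $readmemh("expected.hex", expected);
    @(negedge clk) rst = 0;
    for (i = 0; i < 1024; i = i + 1) begin
      din = stim[i];
      @(posedge clk);              // apply one vector per cycle
      #1;                          // sample after the edge has settled
      // Simplification: assumes the expected file is already aligned with the
      // DUT latency; a real bench would account for pipeline delay.
      if (dout !== expected[i]) begin
        errors = errors + 1;
        $display("Mismatch at vector %0d: got %h, expected %h", i, dout, expected[i]);
      end
    end
    if (errors == 0) $display("PASS: all vectors matched");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule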
The most detailed check might be on a cycle-by-cycle basis. Increasingly, verification involves real-time or near
real-time emulation in an FPGA-based system to confirm system-level performance. This is recommended because
of the increasing complexity of chips and the systems they implement.
One can check functional equivalence through simulation at various levels of the design hierarchy. If the description
is at the RTL level, the behavior at a system level may be able to be fully verified. For instance, in the case of
a microprocessor, you can boot the operating system and run key programs on the behavioral description.
The advice with respect to writing functional tests is to simulate as closely as possible the way in which the chip or
system will be used in the real world. Often this is impractical due to slow simulation times and extremely long
verification sequences.
One approach is to move up the simulation hierarchy as modules become verified at lower levels. For instance,
one could replace the gate-level adder and register modules in a video filter with functional models and then, in turn,
replace the filter itself with a functional model. At each level, one can write small tests to verify the equivalence
between the new higher-level functional model and the lower-level gate or functional model.
At the top level, you can surround the filter functional model with a software environment that models the
real-world use of the filter.
Verification at the top chip level using an FPGA emulator offers several advantages over simulation and, for
that matter, the final chip implementation. Most noticeably, the emulation can run in real time. This means
that actual analog signals (if used) can be interfaced with the chip. Additionally, to assess system
performance, one can introduce fine levels of observation and monitoring that might not be included in the final
chip. The amount of verification effort greatly exceeds the design effort.
"If you don't test it, it won't work! (guaranteed)"

Basic Digital Debugging Hints


Many times, when a chip returns from fabrication, the first set of tests is run in a lab environment, so you need to
prepare for this event. You can begin by constructing a circuit board that provides the following attributes:
You write software routines to interface to the chip through the serial or parallel port or the bus interface. The
chip should have a serial UART port interface that can be used independently of the normal operation of the chip.
The lowest level of the software should provide for peeking (reading) and poking (writing) registers in the chip.
An alternate or complementary approach is to provide interfaces for a logic analyzer. These are easily added to
a PCB design in the form of multi-pinned headers.
Figure 9.2 shows a typical test board, illustrating the zero insertion force (ZIF) socket for the chip (in the center
of the board), an area for analog circuitry interface (on the left), a set of headers for logic analyzer connection
(at the top and bottom), and a set of programmable power supplies (on the right).

In addition, an interface is provided to the serial port of a PC (at the bottom left).


One should start with a "smoke test." This involves ramping the supply voltages from zero to Vdd while
monitoring the current without any clocks running. For a fully static circuit, the current should remain at zero.
Following this, you can enable the clock(s); beware that many CMOS chips appear to operate when the clock
is connected but the power supply is turned off, because the clock may partially power the chip through the input
protection diodes on the input pads. If possible, one should initially run the clock at reduced speed so that setup
time failures are not the initial culprit in any debug operation.
In the case of a digital circuit, one should examine various registers for health using PC-based peek-and-poke
software. This checks the integrity of the signal path from the PC to the chip. Often, designers place an ID in
the register at address zero. Peeking at this register proves the read path from the chip. If the chip registers are
reset to a known state, the registers can be read sequentially and compared with the design values.
If the chip has built-in self-test (BIST), the commercial software that provides this functionality over a boundary
scan interface can be run. This type of system automatically runs a set of tests on the chip that completely verify
the correct operation of all gates and registers as defined by the original RTL description. If this kind of test
interface was not included, the internal registers must be exercised manually via the peek-and-poke interface.
If anomalous behaviour is detected, one must go about debugging. The basic method is to postulate a mechanism
of failure and then test the hypothesis. Debugging is an art in itself, but some pointers are as follows:
When the chip is demonstrated to be operational, you can measure more subtle aspects of the design such as
performance (power, speed, analog characteristics). This involves normal lab techniques of configure, measure,
and record. Where possible, store all results in computer-readable form for communication with colleagues.
For the most part, if a digital chip simulates at the gate level and passes timing analysis checks during design, it
will do exactly the same in silicon. Possible deviations from the simulated circuit occur in the following cases:

With analog circuitry, a wide range of issues can affect performance over and above what was simulated. These
include power and ground noise, substrate noise, and temperature and process effects. However, you can employ
the same basic debug approaches.

Manufacturing Tests
Whereas verification or functionality tests seek to confirm the function of the chip as a whole, manufacturing tests
are used to verify that every gate operates as expected. The need to do this arises from a number of
manufacturing defects that might occur during either chip fabrication or accelerated life testing. Typical defects
include:
layer-to-layer shorts (e.g., metal-to-metal)
discontinuous wires (e.g., metal thinning when crossing vertical topology jumps)
missing or damaged vias
shorts through the thin gate oxide to the substrate or well
These in turn lead to particular circuit maladies, including:
nodes shorted to power or ground
nodes shorted to each other
inputs floating / outputs disconnected
Tests are required to verify that each gate and register is operational and has not been compromised by a
manufacturing defect.

Apart from the verification of internal gates, I/O integrity is also tested, with the following tests being
completed:

I/O levels
speed test

Full-speed wafer testing can be completed with a minimum of connected pins. This can be important in reducing
the cost of the wafer test fixture.
In general, manufacturing test generation assumes the function of the circuit/chip is correct. It requires ways of
exercising all gate inputs and monitoring all gate outputs.
Logic Verification Principles
Test Benches and Harnesses
A verification test bench or harness is a piece of HDL code that is placed as a wrapper around a core piece of
HDL. In the simplest test bench, inputs are applied to the module under test and, at each cycle, the outputs are
examined to determine whether they comply with a predefined expected data set. The data set can be derived
from another model and be available as a file, or the values can be computed on the fly.
The file-comparison method is applicable to a wide range of simulation scenarios, as files form a common basis
for I/O between different design systems. The notion of a golden model is frequently used as the reference for
establishing functional equivalence. A golden model might be a model of the system being designed in a high-
level language such as C, or in a design tool such as MATLAB. The golden model writes expected output files
that are used as the basis for comparison.
Simulators usually provide settable break points and single or multiple stepping abilities to allow the designer
to step through a test sequence while debugging discrepancies.

Regression Testing
High-level language scripts are frequently used when running large test benches, especially for regression
testing. Regression testing involves performing a suite of simulations to automatically verify that no
functionality has inadvertently changed in a module or set of modules. During a design, it is common practice
to run a regression script every night after design activities have concluded to check that bug fixes or feature
enhancements have not broken completed modules.

Version Control
Combined with regression testing is the use of versioning, that is, the orderly management of different design
iterations. Unix/Linux tools such as CVS are useful for this.

Bug Tracking
Another important tool to use during verification (and in fact the whole design cycle) is a bug-tracking system.
Bug-tracking systems such as the Unix/Linux-based GNATS allow the management of a wide variety of bugs.
In these systems, each bug is entered and the location, nature, and severity of the bug are noted. The bug discoverer
is recorded, along with the person perceived to be responsible for fixing the bug.

Manufacturing Test Principles


A critical factor in all VLSI design is the need to incorporate methods of testing circuits. This task should
proceed concurrently with any architectural considerations and not be left until fabricated parts are available (as
is a recurring temptation to designers).
Figure 9.7(a) shows a combinational circuit with N inputs. To test this circuit exhaustively, a sequence of 2^N
inputs (or test vectors) must be applied and the outputs observed to fully exercise the circuit.

This combinational circuit can be converted to a sequential circuit with the addition of M registers, as shown in Figure
9.7(b). The state of the circuit is determined by the inputs and the previous state. A minimum of 2^(N+M) test vectors
must be applied to exhaustively test the circuit. Manufacturing test engineers must cleverly devise test vectors
that detect any (or nearly any) defective node without requiring so many patterns.
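Purely for illustration, consider a block with N = 25 inputs and M = 50 state registers: exhaustive testing would need 2^(25+50) = 2^75 ≈ 3.8 × 10^22 vectors. Even at one vector per nanosecond this is roughly 3.8 × 10^13 seconds, on the order of a million years, which is why cleverly chosen vectors and the structured design-for-test techniques described later are essential.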

Models
To deal with the existence of good and bad parts, it is necessary to propose a fault model, i.e., a model for how
faults occur and their impact on circuits. The most popular model is called the Stuck-At model. The Short
Circuit/Open Circuit model can be a closer fit to reality, but is harder to incorporate into logic simulation tools.
Stuck-at Faults
In the Stuck-At model, a faulty gate input is modeled as stuck at zero (Stuck-At-0, S-A-0) or stuck at one
(Stuck-At-1, S-A-1). This model dates from board-level designs, where it was determined to be adequate for
modeling faults. Figure 9.8 illustrates how an S-A-0 or S-A-1 fault might occur. These faults most frequently
occur due to gate oxide shorts or metal-to-metal shorts.

Short circuit and open circuit Faults


Other models include stuck-open or shorted models. Two bridging or shorted faults are shown in Figure 9.9.
The short S1 results in an S-A-1 fault at input A, while short S2 modifies the function of the gate. It is evident
that, to ensure the most accurate modeling, faults should be modeled at the transistor level, because it is only at this level
that the complete circuit structure is known. For instance, in the case of a simple NAND gate, the intermediate node
between the series nMOS transistors is hidden by the schematic.
A particular problem that arises with CMOS is that it is possible for a fault to convert a combinational circuit
into a sequential circuit. This is illustrated in Figure 9.10 for the case of a 2-input NOR gate in which one of the
transistors is rendered ineffective.

If nMOS transistor A is stuck open, then the function displayed by the gate will be

Z = NOT(A + B) + A · NOT(B) · Zprev

where Zprev is the previous state of the gate. As another example, if either pMOS transistor is missing, the node
would be arbitrarily charged until one of the nMOS transistors discharged the node. Thereafter, it would remain
at zero, barring charge leakage effects.
It is also possible for transistors to exhibit a stuck-open or stuck-closed state. Stuck-closed states can be detected
by observing the static VDD current (Idd) while applying test vectors. Consider the fault shown in Figure 9.11,
where the drain connection of a pMOS transistor in a 2-input NOR gate is shorted to Vdd.

This could physically occur if stray metal overlapped the Vdd line and the drain connection as shown. If we apply
the test vector 01 or 10 to the A and B inputs and measure the static Idd current, we will notice that it rises to
some value determined by the size of the nMOS transistors.
Observability
The observability of a particular circuit node is the degree to which you can observe that node at the outputs of
an integrated circuit. This metric is relevant when you want to measure the output of a gate within a larger
circuit to check that it operates correctly. Given the limited number of nodes that can be directly observed, it is
the aim of good chip designers to have easily observed gate outputs.
Adoption of some basic design for test techniques can aid tremendously in this respect. Ideally, you should be
able to observe directly or with moderate indirection (i.e., you may have to wait a few cycles) every gate output
within an integrated circuit. While at one time this aim was hindered by the expense of extra test circuitry and
a lack of design methodology, current processes and design practices allow you to approach this ideal.

Controllability
The controllability of an internal circuit node within a chip is a measure of the ease of setting the node to a 1 or
0 state. This metric is of importance when assessing the degree of difficulty of testing a particular signal internal to the chip.

Automatic Test Pattern Generation (ATPG)


Historically, in the IC industry, logic and circuit designers implemented the functions at the RTL or schematic
level, mask designers completed the layout, and test engineers wrote the tests. In many ways, the test engineers
were the Sherlock Holmes of the industry, reverse engineering circuits and devising tests that would exercise the
circuits in an adequate manner.
For the longest time, test engineers implored circuit designers to include extra circuitry to ease the burden of
test generation. Happily, as processes have increased in density and chips have increased in complexity,
inclusion of test circuitry has become less of an overhead for both the designer and the manager worried about
the cost of the die. In addition, as tools have improved, more of the burden for generating tests has fallen on the
designer. To deal with this burden, Automatic Test Pattern Generation (ATPG) methods have been invented.
The use of some form of ATPG is standard for most digital designs.
Commercial ATPG tools can achieve excellent fault coverage. However, they are computation-intensive and
often must be run on servers or compute farms with many parallel processors. Some tools use statistical
algorithms to predict the fault coverage of a set of vectors without performing as much simulation.

Delay Fault Testing


The fault models dealt with until this point have neglected timing. Failures that occur in CMOS could leave the
functionality of the circuit untouched, but affect its timing. For instance, consider the layout shown in Figure
9.12 for an inverter gate composed of paralleled nMOS and pMOS transistors.
If an open circuit occurs in one of the nMOS transistor source connections to GND, then the gate would still
function, but with increased pull-down (fall) delay. In addition, the fault now becomes sequential, as the detection of the fault
depends on the previous state of the gate.
Delay faults may be caused by crosstalk. Delay faults can also occur more often in SOI logic through the history
effect. Software has been developed to model the effect of delay faults, which are becoming a more important
failure mode as processes scale.

Design for Testability


The keys to designing circuits that are testable are controllability and observability. Restated, controllability is
the ability to set (to 1) and reset (to 0) every node internal to the circuit. Observability is the ability to observe,
either directly or indirectly, the state of any node in the circuit.
Good observability and controllability reduce the cost of manufacturing test because they allow high fault
coverage with relatively few test vectors. Moreover, they can be essential to silicon debug because physically
probing most internal signals has become so difficult.
There are three main approaches to what is commonly called Design for Testability (DFT): ad hoc testing,
scan-based approaches, and built-in self-test (BIST). These are described in the following sections.
Ad hoc Testing
Ad hoc test techniques, as their name suggests, are collections of ideas aimed at reducing the combinational
explosion of testing. They are only useful for small designs where scan, ATPG, and BIST are not available. A
complete scan-based testing methodology is recommended for all digital circuits. Having said that, common
techniques for ad hoc testing involve:

A technique classified in this category is the use of the bus in a bus-oriented system for test purposes. Each
register has been made loadable from the bus and capable of being driven onto the bus. Here, the internal logic
values that exist on a data bus are enabled onto the bus for testing purposes.
Frequently, multiplexers can be used to provide alternative signal paths during testing. In CMOS, transmission
gate multiplexers provide low area and delay overhead.
Any design should always have a method of resetting the internal state of the chip within a single cycle or at
most a few cycles. Apart from making testing easier, this also makes simulation faster, as only a few cycles are
required to initialize the chip.
In general, ad hoc testing techniques represent a bag of tricks developed over the years by designers to avoid
the overhead of a systematic approach to testing, as will be described in the next section. While these general
approaches are still quite valid, today's process densities and chip complexities necessitate a structured approach
to testing.

Scan Design
The scan-design strategy for testing has evolved to provide observability and controllability at each register. In designs
with scan, the registers operate in one of two modes. In normal mode, they behave as expected. In scan mode,
they are connected to form a giant shift register called a scan chain spanning the whole chip. By applying N
clock pulses in scan mode, all N bits of state in the system can be shifted out and N new bits of state can be
shifted in. Therefore, scan mode gives easy observability and controllability of every register in the system.
Modern scan is based on the use of scan registers, as shown in Figure 9.13.

The scan register is a D flip-flop preceded by a multiplexer. When the SCAN signal is deasserted, the register behaves
as a conventional register, storing data on the D input. When SCAN is asserted, the data is loaded from the SI
pin, which is connected in shift-register fashion to the Q output of the previous register in the scan chain.
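A behavioral Verilog sketch of such a scan register (the port names are illustrative assumptions) makes the two modes explicit:

// Scan flip-flop sketch: a 2:1 multiplexer in front of an ordinary D flip-flop.
// In normal mode (scan = 0) it captures D; in scan mode (scan = 1) it shifts in
// data from SI, and Q doubles as the SO output that feeds the next register.
module scan_dff (
  input  wire clk,
  input  wire scan,   // SCAN mode select
  input  wire d,      // functional data input
  input  wire si,     // scan input, driven by the previous register's SO
  output reg  q,      // functional output
  output wire so      // scan output to the next register in the chain
);
  assign so = q;
  always @(posedge clk)
    q <= scan ? si : d;
endmodule

A chain is formed simply by wiring each register's SO to the SI of the next, with SCAN and CLK common to all registers.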
For the circuit shown, to load the scan chain, SCAN is asserted and CLK is pulsed eight times to load the first
two ranks of 4-bit registers with data. SCAN is deasserted and CLK is asserted for one cycle to operate the
circuit normally with predefined inputs.
SCAN is then reasserted and CLK asserted eight times to read the stored data out. At the same time, the new
register contents can be shifted in for the next test. Testing proceeds in this manner of serially clocking the data
through the scan register to the right point in the circuit, running a single system clock cycle and serially clocking
the data out for observation.
In this scheme, every input to the combinational block can be controlled and every output can be observed. In
addition, running a random pattern of 1's and 0's through the scan chain can test the chain itself.
Test generation for this type of test architecture can be highly automated: ATPG can be used for the
combinational blocks, and the scan chain itself is easily tested. The prime disadvantage is the area and delay impact of the
extra multiplexer in the scan register. Designers (and managers alike) are in widespread agreement that this cost
is more than offset by the savings in debug time and production test cost.

Parallel Scan
One can imagine that serial chains become quite long, and the loading and unloading can dominate testing time.
A fairly simple idea is to split the chains into smaller segments. This can be done on a module-by-module basis
or completed automatically to some specified scan length. Extending this to the limit yields an extension to
serial scan called random access scan. To some extent, this is similar to that used inside FPGAs to load and read
the control RAM. The basic idea is shown in Figure 9.14.
Partial Scan
Sometimes, making every register scannable is too expensive, so only a partial set is scanned. In the CORDIC
structure introduced earlier, each CORDIC slice has three m-bit registers. Converting all of these to scan registers
may not be desirable. As the structure is a data pipeline, the test registers can be placed on the input and output
of the pipeline, as shown in Figure 9.15.

Partial scan is a throwback to ad hoc testing and should be avoided except in unusual circumstances.

Circuit Design of Scannable Elements


As we have seen, an ordinary flip-flop can be made scannable by adding a multiplexer on the data input, as
shown in Figure 9.16(a).
Figure 9.16(b) shows a circuit design for such a scan register using a transmission gate multiplexer. The setup
time increases by the delay of the extra transmission gate in series with the D input as compared to the ordinary static
flip-flop.
Figure 9.16(c) shows a circuit using clock gating to obtain nearly the same setup time as the ordinary flip-flop.
In either design, if a clock enable is used to stop the clock to unused portions of the chip, care must be taken
that the clock always toggles during scan mode.
During scan mode, the flip-flops are connected back-to-back. Clock skew can lead to hold time problems in the
scan chain. These problems can be overcome by adding delay buffers on the SI input to flip-flops that might see
large clock skews.
Another approach is to use nonoverlapping clocks to ensure hold times. For example, the Level Sensitive Scan
Design (LSSD) methodology developed at IBM uses flip-flops with two-phase nonoverlapping clocks like those of
Figure 7.21. During scan mode, a scan clock is toggled in place of one of the system clock phases, as shown in Figure 9.17.

The nonoverlapping clocks also prevent hold time problems in normal operation, but increase the sequencing
overhead of the flip-flop.

Systems using latches can also be modified for scan. Typically, a scan input and an extra slave scan latch are
added to convert the latch into a scannable flip-flop. Figure 9.18 shows a scannable transparent latch. The scan
clock is generated on-chip with a clock chopper that converts the external low-frequency scan clock into short
on-chip pulses. The scan chain must also be checked for hold time races. Note that the SO transmission gate is ON
during normal operation, loading the Q output and increasing power consumption through spurious transitions
on Z and SO.
Many designers would elect to use a second scan clock wire to avoid these problems. Domino pipelines can also
be scanned. Traditional domino pipelines incorporate scan into the two-phase transparent latches on the
half-cycle boundaries. Skew-tolerant domino eliminates the latches and must include scan directly in the domino
gate. One natural point to scan is the last gate of each cycle.
Figure 9.20(a) shows how to make the last gate of each cycle in a skew-tolerant domino pipeline scannable.
The last dynamic gate has a full keeper and thus will retain its state whether high or low. The scan technique
resembles that of the transparent latch from Figure 9.18(c). The key is to turn off both the precharge and evaluation
transistors so the output node floats and behaves like a master latch. Then a two-phase scan clock is toggled to
shift data first onto the master node and then into the slave scan latch.

These scan clocks bear no relationship to the domino clock phases. During scan, gclk is
stopped low, so the local clock phase is high and the precharge transistor is off. A special clock gate forces the
evaluation clock low during scan to turn the evaluation transistor off. When scan is complete, gclk rises so the
next domino gate resumes normal operation. This scan approach adds a small amount of loading on the critical
path through the dynamic gate.
Figure 9.20(b) shows a clock gate that produces the domino clock phases. It uses an SR latch to stop and release
the clock during scan, as illustrated in Figure 9.20(c). The gate also accepts an enable to stop the domino clocks
when the pipeline is idle.
The Itanium 2 provides domino scan in a similar fashion, but with a single-phase scan clock that is compatible
with scan of the Naffziger pulsed latches. The last domino gate in each half-cycle uses a dynamic latch converter
(DLC). Scan circuitry can be added to the DLC in much the same way as it is added to a latch.
Robust scan circuitry obeys a number of rules to avoid electrical failures. SI is locally buffered to prevent
problems with directly driving diffusion inputs and overdriving feedback inside the latch. The output is also
buffered, so noise cannot back-drive the state node. Two-phase nonoverlapping clocks prevent hold-time problems,
and static feedback on the state node allows low-frequency operation. All internal nodes should swing rail-to-rail.
These rules can be bent to save area at the expense of greater electrical verification on the scan chain, as was
done for the Itanium 2.

Built-in Self-Test (BIST)


Self-test and built-in test techniques, as their names suggest, rely on augmenting circuits to allow them to
perform operations upon themselves that prove correct operation. These techniques add area to the chip for the
test logic, but reduce the test time required and thus can lower the overall system cost. The literature offers
extensive coverage of the subject from the implementer's perspective.
One method of testing a module is to use signature analysis or cyclic redundancy checking. This involves using
a pseudo-random sequence generator (PRSG) to produce the input signals for a section of combinational
circuitry and a signature analyzer to observe the output signals.
A PRSG is defined by a polynomial of some length n. It is constructed from a linear feedback shift register
(LFSR), which in turn is made of n flip-flops connected in a serial fashion, as shown in Figure 9.22(a).
The XOR of particular outputs is fed back to the input of the LFSR. An n-bit LFSR will cycle through 2^n − 1 states
before repeating the sequence. LFSRs are described by a characteristic polynomial indicating which bits are fed
back. A complete feedback shift register (CFSR), shown in Figure 9.22(b), includes the zero state that may be
required in some test situations. An n-bit LFSR is converted to an n-bit CFSR by adding an (n−1)-input NOR gate.
When in state 0...01, the next state is 0...00. When in state 0...00, the next state is 10...0. Otherwise, the sequence
is the same. Alternatively, the bottom n bits of an (n+1)-bit LFSR can be used to cycle through the all-zeros state
without the delay of the NOR gate.
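A behavioral Verilog sketch of a small LFSR follows; the 3-bit width and the particular feedback taps are assumptions chosen for illustration (they give a maximal-length sequence):

// 3-bit LFSR sketch: cycles through all 2^3 - 1 = 7 nonzero states before
// repeating, for this choice of feedback taps.
module lfsr3 (
  input  wire       clk,
  input  wire       load,   // load a nonzero seed; the all-zeros state locks up
  output reg  [2:0] q
);
  wire fb = q[2] ^ q[1];    // XOR of the tapped bits fed back to the input
  always @(posedge clk)
    if (load) q <= 3'b001;  // nonzero seed
    else      q <= {q[1:0], fb};
endmodule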
A signature analyzer receives successive outputs of a combinational logic block and produces a syndrome that is
a function of these outputs. The syndrome is reset to 0, and then XORed with the output on each cycle. The
syndrome is swizzled each cycle so that a fault in one bit is unlikely to cancel itself out.
At the end of a test sequence, the LFSR contains the syndrome that is a function of all previous outputs. This
can be compared with the correct syndrome to determine whether the circuit is good or bad. If the syndrome
contains enough bits, it is extremely improbable that a defective circuit will produce the correct syndrome.

BILBO
The combination of signature analysis and the scan technique creates a structure known as BILBO—for Built-In
Logic Block Observation—or BIST—for Built-In Self-Test. The 3-bit BILBO register shown in Figure 9.23 is a
scannable, resettable register that can also serve as a pattern generator and signature analyzer.

A 2-bit control input specifies the mode of operation. In the reset mode (10), all the flip-flops are synchronously initialized to
0. In normal mode (11), the flip-flops behave normally with their D input and Q output. In scan mode (00), the
flip-flops are configured as a 3-bit shift register between SI and SO.
In test mode (01), the register behaves as a pseudo-random sequence generator or signature analyzer. If all the
D inputs are held low, the Q outputs loop through a pseudo-random bit sequence, which can serve as the input
to the combinational logic. If the D inputs are driven from the combinational logic outputs, they are swizzled with the
existing state to produce the syndrome. In summary, BIST is performed by first resetting the syndrome in the
output register. Then both registers are placed in the test mode to produce the pseudo-random inputs and
calculate the syndrome. Finally, the syndrome is shifted out through the scan chain.
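The following behavioral sketch captures the four modes of a 3-bit BILBO register using the mode encoding above; the port names and the feedback taps are assumptions for illustration:

// 3-bit BILBO register sketch: resettable register, scan register, pseudo-random
// sequence generator (PRSG), and signature analyzer selected by a 2-bit mode.
module bilbo3 (
  input  wire       clk,
  input  wire [1:0] mode,   // 10 = reset, 11 = normal, 00 = scan, 01 = test
  input  wire [2:0] d,      // parallel data from the combinational logic
  input  wire       si,     // scan in
  output reg  [2:0] q,      // register state / parallel outputs
  output wire       so      // scan out
);
  assign so = q[2];
  wire fb = q[2] ^ q[1];    // LFSR feedback (example maximal-length taps)

  always @(posedge clk)
    case (mode)
      2'b10: q <= 3'b000;             // reset: synchronously clear
      2'b11: q <= d;                  // normal: plain register
      2'b00: q <= {q[1:0], si};       // scan: shift SI -> q[0] -> q[1] -> q[2] -> SO
      2'b01: q <= d ^ {q[1:0], fb};   // test: PRSG when d = 0, signature analyzer otherwise
    endcase
endmodule

In a BIST run, the register driving the logic is held in test mode with its D inputs at 0 (acting as the PRSG), the register capturing the logic outputs runs in test mode as the signature analyzer, and the final syndrome is shifted out in scan mode.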
Various companies have commercial design aid packages that support BIST. LogicVision has a package called
Logic BIST, which takes a synthesized netlist and adds the scan registers and PRSG circuits automatically. It
then checks the fault coverage and provides generated scripts to use boundary scan to run tests on the final chip.
As an example, on a WLAN modem chip comprising roughly 1 million gates, a full at-speed test takes under a
second with BIST. This comes with roughly a 7.3% overhead in the core area (but effectively zero because the
design was pad-limited) and a 99.7% fault coverage level. The WLAN modem parts designed in this way were
fully tested in less than ten minutes on receipt of first silicon. This kind of test method is incredibly valuable for
productivity in manufacturing test generation.

Memory Self-test
Testing large memories on a production tester can be expensive because they contain so many bits and thus
require so many test vectors. Embedding self-test circuits with the memories can reduce the number of external
test vectors that have to be run. A typical read/write memory (RAM) test program for an M-bit address memory
might be as follows:

where data is 1 and ~data is 0 for a single-bit memory, or a selected set of patterns for an n-bit word.
A simpler test involves writing all zeros, all ones, and alternating ones and zeros. An address counter, some
multiplexers, and a simple state machine result in a low-overhead self-test structure for read/write memories.
The self-test consists of 256 K cycles that write a checkerboard pattern of alternating 1's and 0's to test for
cell-to-cell interference. This is followed by 256 K cycles in which the data is read out. Then a complemented
checkerboard is written and read. A total of roughly 1 million cycles provides a test sufficient for system maintenance.
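A sketch of such a checkerboard self-test controller is shown below; the address and data widths, the port names, and the assumption of a RAM with same-cycle (combinational) read are illustrative, not from the notes:

// Checkerboard memory self-test sketch: write 01/10 patterns by address parity,
// read back and compare, then repeat with the complemented pattern.
module mem_bist #(parameter AW = 10, DW = 8) (
  input  wire          clk,
  input  wire          rst,      // synchronous reset
  input  wire          start,    // pulse to begin the self-test
  input  wire [DW-1:0] rdata,    // read data from the RAM under test
  output wire [AW-1:0] addr,
  output wire [DW-1:0] wdata,
  output wire          we,
  output reg           busy,
  output reg           fail
);
  reg [AW-1:0] count;
  reg [1:0]    phase;   // 0: write cb, 1: read cb, 2: write ~cb, 3: read ~cb

  // Checkerboard pattern alternates by address LSB; the second pass complements it.
  wire [DW-1:0] cb = count[0] ? {(DW/2){2'b10}} : {(DW/2){2'b01}};
  assign addr  = count;
  assign wdata = phase[1] ? ~cb : cb;   // also serves as the expected read value
  assign we    = busy & ~phase[0];      // write during phases 0 and 2

  always @(posedge clk) begin
    if (rst) begin
      busy <= 1'b0; fail <= 1'b0; phase <= 2'd0; count <= {AW{1'b0}};
    end else if (start && !busy) begin
      busy <= 1'b1; fail <= 1'b0; phase <= 2'd0; count <= {AW{1'b0}};
    end else if (busy) begin
      if (phase[0] && (rdata != wdata))  // compare during the read phases
        fail <= 1'b1;
      if (count == {AW{1'b1}}) begin     // finished one full pass through memory
        count <= {AW{1'b0}};
        if (phase == 2'd3) busy <= 1'b0; // done after the complemented read pass
        else phase <= phase + 2'd1;
      end else begin
        count <= count + 1'b1;
      end
    end
  end
endmodule

With AW = 18 (a 256 K-word memory), the four passes total roughly 1 million cycles, matching the count quoted above.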
ROM memories can be tested by placing a signature analyzer at the output of the ROM and incorporating a test mode
that cycles through the contents of the ROM. A significant advantage of all self-test methods is that testing can
be performed when the part is in the field. With care, self-test can even be done during normal system operation.

IDDQ TESTING
A method of testing for bridging faults is called IDDQ testing (VDD supply current quiescent), or supply current
monitoring. This relies on the fact that when a complementary CMOS logic gate is not switching, it draws no
DC current (except for leakage). When a bridging fault occurs, then for some combination of input conditions,
a measurable DC Idd will flow. Testing consists of applying the normal vectors, allowing the signals to settle,
and then measuring Idd. As potentially only one gate is affected, the IDDQ test has to be very sensitive. In
addition, to be effective, any circuits that draw DC power, such as pseudo-nMOS gates or analog circuits, have
to be disabled. Dynamic gates can also cause problems. As current measuring is slow, the tests must be run
slower than normal, which increases the test time.
IDDQ testing can be completed externally to the chip by measuring the current drawn on the Vdd line, or internally
using specially constructed test circuits. This technique gives a form of indirect massive observability at little
circuit overhead. However, as subthreshold leakage current increases, IDDQ testing ceases to be effective
because variations in subthreshold leakage exceed the currents caused by the faults.

Manufacturability
Circuits can be optimized for manufacturability to increase their yield. This can be done in a number of different
ways.
Physical At the physical level (i.e., mask level), the yield, and hence manufacturability, can be improved by
reducing the effect of process defects. The rules for particular processes will frequently have guidelines for
improving yield. The following list is representative.

Increasingly, design tools are dealing with these kinds of optimizations automatically.

Redundancy
Redundant structures can be used to compensate for defective components on a chip. For example, memory
arrays are commonly built with extra rows. During manufacturing test, if one of the words is found to be
defective, the memory can be reconfigured to access the spare row instead. Laser-cut wires or electrically
programmable fuses can be used for configuration. Similarly, if the memory has many banks and one or more
are found to be defective, they can be disabled, possibly even under software control.
Power
Elevated power can cause failure due to excess current in wires, which in turn can cause metal migration
failures. In addition, high-power devices raise the die temperature, degrading device performance and, over time,
causing device parameter shifts. The method of dealing with this component of manufacturability is to minimize
power through design techniques. In addition, a suitable package and heat sink should be chosen to remove
excess heat.
Process Spread We have seen that process simulations can be carried out at different process corners. Monte
Carlo analysis can provide better modeling of process spread and can help with centering a design within the
process variations.

Yield Analysis When a chip has poor yield or will be manufactured in high volume, dies that fail
manufacturing test can be taken to a laboratory for yield analysis to locate the root cause of the failure. If
particular structures are determined to have caused many of the failures, the layout of the structures can be
redesigned. For example, during volume production ramp-up for a major microprocessor, the silicide over long
thin polysilicon lines was found to often crack and raise the wire resistance. This in turn led to slower-than-
expected operation for the cracked chips. The layout was modified to widen polysilicon wires or strap them
with metal wherever possible, boosting the yield at higher frequencies.
