Ec6601 Notes Rejinpaul Ii PDF
www.rejinpaul.com
Ms.K.Sangeethalakshmi
Assistant Professor
Department of ECE
sangeetha.lk@rmkcet.ac.in
CMOS VLSI Design 4th Ed.
TEXTBOOKS:
1. Weste and Harris: CMOS VLSI DESIGN (Third edition) Pearson
Education, 2005
2. Uyemura J.P: Introduction to VLSI circuits and systems, Wiley
2002.
Syllabus
A brief History
MOS transistor
Ideal I-V characteristics,
C-V characteristics
Non ideal IV effects
DC transfer characteristics
CMOS technologies
Layout design Rules
CMOS process enhancements,
Technology related CAD issues, Manufacturing issues.
[Figure: load-line analysis of the CMOS inverter. Idsn and |Idsp| are plotted against Vout (0 to VDD) for input voltages Vin0 through Vin5.]
DC TRANSFER CURVE
Transcribe points onto Vin vs. Vout plot
[Figure: DC transfer curve, Vout vs. Vin, built from the load-line intersections for Vin0 through Vin5 and divided into regions A through E; the Vin axis is marked 0, Vtn, VDD/2, VDD+Vtp, VDD.]
OPERATING REGIONS
Revisit transistor operating regions
[Figure: CMOS inverter with input Vin, output Vout, and supply VDD.]
BETA RATIO
If βp / βn ≠ 1, the switching point will move away from VDD/2
Such a gate is called a skewed gate
NOISE MARGINS
How much noise can a gate input see before it
does not recognize the input?
LOGIC LEVELS
To maximize noise margins, select logic levels at
unity gain point of DC transfer characteristic
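[Figure: inverter transfer characteristic for βp/βn > 1 with the unity-gain points marked; the Vin axis is labeled 0, Vtn, VIL, VIH, VDD-|Vtp|, VDD and the Vout axis shows VOL.]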
INTRODUCTION
Integrated circuits: many transistors on one chip.
Very Large Scale Integration (VLSI): bucketloads!
0: Introduction
Complementary Metal Oxide Semiconductor
Fast, cheap, low power transistors
Today: How to build your own simple CMOS chip
CMOS transistors
Building logic gates from transistors
Transistor layout and fabrication
Rest of the course: How to build a good CMOS
chip
NMOS TRANSISTOR
Four terminals: gate, source, drain, body
Gate – oxide – body stack looks like a capacitor
Gate and body are conductors
SiO2 (oxide) is a very good insulator
Called metal – oxide – semiconductor (MOS) capacitor, even though the gate is polysilicon
[Figure: nMOS cross-section with source, gate, and drain over two n+ regions in a p bulk Si body.]
NMOS OPERATION
Body is usually tied to ground (0 V)
When the gate is at a low voltage:
P-type body is at low voltage
Source-body and drain-body diodes are OFF
No current flows, transistor is OFF
[Figure: nMOS cross-section with the gate at 0; source (S) and drain (D) n+ regions in p bulk Si.]
When the gate is at a high voltage:
Negative charge attracted to body
Inverts a channel under gate to n-type
Now current can flow through n-type silicon from source through channel to drain, transistor is ON
[Figure: nMOS cross-section with the gate at 1; polysilicon gate over SiO2, source (S) and drain (D) n+ regions in p bulk Si.]
PMOS TRANSISTOR
Similar, but doping and voltages reversed
Body tied to high voltage (VDD)
Gate low: transistor ON
Gate high: transistor OFF
Bubble indicates inverted behavior
[Figure: pMOS cross-section with p+ source and drain regions in n bulk Si.]
TERMINAL VOLTAGES
Mode of operation depends on Vg, Vd, Vs
Vgs = Vg – Vs
Vgd = Vg – Vd
[Figure: transistor symbol with gate voltage Vg and the voltages Vgs and Vgd marked.]
NMOS CUTOFF
No channel
Ids ≈ 0
[Figure: nMOS cross-section in cutoff; n+ source and drain in the p-type body (b), no channel.]
NMOS LINEAR
Channel forms
Current flows from d to s
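Vgs > Vt
Vgs > Vgd > Vt
0 < Vds < Vgs-Vt
[Figure: nMOS cross-section in the linear region; gate (g), source (s), drain (d), and body (b), with a channel between the n+ regions carrying Ids.]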
NMOS SATURATION
Channel pinches off
Ids independent of Vds
Vgs > Vt
Vgd < Vt
Vds > Vgs-Vt
[Figure: nMOS cross-section in saturation; the channel pinches off near the drain (d) while Ids still flows.]
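The three nMOS operating regions above can be told apart from Vgs, Vds, and Vt alone. The sketch below is a minimal illustration, assuming Vds >= 0 and assigning the boundary cases (Vgs = Vt, Vds = Vgs - Vt) to cutoff and saturation respectively; the function name and the sample voltages are hypothetical, not taken from the notes.

```python
def nmos_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the nMOS operating region (assumes vds >= 0)."""
    if vgs <= vt:
        return "cutoff"        # no channel, Ids ~ 0
    if vds < vgs - vt:
        return "linear"        # channel forms, current flows from d to s
    return "saturation"        # channel pinches off, Ids independent of Vds


# Hypothetical example voltages with Vt = 0.4 V
for vgs, vds in [(0.3, 1.0), (1.0, 0.2), (1.0, 0.9)]:
    print(vgs, vds, nmos_region(vgs, vds, vt=0.4))
```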
I-V CHARACTERISTICS
In the linear region, Ids depends on:
How much charge is in the channel?
CHANNEL CHARGE
MOS structure looks like parallel plate capacitor while operating in inversion
Gate – oxide – channel
Qchannel = CV
$C = C_g = \varepsilon_{ox} W L / t_{ox} = C_{ox} W L$, where $C_{ox} = \varepsilon_{ox} / t_{ox}$
$V = V_{gc} - V_t = (V_{gs} - V_{ds}/2) - V_t$
[Figure: polysilicon gate of width W and length L over SiO2 gate oxide of thickness tox (good insulator, εox = 3.9 ε0); n+ source at Vs and n+ drain at Vd in the p-type body, with Vgs, Vgd, Vds, and the gate capacitance Cg marked.]
CAPACITANCE
Any two conductors separated by an insulator
have capacitance
GATE CAPACITANCE
Approximate channel as connected to source
$C_{gs} = \varepsilon_{ox} W L / t_{ox} = C_{ox} W L = C_{permicron} W$
[Figure: polysilicon gate of width W and length L over SiO2 gate oxide of thickness tox (good insulator, εox = 3.9 ε0), with n+ regions in the p-type body.]
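As a quick numeric check of the gate capacitance formula, the sketch below evaluates Cgs = εox W L / tox. The oxide thickness of 10 nm and the 1.2 μm by 0.6 μm gate are hypothetical example values; only the relative permittivity 3.9 comes from the notes.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def gate_capacitance(w_m: float, l_m: float, tox_m: float, k_ox: float = 3.9) -> float:
    """Return Cgs = eps_ox * W * L / tox in farads (all lengths in meters)."""
    cox = k_ox * EPS0 / tox_m      # capacitance per unit area, F/m^2
    return cox * w_m * l_m


# Hypothetical transistor: W = 1.2 um, L = 0.6 um, tox = 10 nm
c = gate_capacitance(1.2e-6, 0.6e-6, 10e-9)
print(f"{c * 1e15:.2f} fF")
```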
DIFFUSION CAPACITANCE
Csb, Cdb
Undesirable, called parasitic capacitance
CMOS FABRICATION
CMOS transistors are fabricated on silicon wafer
Lithography process similar to printing press
On each step, different materials are deposited or etched
Easiest to understand by viewing both top and cross-section of wafer in a simplified manufacturing process
INVERTER CROSS-SECTION
Typically use p-type substrate for nMOS transistors
Requires n-well for body of pMOS transistors
[Figure: inverter cross-section with input A, output Y, GND, and VDD; n+ diffusion, p+ diffusion, polysilicon, metal1, and SiO2 over an n well in a p substrate.]
Metal to lightly doped semiconductor forms a poor connection called a Schottky diode
Use heavily doped well and substrate contacts / taps
[Figure: inverter cross-section with a p+ substrate tap tied to GND and an n+ well tap tied to VDD.]
[Figure: inverter layout and mask views showing the n-well, polysilicon, n+ diffusion, p+ diffusion, contact, and metal layers.]
FABRICATION
Chips are built in huge factories called fabs
Contain clean rooms as large as football fields
FABRICATION STEPS
Start with blank wafer
Build inverter from the bottom up
First step will be to form the n-well
Cover wafer with protective layer of SiO2 (oxide)
Remove layer where n-well should be built
Implant or diffuse n dopants into exposed wafer
Strip off SiO2
[Figure: blank p substrate.]
OXIDATION
Grow SiO2 on top of Si wafer
900 – 1200 °C with H2O or O2 in oxidation furnace
[Figure: SiO2 layer on p substrate.]
PHOTORESIST
Spin on photoresist
Photoresist is a light-sensitive organic polymer
Softens where exposed to light
[Figure: photoresist over SiO2 on p substrate.]
LITHOGRAPHY
Expose photoresist through n-well mask
Strip off exposed photoresist
[Figure: photoresist with an opening over the n-well area, on SiO2 and p substrate.]
ETCH
Etch oxide with hydrofluoric acid (HF)
Seeps through skin and eats bone; nasty stuff!!!
Only attacks oxide where resist has been exposed
[Figure: oxide etched away under the opening in the photoresist, on p substrate.]
STRIP PHOTORESIST
Strip off remaining photoresist
Use mixture of acids called piranha etch
Necessary so resist doesn’t melt in next step
[Figure: patterned SiO2 on p substrate.]
N-WELL
Diffusion
Place wafer in furnace with arsenic gas
Heat until As atoms diffuse into exposed Si
Ion Implantation
Blast wafer with beam of As ions
Ions blocked by SiO2, only enter exposed Si
[Figure: n well formed in the substrate under the opening in the SiO2.]
STRIP OXIDE
Strip off the remaining oxide using HF
Back to bare wafer with n-well
Subsequent steps involve similar series of steps
[Figure: bare wafer with n well in p substrate.]
POLYSILICON
Deposit very thin layer of gate oxide
< 20 Å (6-7 atomic layers)
Chemical Vapor Deposition (CVD) of silicon layer
Place wafer in furnace with Silane gas (SiH4)
Forms many small crystals called polysilicon
Heavily doped to be good conductor
[Figure: polysilicon over thin gate oxide on the wafer with n well in p substrate.]
POLYSILICON PATTERNING
Use same lithography process to pattern polysilicon
[Figure: patterned polysilicon gates over thin gate oxide, n well in p substrate.]
SELF-ALIGNED PROCESS
Use oxide and masking to expose where n+ dopants should be diffused or implanted
N-diffusion forms nMOS source, drain, and n-well contact
[Figure: wafer with n well in p substrate, masked for n+ diffusion.]
N-DIFFUSION
Pattern oxide and form n+ regions
Self-aligned process where gate blocks diffusion
Polysilicon is better than metal for self-aligned gates because it doesn’t melt during later processing
[Figure: n+ diffusion openings, n well in p substrate.]
N-DIFFUSION CONT.
Historically dopants were diffused
Usually ion implantation today
But regions are still called diffusion
[Figure: three n+ regions formed, n well in p substrate.]
N-DIFFUSION CONT.
Strip off oxide to complete patterning step
[Figure: three n+ regions after oxide strip, n well in p substrate.]
P-DIFFUSION
Similar set of steps form p+ diffusion regions for pMOS source and drain and substrate contact
[Figure: p+ and n+ regions (p+ n+ n+ p+ p+ n+) in the n well and p substrate.]
CONTACTS
Now we need to wire together the devices
Cover chip with thick field oxide
Etch oxide where contact cuts are needed
[Figure: contact cuts through thick field oxide, n well in p substrate.]
METALIZATION
Sputter on aluminum over whole wafer
Pattern to remove excess metal, leaving wires
[Figure: metal wires over thick field oxide contacting the p+ and n+ regions, n well in p substrate.]
LAYOUT
Chips are specified with set of masks
Minimum dimensions of masks determine transistor size (and hence speed, cost, and power)
Feature size f = distance between source and drain
Set by minimum width of polysilicon
Feature size improves 30% every 3 years or so
Normalize for feature size when describing design rules
Express rules in terms of λ = f/2
E.g. λ = 0.3 μm in 0.6 μm process
INVERTER LAYOUT
Transistor dimensions specified as Width / Length
Minimum size is 4λ / 2λ, sometimes called 1 unit
In f = 0.6 μm process, this is 1.2 μm wide, 0.6 μm long
UNIT II
CIRCUIT CHARACTERIZATION AND
SIMULATION
Syllabus
Delay estimation
Logical effort and Transistor sizing,
Power dissipation
Interconnect
Design margin,
Reliability
Scaling
SPICE tutorial, Device models, Device characterization,
Circuit characterization, Interconnect simulation.
INTRODUCTION
Chip designers face a bewildering array of choices
What is the best circuit topology for a function?
How many stages of logic give least delay?
How wide should the transistors be?
EXAMPLE
Ben Bitdiddle is the memory designer for the Motoroil
68W86, an embedded automotive processor. Help Ben
[Figure: 4:16 decoder driven by true and complementary address inputs A[3:0], selecting one of 16 words in the register file.]
DELAY PLOTS
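d = f + p = gh + p
Inverter: g = 1, p = 1, so d = h + 1
2-input NAND: g = 4/3, p = 2, so d = (4/3)h + 2
What about NOR2?
[Figure: normalized delay d (0 to 6) vs. electrical effort h = Cout / Cin (0 to 5) for the inverter and the 2-input NAND; the slope gives the effort delay f and the intercept the parasitic delay p.]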
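The delay model d = gh + p is easy to tabulate. The sketch below is a minimal illustration that takes g and p from the catalog of gates in these notes (NAND: g = (n+2)/3, NOR: g = (2n+1)/3, p = n for both, inverter g = p = 1); the function names are hypothetical. It also answers the NOR2 question: d = (5/3)h + 2.

```python
from fractions import Fraction


def logical_effort(gate: str, n: int = 1) -> Fraction:
    """Logical effort g of an n-input gate, per the catalog of gates."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n + 2, 3)
    if gate == "nor":
        return Fraction(2 * n + 1, 3)
    raise ValueError(f"unknown gate type: {gate}")


def parasitic_delay(gate: str, n: int = 1) -> int:
    """Parasitic delay p in multiples of pinv (inverter 1, NAND/NOR n)."""
    return 1 if gate == "inv" else n


def stage_delay(gate: str, h, n: int = 1):
    """Normalized stage delay d = g*h + p."""
    return logical_effort(gate, n) * h + parasitic_delay(gate, n)


for gate, n in [("inv", 1), ("nand", 2), ("nor", 2)]:
    print(gate, n, [str(stage_delay(gate, h, n)) for h in range(0, 6)])
```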
6: Logical Effort
an inverter delivering the same output current.
Measure from delay vs. fanout plots
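[Figure: transistor widths for a unit inverter (pMOS 2, nMOS 1), a 2-input NAND (pMOS 2, nMOS 2), and a 2-input NOR (pMOS 4, nMOS 1), each with output Y.]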
CATALOG OF GATES
Logical effort of common gates
Gate type | 1 input | 2 inputs | 3 inputs | 4 inputs | n inputs
Inverter | 1 | | | |
NAND | | 4/3 | 5/3 | 6/3 | (n+2)/3
NOR | | 5/3 | 7/3 | 9/3 | (2n+1)/3
Tristate / mux | 2 | 2 | 2 | 2 | 2
XOR, XNOR | | 4, 4 | 6, 12, 6 | 8, 16, 16, 8 |
CATALOG OF GATES
Parasitic delay of common gates
In multiples of pinv (≈ 1)
Gate type | 1 input | 2 inputs | 3 inputs | 4 inputs | n inputs
Inverter | 1 | | | |
NAND | | 2 | 3 | 4 | n
NOR | | 2 | 3 | 4 | n
Tristate / mux | 2 | 4 | 6 | 8 | 2n
XOR, XNOR | | 4 | 6 | 8 |
N-stage ring oscillator:
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/(4N)
31 stage ring oscillator in 0.6 μm process has frequency of ~ 200 MHz
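The ring oscillator frequency above is in normalized units; multiplying the stage delay by the process time constant gives seconds. The sketch below assumes a hypothetical value tau = 40 ps, chosen only for illustration and not taken from the notes, and a hypothetical function name.

```python
def ring_oscillator_frequency(n_stages: int, tau_s: float,
                              g: float = 1.0, h: float = 1.0, p: float = 1.0) -> float:
    """Return fosc = 1 / (2 * N * d * tau), with stage delay d = g*h + p."""
    d = g * h + p                      # normalized stage delay (2 for an inverter ring)
    return 1.0 / (2 * n_stages * d * tau_s)


# 31-stage ring with an assumed (hypothetical) tau of 40 ps
f = ring_oscillator_frequency(31, 40e-12)
print(f"{f / 1e6:.1f} MHz")
```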
Fanout-of-4 (FO4) inverter:
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about 300 ps in 0.6 μm process and 15 ps in a 65 nm process
Path Electrical Effort $H = C_{out\text{-}path} / C_{in\text{-}path}$
Path Effort $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10, intermediate capacitances x, y, z, and load 20.]
g1 = 1, g2 = 5/3, g3 = 4/3, g4 = 1
h1 = x/10, h2 = y/x, h3 = z/y, h4 = 20/z
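[Figure: path that branches; an inverter with input capacitance 5 drives two inverters of input capacitance 15, each driving a load of 90.]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1g2h1h2 = 36 = 2GH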
BRANCHING EFFORT
Introduce branching effort
Accounts for branching between stages in path
$b = \dfrac{C_{on\ path} + C_{off\ path}}{C_{on\ path}}$
$B = \prod b_i$
Note: F = GBH
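The branching example can be checked numerically: multiplying the stage efforts directly gives 36, and so does G·B·H once the branching effort b = (Con path + Coff path) / Con path = 2 is included. The sketch below is a minimal check using the capacitances from the example; variable names are illustrative.

```python
from math import prod

# Capacitances from the branching example (arbitrary units)
c_in, c_branch, c_load = 5.0, 15.0, 90.0

# Stage-by-stage view: the first inverter sees both branches
g = [1.0, 1.0]
h = [(c_branch + c_branch) / c_in, c_load / c_branch]
F_stages = prod(gi * hi for gi, hi in zip(g, h))

# Path view: F = G * B * H
G = prod(g)
H = c_load / c_in
B = (c_branch + c_branch) / c_branch   # on-path plus off-path over on-path
F_path = G * B * H

print(F_stages, F_path)
```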
MULTISTAGE DELAYS
Path Effort Delay $D_F = \sum f_i$
Path Parasitic Delay $P = \sum p_i$
Path Delay $D = \sum d_i = D_F + P$
Delay is smallest when each stage bears same effort
$\hat{f} = g_i h_i = F^{1/N}$
GATE SIZES
How wide should the gates be for least delay?
$\hat{f} = gh = g \dfrac{C_{out}}{C_{in}}$
$\Rightarrow C_{in_i} = \dfrac{g_i C_{out_i}}{\hat{f}}$
Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
Check work by verifying input cap spec is met.
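[Figure: worked example of a three-stage path from input A (input capacitance 8) to output B (load 45) with unknown gate input capacitances x and y and branches that also drive loads of 45; the final build annotates the gates with transistor sizes P: 4 / N: 4, P: 4 / N: 6, and P: 12 / N: 3.]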
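Example: drive 64-bit datapath with unit inverter
$D = N F^{1/N} + P = N (64)^{1/N} + N$
[Figure: initial driver of size 1 driving a datapath load of 64 through N = 1 to 4 inverters; intermediate sizes are 8 for N = 2; 4, 16 for N = 3; 2.8, 8, 23 for N = 4.]
N | 1 | 2 | 3 | 4
f | 64 | 8 | 4 | 2.8
D | 65 | 18 | 15 | 15.3
Fastest: N = 3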
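The table above follows directly from D = N F^(1/N) + P with F = 64 and one unit of parasitic delay per inverter. The sketch below is a minimal reproduction; the function name is hypothetical.

```python
def inverter_chain_delay(path_effort: float, n_stages: int, p_inv: float = 1.0) -> float:
    """Return D = N * F**(1/N) + N * p_inv for a chain of N inverters."""
    stage_effort = path_effort ** (1.0 / n_stages)
    return n_stages * stage_effort + n_stages * p_inv


delays = {n: inverter_chain_delay(64, n) for n in range(1, 5)}
for n, d in delays.items():
    print(n, round(64 ** (1.0 / n), 1), round(d, 1))

best_n = min(delays, key=delays.get)
print("fastest:", best_n)
```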
DERIVATION
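Consider adding inverters to end of path
How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N - n1 extra inverters.]
$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$
$\dfrac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
Define best stage effort $\rho = F^{1/N}$
$p_{inv} + \rho (1 - \ln \rho) = 0$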
Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
For pinv = 1, solve numerically for ρ = 3.59
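The best stage effort is the root of pinv + ρ(1 - ln ρ) = 0. The sketch below solves it by bisection, a method chosen here for simplicity rather than taken from the notes; it relies on the left-hand side being decreasing for ρ > 1 and assumes the root lies between e and 20.

```python
import math


def best_stage_effort(p_inv: float) -> float:
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection on [e, 20]."""
    def residual(rho: float) -> float:
        return p_inv + rho * (1.0 - math.log(rho))

    lo, hi = math.e, 20.0              # residual(e) = p_inv >= 0, residual(20) < 0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if residual(mid) > 0:
            lo = mid                   # root is to the right
        else:
            hi = mid                   # root is at mid or to the left
    return (lo + hi) / 2.0


print(round(best_stage_effort(0.0), 3))
print(round(best_stage_effort(1.0), 2))
```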
SENSITIVITY ANALYSIS
How sensitive is delay to using exactly the best number of stages?
[Figure: D(N) / D(N̂) vs. N / N̂ from 0.5 to 2.0; the ratio is 1.51 at N / N̂ = 0.5 and 1.26 at N / N̂ = 2.0, and the stage efforts ρ = 6 and ρ = 2.4 are marked near the 1.15 level.]
EXAMPLE, REVISITED
Ben Bitdiddle is the memory designer for the Motoroil
68W86, an embedded automotive processor. Help Ben
[Figure: 4:16 decoder driven by true and complementary address inputs A[3:0], selecting one of 16 words in the register file.]
Decoder specifications:
16 word register file
Each word is 32 bits wide
Each bit presents load of 3 unit-sized transistors
True and complementary address inputs A[3:0]
Each input may drive 10 unit-sized transistors
Ben needs to decide:
How many stages to use?
How large should each gate be?
How fast can decoder operate?
NUMBER OF STAGES
Decoder effort is mainly electrical and branching
Electrical Effort: H = (32*3) / 10 = 9.6
Branching Effort: B = 8
Stage Effort: $\hat{f} = F^{1/3} = 5.36$
Path Delay: $D = 3\hat{f} + 1 + 4 + 1 = 22.1$
Gate sizes: z = 96*1/5.36 = 18, y = 18*2/5.36 = 6.7
[Figure: 3-stage decoder; the true and complementary address inputs A[3:0], each with input capacitance 10, drive gates of size y and z that produce word[0] through word[15].]
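The decoder numbers can be reproduced by working backward from the load. The sketch below assumes the 3-stage design is INV-NAND4-INV, so G = 1 · 6/3 · 1 = 2 and P = 1 + 4 + 1, which matches the INV-NAND4-INV row of the comparison in these notes; the helper name is hypothetical.

```python
def input_capacitance(g: float, c_out: float, stage_effort: float) -> float:
    """Capacitance transformation: Cin = g * Cout / f_hat."""
    return g * c_out / stage_effort


G = 1 * 2 * 1            # INV, NAND4 (g = 6/3), INV
B = 8                    # branching effort
H = (32 * 3) / 10        # electrical effort: load 96, input 10
F = G * B * H            # path effort
N = 3
f_hat = F ** (1.0 / N)   # best stage effort
P = 1 + 4 + 1            # parasitic delays of INV, NAND4, INV
D = N * f_hat + P

# Work backward from the word-line load of 32 * 3 = 96
z = input_capacitance(1, 96, f_hat)    # final inverter
y = input_capacitance(2, z, f_hat)     # NAND4

print(round(f_hat, 2), round(D, 1), round(z), round(y, 1))
```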
COMPARISON
Compare many alternatives with a spreadsheet
$D = N (76.8\, G)^{1/N} + P$
Design | N | G | P | D
NOR4 | 1 | 3 | 4 | 234
NAND4-INV | 2 | 2 | 5 | 29.8
NAND2-NOR2 | 2 | 20/9 | 4 | 30.1
INV-NAND4-INV | 3 | 2 | 6 | 22.1
NAND4-INV-INV-INV | 4 | 2 | 7 | 21.1
NAND2-NOR2-INV-INV | 4 | 20/9 | 6 | 20.5
NAND2-INV-NAND2-INV | 4 | 16/9 | 6 | 19.7
INV-NAND2-INV-NAND2-INV | 5 | 16/9 | 7 | 20.4
NAND2-INV-NAND2-INV-INV-INV | 6 | 16/9 | 8 | 21.6
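The spreadsheet comparison is a one-line formula per design. The sketch below recomputes D = N (76.8 G)^(1/N) + P for each row of the table, taking N, G, and P from the table itself; the list and function names are illustrative.

```python
from fractions import Fraction

# (design, N, G, P) as listed in the comparison table
DESIGNS = [
    ("NOR4", 1, Fraction(3), 4),
    ("NAND4-INV", 2, Fraction(2), 5),
    ("NAND2-NOR2", 2, Fraction(20, 9), 4),
    ("INV-NAND4-INV", 3, Fraction(2), 6),
    ("NAND4-INV-INV-INV", 4, Fraction(2), 7),
    ("NAND2-NOR2-INV-INV", 4, Fraction(20, 9), 6),
    ("NAND2-INV-NAND2-INV", 4, Fraction(16, 9), 6),
    ("INV-NAND2-INV-NAND2-INV", 5, Fraction(16, 9), 7),
    ("NAND2-INV-NAND2-INV-INV-INV", 6, Fraction(16, 9), 8),
]


def decoder_delay(n: int, g, p: float, bh: float = 76.8) -> float:
    """Return D = N * (B*H*G)**(1/N) + P, with B*H = 8 * 9.6 = 76.8."""
    return n * (bh * float(g)) ** (1.0 / n) + p


results = {name: decoder_delay(n, g, p) for name, n, g, p in DESIGNS}
for name, d in results.items():
    print(f"{name:30s} {d:6.1f}")

fastest = min(results, key=results.get)
print("fastest:", fastest)
```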
REVIEW OF DEFINITIONS
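Term | Stage | Path
number of stages | 1 | N
logical effort | g | $G = \prod g_i$
electrical effort | $h = C_{out} / C_{in}$ | $H = C_{out\text{-}path} / C_{in\text{-}path}$
branching effort | $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$ | $B = \prod b_i$
effort | f = gh | F = GBH
effort delay | f | $D_F = \sum f_i$
parasitic delay | p | $P = \sum p_i$
delay | d = f + p | $D = \sum d_i = D_F + P$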
3) Sketch path with N stages
6) Find gate sizes: $C_{in_i} = \dfrac{g_i C_{out_i}}{\hat{f}}$
SUMMARY
Logical effort is useful for thinking of delay in circuits
Numeric logical effort characterizes gates
NANDs are faster than NORs in CMOS
Paths are fastest when effort delays are ~4
Path delay is weakly sensitive to stages, sizes
But using fewer stages doesn’t mean faster paths
Delay of path is about $\log_4 F$ FO4 inverter delays
Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
But requires practice to master
POWER DISSIPATION
7: Power
Ptotal = Pdynamic + Pstatic
Dynamic power: Pdynamic = Pswitching + Pshortcircuit
Switching load capacitances
Short-circuit current
Static power: Pstatic = (Isub + Igate + Ijunct + Icontention) VDD
Subthreshold leakage
Gate leakage
Junction leakage
Contention current
When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
Leads to a blip of “short circuit” current.
$P_{switching} = \alpha C V_{DD}^2 f$
Try to minimize:
Activity factor
Capacitance
Supply voltage
Frequency
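Each of the four factors in the list above enters the switching power formula directly, and the supply voltage enters squared. The sketch below is a minimal illustration; the activity factor, capacitance, voltage, and frequency are hypothetical example values.

```python
def switching_power(alpha: float, c_farads: float, vdd: float, f_hz: float) -> float:
    """Return Pswitching = alpha * C * VDD**2 * f in watts."""
    return alpha * c_farads * vdd ** 2 * f_hz


# Hypothetical chip: activity factor 0.1, 1 nF switched capacitance, 1 GHz
p_full = switching_power(0.1, 1e-9, 1.0, 1e9)
p_half = switching_power(0.1, 1e-9, 0.5, 1e9)   # halving VDD quarters the power
print(p_full, p_half)
```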
Leakage and delay trade off
Aim for low leakage in sleep and low delay in active
mode
To reduce leakage:
Increase Vt: multiple Vt
Use low Vt only in critical circuits
Increase Vs: stack effect
Input vector control in sleep
Decrease Vb
Reverse body bias in sleep
Or forward body bias in active mode
UNIT III
COMBINATIONAL AND SEQUENTIAL
CIRCUIT DESIGN
Syllabus
Circuit families
Low power logic design
comparison of circuit families
Sequencing static circuits
circuit design of latches and flip flops
Static sequencing element methodology
sequencing dynamic circuits
synchronizers
COMBINATIONAL CIRCUIT
What makes a circuit fast?
◦ I = C dV/dt -> tpd ∝ (C/I) ΔV
◦ low capacitance
◦ high current
◦ small swing
Logical effort is proportional to C/I
pMOS are the enemy!
◦ High capacitance for a given current
Can we take the pMOS capacitance off the input?
Various circuit families try to do this…
[Figure: 2-input NOR gate with inputs A and B, pMOS widths 4 and nMOS widths 1, output Y.]
Pseudo-nMOS
In the old days, nMOS processes had no pMOS
◦ Instead, use pull-up transistor that is always ON
In CMOS, use a pMOS that is always ON
◦ Ratio issue
◦ Make pMOS about ¼ effective strength of pulldown network
[Figure: pseudo-nMOS inverter with load pMOS of size P/2 and pulldown of size 16/2; Vout vs. Vin from 0 to 1.8 V, and Ids, for P = 4, P = 14, and P = 24.]
Pseudo-nMOS Gates
Design for unit current on output to compare with unit inverter.
pMOS fights nMOS
[Figure: generic pseudo-nMOS gate (inputs, pulldown network f, output Y) and pseudo-nMOS inverter, NAND2, and NOR2 annotated with gu, gd, gavg, pu, pd, and pavg; the numeric values did not survive in this copy.]
Pseudo-nMOS Design
Ex: Design a k-input AND gate using pseudo-nMOS. Estimate the delay driving a fanout of H
G = 1 * 8/9 = 8/9
F = GBH = 8H/9
P = 1 + (4+8k)/9 = (8k+13)/9
N = 2
$D = N F^{1/N} + P = \dfrac{4\sqrt{2H}}{3} + \dfrac{8k+13}{9}$
[Figure: inverters on inputs In1 through Ink driving a k-input pseudo-nMOS gate with output Y and load H.]
Pseudo-nMOS Power
Pseudo-nMOS draws power whenever Y = 0
◦ Called static power P = IDDVDD
◦ A few mA / gate * 1M gates would be a problem
◦ Explains why nMOS went extinct
Use pseudo-nMOS sparingly for wide NORs
Turn off pMOS when not in use
[Figure: pseudo-nMOS NOR with inputs A, B, C, output Y, and the pMOS load gated by en.]
Dynamic Logic
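Dynamic gates use a clocked pMOS pullup
Two modes: precharge and evaluate
[Figure: static inverter (pMOS 2, nMOS 1), pseudo-nMOS inverter (pMOS 2/3, nMOS 4/3), and dynamic inverter (clocked pMOS 1, nMOS 1), each with input A and output Y.]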
The Foot
What if pulldown network is ON during precharge?
Use series evaluation transistor to prevent fight.
[Figure: dynamic gate with precharge transistor, output Y, inputs to pulldown network f, and a foot transistor; footed and unfooted versions shown.]
Logical Effort
Gate | Inverter | NAND2 | NOR2
unfooted | gd = 1/3, pd = 2/3 | gd = 2/3, pd = 3/3 | gd = 1/3, pd = 3/3
footed | gd = 2/3, pd = 3/3 | gd = 3/3, pd = 4/3 | gd = 2/3, pd = 5/3
[Figure: unfooted and footed dynamic inverter, NAND2, and NOR2 with precharge pMOS of width 1 and the corresponding nMOS widths.]
Monotonicity
Dynamic gates require monotonically rising
inputs during evaluation
◦ 0 -> 0
◦ 0 -> 1
◦ 1 -> 1
◦ But not 1 -> 0 (violates monotonicity during evaluation)
Monotonicity Woes
But dynamic gates produce monotonically falling outputs during evaluation
Illegal for one dynamic gate to drive another!
Domino Gates
Follow dynamic stage with inverting static
gate
◦ Dynamic / static pair is called domino gate
◦ Produces monotonic outputs
[Figure: domino AND gate built from a dynamic NAND followed by a static inverter; chain of nodes W, X, Y, Z with inputs A, B, C, and waveforms over the precharge, evaluate, precharge phases.]
Domino Optimizations
Each domino gate triggers next one, like a
string of dominos toppling over
Gates evaluate sequentially but precharge
in parallel
Thus evaluation is more critical than
precharge
[Figure: domino multiplexer with select inputs S0 through S7 and data inputs D0 through D7.]
Dual-Rail Domino
Domino only performs noninverting
functions:
◦ AND, OR but not NAND, NOR, or XOR
Dual-rail domino solves this problem
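◦ Takes true and complementary inputs
◦ Produces true and complementary outputs
sig_h | sig_l | Meaning
0 | 0 | Precharged
0 | 1 | ‘0’
1 | 0 | ‘1’
1 | 1 | invalid
[Figure: dual-rail domino gate with outputs Y_h and Y_l driven by pulldown network f and its complement.]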
Example: AND/NAND
Given A_h, A_l, B_h, B_l
Compute Y_h = AB, Y_l = NOT(AB)
Pulldown networks are conduction complements
[Figure: dual-rail AND/NAND; Y_h = A*B uses A_h in series with B_h, and Y_l = NOT(A*B) uses A_l in parallel with B_l.]
Example: XOR/XNOR
Sometimes possible to share transistors
[Figure: dual-rail XOR/XNOR with Y_h = A xor B and Y_l = A xnor B; the A_h and A_l transistors sit above shared B_l and B_h transistors.]
Leakage
Dynamic node floats high during evaluation
◦ Transistors are leaky (IOFF ≠ 0)
◦ Dynamic value will leak away over time
◦ Formerly milliseconds, now nanoseconds
Use keeper to hold dynamic node
◦ Must be weak enough not to fight evaluation
[Figure: dynamic gate with a weak keeper pMOS on dynamic node X and an output inverter driving Y.]
Charge Sharing
Dynamic gates suffer from charge sharing
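[Figure: dynamic gate with inputs A and B = 0, output node Y with capacitance CY, and internal node x with capacitance Cx; the waveform of Y shows charge sharing noise.]
$V_x = V_Y = \dfrac{C_Y}{C_x + C_Y} V_{DD}$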
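The charge sharing result is a capacitive divider: the charge CY·VDD on the output redistributes over Cx + CY. The sketch below is a minimal illustration, assuming the internal node x starts fully discharged; the capacitance values are hypothetical.

```python
def charge_sharing_voltage(c_x: float, c_y: float, vdd: float) -> float:
    """Final voltage on x and Y after sharing: CY / (Cx + CY) * VDD."""
    return c_y / (c_x + c_y) * vdd


# Hypothetical capacitances in arbitrary units, VDD = 1.0 V
v = charge_sharing_voltage(c_x=1.0, c_y=3.0, vdd=1.0)
print(v)   # the output droops from 1.0 V to this value
```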
Secondary Precharge
Solution: add secondary precharge
transistors
◦ Typically need to precharge every other node
Big load capacitance CY helps as well
[Figure: dynamic gate with inputs A and B and a secondary precharge transistor on internal node x.]
Noise Sensitivity
Dynamic gates are very sensitive to noise
◦ Inputs: VIH ≈ Vtn
◦ Outputs: floating output susceptible to noise
Noise sources
◦ Capacitive crosstalk
◦ Charge sharing
◦ Power supply noise
◦ Feedthrough noise
◦ And more!
Power
Domino gates have high activity factors
◦ Output evaluates and precharges
If output probability = 0.5, α = 0.5
Output rises and falls on half the cycles
◦ Clocked transistors have α = 1
Leads to very high power consumption
Domino Summary
Domino logic is attractive for high-speed
circuits
◦ 1.3 – 2x faster than static CMOS
◦ But many challenges:
Monotonicity, leakage, charge sharing, noise
Widely used in high-performance
microprocessors in 1990s when speed was
king
Largely displaced by static CMOS now that
power is the limiter
Still used in memories for area efficiency
LEAP
LEAn integration with Pass transistors
Get rid of pMOS transistors
◦ Use weak pMOS feedback to pull fully high
◦ Ratio constraint
[Figure: LEAP multiplexer; inputs A and B selected by S and its complement onto node L, restored to output Y.]
CPL
Complementary Pass-transistor Logic
◦ Dual-rail form of pass transistor logic
◦ Avoids need for ratioed feedback
◦ Optional cross-coupling for rail-to-rail swing
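[Figure: CPL multiplexer in dual-rail form; two copies of the pass-transistor network with inputs A and B, select S, internal node L, and output Y.]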
STATIC CMOS
Bubble Pushing
Compound Gates
Logical Effort Example
Input Ordering
Asymmetric Gates
Skewed Gates
Best P/N ratio
Bubble Pushing
Start with network of AND / OR gates
Convert to NAND / NOR + inverters
Push bubbles around to simplify logic
◦ Remember DeMorgan’s Law
[Figure: four versions (a) through (d) of the same logic with output Y, showing the bubble-pushing steps.]
Compound Gates
Logical Effort of compound gates
[Figure: transistor sizes for a unit inverter, AOI21, AOI22, and a complex AOI gate, each with output Y.]
Input Order
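Our parasitic delay model was too simple
◦ Calculate parasitic delay for Y falling
If A arrives latest? 2τ
If B arrives latest? 2.33τ
[Figure: 2-input NAND with transistor widths 2; output node Y has capacitance 6C and the internal node x has capacitance 2C.]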
Asymmetric Gates
Asymmetric gates favor one input over another
Ex: suppose input A of a NAND gate is most critical
◦ Use smaller transistor on A (less capacitance)
◦ Boost size of noncritical input
◦ So total resistance is same
gA = 10/9
gB = 2
gtotal = gA + gB = 28/9
[Figure: NAND gate with inputs A and reset; pMOS widths 2 and 2, nMOS width 4/3 on A and 4 on reset.]
Symmetric Gates
Inputs can be made perfectly symmetric
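[Figure: symmetric 2-input NAND; pMOS widths 2 and 2, with two parallel series nMOS stacks of width 1 in which A and B swap positions.]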
Skewed Gates
Skewed gates favor one edge over another
Ex: suppose rising output of inverter is most
critical
◦ Downsize noncritical nMOS transistor
[Figure: HI-skew inverter (pMOS 2, nMOS 1/2) compared with an unskewed inverter of equal rise resistance (pMOS 2, nMOS 1) and an unskewed inverter of equal fall resistance (pMOS 1, nMOS 1/2).]
Catalog of skewed gates (logical effort for rising output gu, falling output gd, and their average gavg):
Gate | Inverter | NAND2 | NOR2
unskewed | gu = 1, gd = 1, gavg = 1 | gu = 4/3, gd = 4/3, gavg = 4/3 | gu = 5/3, gd = 5/3, gavg = 5/3
HI-skew | gu = 5/6, gd = 5/3, gavg = 5/4 | gu = 1, gd = 2, gavg = 3/2 | gu = 3/2, gd = 3, gavg = 9/4
LO-skew | gu = 4/3, gd = 2/3, gavg = 1 | gu = 2, gd = 1, gavg = 3/2 | gu = 2, gd = 1, gavg = 3/2
Asymmetric Skew
Combine asymmetric and skewed gates
◦ Downsize noncritical transistor on
unimportant input
◦ Reduces parasitic delay for critical input
[Figure: asymmetric skewed NAND with inputs A and reset; pMOS widths 1 and 2, nMOS width 4/3 on A and 4 on reset.]
P/N Ratios
In general, best P/N ratio is sqrt of equal
delay ratio.
◦ Only improves average delay slightly for
inverters
◦ But significantly decreases area and power
[Figure: inverter, NAND2, and NOR2 sized with the fastest P/N ratio (inverter pMOS 1.414, nMOS 1), annotated with gu, gd, and gavg; the numeric values did not survive in this copy.]
Observations
For speed:
◦ NAND vs. NOR
◦ Many simple stages vs. fewer high fan-in stages
◦ Latest-arriving input
For area and power:
◦ Many simple stages vs. fewer high fan-in stages
Sequencing
Combinational logic
◦ output depends on current inputs
Sequential logic
◦ output depends on current and previous inputs
◦ Requires separating previous, current, future
◦ Called state or tokens
◦ Ex: FSM, pipeline
[Figure: pipeline from in to out with blocks of combinational logic (CL) separated by sequencing elements clocked by clk.]
Sequencing Cont.
If tokens moved through pipeline at constant
speed, no sequencing elements would be
necessary
Ex: fiber-optic cable
◦ Light pulses (tokens) are sent down cable
◦ Next pulse sent before first reaches end of cable
◦ No need for hardware to separate pulses
◦ But dispersion sets min time between pulses
This is called wave pipelining in circuits
In most circuits, dispersion is high
◦ Delay fast tokens so they don’t catch slow ones.
Sequencing Overhead
Use flip-flops to delay fast tokens so they
move through exactly one stage each cycle.
Inevitably adds some delay to the slow
tokens
Makes circuit slower than just the logic delay
◦ Called sequencing overhead
Some people call this clocking overhead
◦ But it applies to asynchronous circuits too
◦ Inevitable side effect of maintaining sequence
Sequencing Elements
Latch: Level sensitive
◦ a.k.a. transparent latch, D latch
Flip-flop: edge triggered
◦ a.k.a. master-slave flip-flop, D flip-flop, D register
Timing Diagrams
◦ Transparent
◦ Opaque
◦ Edge-trigger
[Figure: latch and flop symbols with inputs D and clk and output Q; timing diagram of clk, D, Q (latch), and Q (flop).]
Latch Design
Pass Transistor Latch
Pros
+ Tiny
+ Low clock load
Cons
◦ Vt drop
◦ nonrestoring
◦ backdriving
◦ output noise sensitivity
◦ dynamic
◦ diffusion input
Used in 1970’s
Latch Design
Transmission gate
+ No Vt drop
- Requires inverted clock
Latch Design
Inverting buffer
+ Restoring
+ No backdriving
+ Fixes either
Output noise sensitivity
Or diffusion input
◦ Inverted output
Latch Design
Tristate feedback
+ Static
◦ Backdriving risk
Static latches are now essential because of leakage
Latch Design
Buffered input
+ Fixes diffusion input
+ Noninverting
Latch Design
Buffered output
+ No backdriving
Widely used in standard cells
+ Very robust (most important)
- Rather large
- Rather slow (1.5 – 2 FO4 delays)
- High clock loading
Latch Design
Datapath latch
+ smaller
+ faster
- unbuffered input
Flip-Flop Design
Flip-flop is built as pair of back-to-back
latches
[Figure: flip-flop built from two latches in series, with input D, internal node X, and output Q.]
Enable
Enable: ignore clock when en = 0
◦ Mux: increase latch D-Q delay
◦ Clock Gating: increase en setup time, skew
[Figure: enabled latch and enabled flop, each shown as a symbol, a multiplexer design (a mux selects D when en = 1 and Q when en = 0), and a clock gating design (clock gated by en).]
Reset
Force output low when reset asserted
Synchronous vs. asynchronous
[Figure: symbols for a latch and a flop with a reset input, followed by circuit designs with synchronous reset and with asynchronous reset.]
Set / Reset
Set forces output high when enabled
Sequencing Methods
Flip-Flops
2-Phase Latches (clocks φ1, φ2)
Pulsed Latches (pulse clock φp with pulse width tpw)
[Figure: one cycle Tc of combinational logic between two flops; the same logic split into two half-cycles between latches clocked by φ1, φ2, φ1; and the logic between two latches clocked by pulses φp.]
Timing Diagrams
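Contamination and Propagation Delays
tpd: Logic Prop. Delay
tcd: Logic Cont. Delay
tpcq: Latch/Flop Clk->Q Prop. Delay
tccq: Latch/Flop Clk->Q Cont. Delay
tpdq: Latch D->Q Prop. Delay
tcdq: Latch D->Q Cont. Delay
tsetup: Latch/Flop Setup Time
thold: Latch/Flop Hold Time
[Figure: timing diagrams for combinational logic (A to Y), a flop (D to Q), and a latch (D to Q) marking these delays.]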
Max-Delay: Flip-Flops
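$t_{pd} \le T_c - (t_{setup} + t_{pcq})$, where $t_{setup} + t_{pcq}$ is the sequencing overhead
[Figure: combinational logic between flops F1 and F2; timing diagram of clk, Q1, and D2 over one cycle Tc showing tpcq, tpd, and tsetup.]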
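[Figure: max-delay timing for 2-phase latches (latches L1, L2, L3 around Logic 1 and Logic 2, with tpdq1, tpd1, tpdq2, tpd2 inside Tc) and for pulsed latches (latches L1 and L2 around combinational logic, with tpcq, tpd, tsetup, and pulse width tpw, including case (b) tpw < tsetup); the sequencing overhead is marked in both.]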
Min-Delay: Flip-Flops
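[Figure: combinational logic CL between flops F1 and F2; timing diagram of clk, Q1, and D2 showing tccq, tcd, and thold.]
With 2-phase latches, hold time is reduced by the nonoverlap time tnonoverlap
Paradox: hold applies twice each cycle, vs. only once for flops.
With pulsed latches, hold time is increased by the pulse width tpw
[Figure: the same min-delay timing for latches L1 and L2 around CL, clocked by φ1 and φ2 and by pulses φp.]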
Time Borrowing
In a flop-based system:
◦ Data launches on one rising edge
◦ Must setup before next rising edge
◦ If it arrives late, system fails
◦ If it arrives early, time is wasted
◦ Flops have hard edges
In a latch-based system
◦ Data can pass through latch while transparent
◦ Long cycle of logic can borrow time into next
◦ As long as each loop completes in one cycle
[Figure: latches clocked by φ1, φ2, φ1 with combinational logic between them.]
Loops may borrow time internally but must complete within the cycle
2-Phase Latches: $t_{borrow} \le \dfrac{T_c}{2} - (t_{setup} + t_{nonoverlap})$
Pulsed Latches: $t_{borrow} \le t_{pw} - t_{setup}$
[Figure: latches L1 and L2 around Combinational Logic 1; timing diagram showing the nominal half-cycle 1 delay and the time tborrow borrowed before D2 must settle.]
Clock Skew
We have assumed zero clock skew
Clocks really have uncertainty in arrival
time
◦ Decreases maximum propagation delay
◦ Increases minimum contamination delay
◦ Decreases time borrowing
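The effect of skew on a flip-flop path can be sketched with two bounds. The zero-skew max-delay constraint tpd ≤ Tc - (tsetup + tpcq) is from these notes; the skew terms below follow the standard textbook form (skew is subtracted from the max-delay budget and added to the min-delay requirement) and are an assumption here, since the skew equations did not survive in this copy. All numbers are hypothetical, in picoseconds.

```python
def max_logic_delay(tc: int, t_pcq: int, t_setup: int, t_skew: int = 0) -> int:
    """Largest allowed propagation delay: tpd <= Tc - (tpcq + tsetup + tskew)."""
    return tc - (t_pcq + t_setup + t_skew)


def min_logic_delay(t_hold: int, t_ccq: int, t_skew: int = 0) -> int:
    """Smallest allowed contamination delay: tcd >= thold - tccq + tskew."""
    return t_hold - t_ccq + t_skew


# Hypothetical flop and clock parameters in ps
print(max_logic_delay(tc=1000, t_pcq=80, t_setup=50))             # no skew
print(max_logic_delay(tc=1000, t_pcq=80, t_setup=50, t_skew=50))  # with skew
print(min_logic_delay(t_hold=20, t_ccq=30))                       # no skew
print(min_logic_delay(t_hold=20, t_ccq=30, t_skew=50))            # with skew
```

A negative minimum means any contamination delay satisfies the hold constraint; with skew the bound becomes positive, so the path needs some logic or buffering.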
Skew: Flip-Flops
[Figure: combinational logic between flops F1 and F2 with clock skew tskew; the max-delay diagram shows the sequencing overhead within Tc, and the min-delay diagram shows tccq, tcd, and thold.]
Skew: Latches
2-Phase Latches
$t_{pd} \le T_c - 2 t_{pdq}$, where $2 t_{pdq}$ is the sequencing overhead
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1 around Combinational Logic 1 and Combinational Logic 2.]
Two-Phase Clocking
If setup times are violated, reduce clock speed
If hold times are violated, chip fails at any speed
In this class, working chips are most important
◦ No tools to analyze clock skew
An easy way to guarantee hold times is to use 2-phase latches with big nonoverlap times
Call these clocks φ1, φ2 (ph1, ph2)
Safe Flip-Flop
Past years used flip-flop with
nonoverlapping clocks
◦ Slow – nonoverlap adds to setup time
◦ But no hold times
In industry, use a better timing analyzer
◦ Add buffers to slow signals if hold time is at risk
[Figure: flip-flop built from two latches on nonoverlapping clocks, with input D, internal node X, and output Q.]
Adaptive Sequencing
Designers include timing margin
◦ Voltage
◦ Temperature
◦ Process variation
◦ Data dependency
[Figure: sequencing element clocked by φp with input D, internal node X, output Q, and an error output ERR.]
Summary
Flip-Flops:
◦ Very easy to use, supported by all tools
2-Phase Transparent Latches:
◦ Lots of skew tolerance and time borrowing
Pulsed Latches:
◦ Fast, some skew tol & borrow, hold time risk
UNIT IV
CMOS TESTING
SYLLABUS
Need for testing
Testers, Test fixtures and test programs
Logic verification
Silicon debug principles
Manufacturing test
Design for testability
Boundary scan
TESTING
Testing is one of the most expensive parts of
chips
Logic verification accounts for > 50% of design effort
for many chips
Debug time after fabrication has enormous
opportunity cost
Shipping defective parts can sink a company
LOGIC VERIFICATION
Does the chip simulate correctly?
Usually done at HDL level
Verification engineers write test bench for HDL
Can’t test all cases
Look for corner cases
SILICON DEBUG
Test the first chips back from fabrication
If you are lucky, they work the first time
If not…
Logic bugs vs. electrical failures
Most chip failures are logic bugs from inadequate
simulation
Some are electrical failures
Crosstalk
Ratio failures
SHMOO PLOTS
How to diagnose failures?
Hard to access chips
Picoprobes
Electron beam
Built-in self-test
Shmoo plots
Vary voltage, frequency
Look for cause of
electrical failures
MANUFACTURING TEST
A speck of dust on a wafer is sufficient to kill chip
Yield of any chip is < 100%
Must test chips after manufacturing before delivery
to customers to only ship good parts
Manufacturing testers are
very expensive
Minimize time on tester
Careful selection of
test vectors
MANUFACTURING FAILURES
STUCK-AT FAULTS
How does a chip fail?
Usually failures are shorts between two conductors
or opens in a conductor
This can cause very complicated behavior
A simpler model: Stuck-At
Assume all failures cause nodes to be “stuck-at” 0 or
1, i.e. shorted to GND or VDD
Not quite true, but works well in practice
EXAMPLES
TEST EXAMPLE
[Figure: logic circuit with inputs A3, A2, A1, A0, internal nodes n1, n2, n3 and output Y]
Node  SA1     SA0
A3    {0110}  {1110}
A2    {1010}  {1110}
A1    {0100}  {0110}
A0    {0110}  {0111}
n1    {1110}  {0110}
n2    {0110}  {0100}
n3    {0101}  {0110}
Y     {0110}  {1110}
Minimum set: {0100, 0101, 0110, 0111, 1010, 1110}
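To make the stuck-at table concrete, the sketch below injects two of the listed faults in simulation and checks that the listed vector changes the output. The gate structure (NAND, NOT, NOR, NAND) and the vector bit order {A3 A2 A1 A0} are assumptions inferred from the test vectors, since the schematic itself is not reproduced here; the module name is hypothetical.

```verilog
// Stuck-at fault demo. The circuit below is an ASSUMED structure that is
// consistent with the vectors in the table, not a confirmed schematic.
module stuck_at_demo;
  reg  [3:0] a;             // a = {A3, A2, A1, A0} (assumed bit order)
  wire n1, n2, n3, y;
  reg  good;                // fault-free response for the current vector

  nand g1 (n1, a[3], a[2]);
  not  g2 (n2, a[1]);
  nor  g3 (n3, n2, a[0]);
  nand g4 (y,  n1, n3);

  initial begin
    a = 4'b0110;            // vector listed for n1 SA0
    #1 good = y;            // sample the fault-free output
    force n1 = 1'b0;        // inject n1 stuck-at-0
    #1 if (y !== good) $display("0110 detects n1 SA0: good=%b faulty=%b", good, y);
       else            $display("0110 misses n1 SA0");
    release n1;

    a = 4'b0101;            // vector listed for n3 SA1
    #1 good = y;
    force n3 = 1'b1;        // inject n3 stuck-at-1
    #1 if (y !== good) $display("0101 detects n3 SA1: good=%b faulty=%b", good, y);
       else            $display("0101 misses n3 SA1");
    release n3;
  end
endmodule
```

A vector detects a fault only when the faulty and fault-free outputs differ, which is why each check compares against a response sampled before the fault is forced.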
SCAN
Convert each flip-flop to a scan register
◦ Only costs one extra multiplexer
Normal mode: flip-flops behave as usual
Scan mode: flip-flops behave as shift register
Contents of flops can be scanned out and new values scanned in
[Figure: scan flop with inputs CLK, SCAN, SI, D and output Q; flops chained from scan-in to scan-out around logic clouds between the inputs and outputs]
SCANNABLE FLIP-FLOPS
[Figure: scannable flip-flop implementations (a), (b) and (c), built from inputs D, SI and SCAN with output Q]
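A behavioral sketch of a multiplexer-based scannable flip-flop, and of three such flops chained into a scan register, is given below. Port names follow the slide labels (D, SI, SCAN, Q, CLK); the module names, the SO output and the chain length of three are assumptions made for illustration.

```verilog
// Mux-D scan flip-flop (behavioral sketch, module name assumed).
module scan_flop(Q, D, SI, SCAN, CLK);
  output Q;
  input  D, SI, SCAN, CLK;
  reg    Q;
  always @(posedge CLK)
    Q <= SCAN ? SI : D;   // scan mode shifts in SI; normal mode loads D
endmodule

// Three flops chained into a scan register (chain length is arbitrary here).
module scan_chain3(Q, SO, D, SI, SCAN, CLK);
  output [2:0] Q;
  output SO;
  input  [2:0] D;
  input  SI, SCAN, CLK;
  scan_flop f0(Q[0], D[0], SI,   SCAN, CLK);
  scan_flop f1(Q[1], D[1], Q[0], SCAN, CLK);
  scan_flop f2(Q[2], D[2], Q[1], SCAN, CLK);
  assign SO = Q[2];       // scan-out is the last flop in the chain
endmodule
```

With SCAN = 1, three clock edges shift a new 3-bit pattern in through SI while the old contents appear on SO; with SCAN = 0 the register behaves as an ordinary 3-bit register.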
ATPG
Test pattern generation is tedious
Automatic Test Pattern Generation (ATPG) tools
produce a good set of vectors for each block of
combinational logic
Scan chains are used to control and observe the
blocks
Complete coverage requires a large number of
vectors, raising the cost of test
Most products settle for covering 90+% of
potential stuck-at faults
BUILT-IN SELF-TEST
Built-in self-test lets blocks test themselves
Generate pseudo-random inputs to comb. logic
Combine outputs into a syndrome
With high probability, block is fault-free if it
produces the expected syndrome
PRSG
Linear Feedback Shift Register
◦ Shift register with input taken from XOR of state
◦ Pseudo-Random Sequence Generator
[Figure: three flops Q[0], Q[1], Q[2] clocked by CLK, with XOR feedback and output Y]
Flops reset to 111
Step  State
0     111
1     110
2     101
3     010
4     100
5     001
6     011
7     111 (repeats)
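The sequence in the table can be reproduced with a small behavioral model. This is a sketch under an assumption: the feedback taps (XOR of bits 2 and 0, shifted in at bit 0) were chosen because they generate the listed states, since the exact wiring of the figure is not recoverable here. Module and signal names are hypothetical.

```verilog
// 3-bit LFSR sketch; tap positions are an assumption chosen to match the table.
module lfsr3(q, clk, reset);
  output [2:0] q;
  input  clk, reset;
  reg    [2:0] q;
  always @(posedge clk)
    if (reset) q <= 3'b111;                 // flops reset to 111
    else       q <= {q[1:0], q[2] ^ q[0]};  // shift left, XOR feedback into bit 0
endmodule

module lfsr3_tb;
  reg clk, reset;
  wire [2:0] q;
  integer step;
  lfsr3 dut(q, clk, reset);
  always #5 clk = ~clk;
  initial begin
    clk = 0; reset = 1;
    @(posedge clk); #1 reset = 0;     // q is now 111
    for (step = 0; step <= 7; step = step + 1) begin
      $display("%0d %b", step, q);    // prints the step number and the state
      @(posedge clk); #1;             // advance one step and let q settle
    end
    $finish;
  end
endmodule
```

The all-zero state is never reached: it would feed back 0 forever, which is why a 3-bit LFSR cycles through 7 states rather than 8.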
BILBO
Built-in Logic Block Observer
Combine scan with PRSG & signature analysis
[Figure: 3-bit BILBO register with data inputs D[0], D[1], D[2], mode controls C[0], C[1], scan input SI, and outputs Q[0], Q[1], Q[2] / SO]
BOUNDARY SCAN
Testing boards is also difficult
Need to verify solder joints are good
Drive a pin to 0, then to 1
Check that all connected pins get the values
[Figure: board with four chips, CHIP A, CHIP B, CHIP C and CHIP D, linked for boundary scan]
SUMMARY
Think about testing from the beginning
Simulate as you go
Plan for test after fabrication
VERILOG HDL
VERILOG FUNDAMENTALS
WHAT IS VERILOG
Developed in 1984
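[Figure: design flow from hardware and software specs to ASIC, FPGA, PLD and standard parts, then to boards and systems with software]
[Figure: levels of description: behavioral, gate, layout (VLSI)]
[Figure: language concepts: concurrency, structure, procedural statements, time]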
USER IDENTIFIERS
Formed from {[A-Z], [a-z], [0-9], _, $}, but ..
.. can’t begin with $ or [0-9]
myidentifier // valid
m_y_identifier // valid
3my_identifier // invalid: begins with a digit
$my_identifier // invalid: begins with $
_myidentifier$ // valid
Case sensitivity
myid ≠ Myid
COMMENTS
// Single line comment
/* Multiple line
comment */
NETS (I)
NETS (II)
wire Y; // declaration
assign Y = A & B;

wand Y; // declaration
assign Y = A;
assign Y = B; // wired-AND: Y resolves to A & B

wor Y; // declaration
assign Y = A;
assign Y = B; // wired-OR: Y resolves to A | B

tri Y; // declaration
assign Y = (dr) ? A : 1'bz; // tri-state driver enabled by dr
REGISTERS
Variables that store values
Do not represent real hardware but ..
VECTORS
Represent buses
wire [3:0] busA;
reg [1:4] busB;
reg [1:0] busC;
Left number is MS bit
Slice management
busC = busA[2:1]; // same as: busC[1] = busA[2]; busC[0] = busA[1];
Vector assignment (by position!!)
busB = busA; // same as: busB[1] = busA[3]; busB[2] = busA[2];
             //          busB[3] = busA[1]; busB[4] = busA[0];
ARRAYS (I)
Syntax
integer count[1:5]; // 5 integers
reg var[-15:16]; // 32 1-bit regs
reg [7:0] mem[0:1023]; // 1024 8-bit regs
Accessing array elements
Entire element: mem[10] = 8'b10101010;
Element subfield (needs temp storage):
reg [7:0] temp;
..
temp = mem[10];
var[6] = temp[2];
ARRAYS (II)
Limitation: Cannot access array subfield or
entire array at once
var[2:9] = ???; // WRONG!!
var = ???; // WRONG!!
No multi-dimensional arrays (Verilog-1995; Verilog-2001 added them)
reg var[1:10] [1:100]; // WRONG!!
Arrays don't work for the Real data type (also relaxed in Verilog-2001)
real r[1:10]; // WRONG !!
STRINGS
Implemented with regs:
reg [8*13:1] string_val; // can hold up to 13 chars
..
string_val = "Hello Verilog";
string_val = "hello"; // MS Bytes are filled with 0
string_val = "I am overflowed"; // "I " is truncated
Escaped chars:
\n newline
\t tab
%% %
\\ \
\" "
LOGICAL OPERATORS
&& logical AND
|| logical OR
! logical NOT
Operands evaluated to ONE bit value: 0, 1 or x
Result is ONE bit value: 0, 1 or x
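A = 6; B = 0; C = x;
A && B → 1 && 0 → 0
A || !B → 1 || 1 → 1
C || B → x || 0 → x
but C && B = 0
Bitwise operators work bit by bit; the shorter operand is zero-extended:
a = 4'b1010;
b = 2'b11;
c = a ^ b; // b is extended to 4'b0011, so c = 4'b1001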
REDUCTION OPERATORS
& AND
| OR
^ XOR
~& NAND
~| NOR
~^ or ^~ XNOR
SHIFT OPERATORS
>> shift right
<< shift left
Vacated bit positions are zero filled
a = 4'b1010;
...
d = a >> 2; // d = 0010
c = a << 1; // c = 0100
CONCATENATION OPERATOR
{op1, op2, ..} concatenates op1, op2, .. to single
number
Operands must be sized !!
reg a;
reg [2:0] b, c;
..
a = 1'b1;
b = 3'b010;
c = 3'b101;
catx = {a, b, c}; // catx = 1_010_101
caty = {b, 2'b11, a}; // caty = 010_11_1
catz = {b, 1}; // WRONG !! (unsized constant)
Replication ..
catr = {{4{a}}, b, {2{c}}}; // catr = 1111_010_101101
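The concatenation and replication results can be checked in simulation with a few lines. This is a minimal sketch: the module name and the widths chosen for catx, caty and catr are assumptions sized to fit the results, and each replication is written with its own pair of braces as Verilog requires.

```verilog
// Concatenation and replication check (module name and widths assumed).
module concat_demo;
  reg a;
  reg [2:0]  b, c;
  reg [6:0]  catx;   // 1 + 3 + 3 bits
  reg [5:0]  caty;   // 3 + 2 + 1 bits
  reg [12:0] catr;   // 4 + 3 + 6 bits
  initial begin
    a = 1'b1; b = 3'b010; c = 3'b101;
    catx = {a, b, c};              // 1_010_101
    caty = {b, 2'b11, a};          // 010_11_1
    catr = {{4{a}}, b, {2{c}}};    // 1111_010_101101
    $display("%b %b %b", catx, caty, catr);
    // prints: 1010101 010111 1111010101101
  end
endmodule
```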
RELATIONAL OPERATORS
> greater than
< less than
>= greater than or equal to
<= less than or equal to
EQUALITY OPERATORS
== logical equality
!= logical inequality
Return 0, 1 or x
=== case equality
!== case inequality
Return 0 or 1
CONDITIONAL OPERATOR
cond_expr ? true_expr : false_expr
Y = (sel)? A : B;
[Figure: 2-to-1 multiplexer; sel = 1 selects A, sel = 0 selects B, output Y]
OPERATOR PRECEDENCE
Use parentheses to
enforce your
priority
HIERARCHICAL DESIGN
[Figure: a top-level module composed of Sub-Module 1 and Sub-Module 2, e.g. a full adder]
MODULE
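[Figure: block my_module implementing function f, with inputs in1, in2, .., inN and outputs out1, out2, .., outM]
module my_module(out1, .., inN);
output out1, .., outM;
input in1, .., inN;
.. // declarations
.. // description of f (maybe
.. // sequential)
endmodule

[Figure: half adder with inputs A, B and outputs S, C]
module half_adder(S, C, A, B); // module name and port order assumed
output S, C;
input A, B;
assign S = A ^ B;
assign C = A & B;
endmodule

module full_adder(sum, cout, in1, in2, cin);
output sum, cout;
input in1, in2, cin;
endmodule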
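One way to fill in the full adder hierarchically is with two half-adder instances and an OR gate. This is a sketch, not necessarily the slide's own body: the half-adder module name, the instance names ha1 and ha2, and the internal nets s1, c1, c2 are assumptions, and a small self-checking test module is included.

```verilog
// Half adder (module name and port order assumed).
module half_adder(S, C, A, B);
  output S, C;
  input  A, B;
  assign S = A ^ B;
  assign C = A & B;
endmodule

// Full adder built from two half adders (internal net names assumed).
module full_adder(sum, cout, in1, in2, cin);
  output sum, cout;
  input  in1, in2, cin;
  wire   s1, c1, c2;
  half_adder ha1(s1,  c1, in1, in2);
  half_adder ha2(sum, c2, s1,  cin);
  assign cout = c1 | c2;
endmodule

// Exhaustive check over all 8 input combinations.
module full_adder_tb;
  reg in1, in2, cin;
  wire sum, cout;
  integer i, errors;
  full_adder dut(sum, cout, in1, in2, cin);
  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {in1, in2, cin} = i;
      #1 if ({cout, sum} !== in1 + in2 + cin) errors = errors + 1;
    end
    $display("mismatches: %0d", errors);   // prints: mismatches: 0
  end
endmodule
```

Inside full_adder_tb, a signal within an instance can be reached by its hierarchical name, for example dut.ha2.A.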
HIERARCHICAL NAMES
ha2.A // signal A inside the instance named ha2
PORT ASSIGNMENTS
Inputs: reg or net outside the module, net inside
Outputs: reg or net inside the module, net outside
Inouts: net on both sides
CONTINUOUS ASSIGNMENTS
A CLOSER LOOK
Syntax:
assign #del <id> = <expr>;
Usage:
nand (out, in1, in2); // 2-input NAND without delay
and #2 (out, in1, in2, in3); // 3-input AND with 2 t.u. delay
not #1 N1(out, in); // NOT with 1 t.u. delay and instance name
xor X1(out, in1, in2); // 2-input XOR with instance name
Write them inside module, outside procedures
“INITIAL” BLOCKS
Start execution at sim time zero and finish when
their last statement executes
module nothing;
initial
$display("I'm first"); // displayed at sim time 0
initial begin
#50;
$display("Really?"); // displayed at sim time 50
end
endmodule
“ALWAYS” BLOCKS
Start execution at sim time zero and continue until
sim finishes
EXAMPLES
EVENTS (II)
wait (expr)
always begin
wait (ctrl)
#10 cnt = cnt + 1;
#10 cnt2 = cnt2 + 2;
end
Execution loops every time ctrl = 1 (level sensitive timing control)
EXAMPLE
TIMING (I)
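initial begin
#5 c = 1; // executes at t = 5
#5 b = 0; // executes at t = 10
#5 d = c; // executes at t = 15
end
Each assignment is blocked by its previous one
[Figure: waveforms of b, c and d against time, 0 to 15]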
TIMING (II)
initial begin
fork
#5 c = 1; // all three statements execute at t = 5
#5 b = 0;
#5 d = c;
join
end
Assignments are not blocked here
[Figure: waveforms of b, c and d against time, 0 to 15]
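The difference between the two timing styles can be seen side by side in one simulation. This is a minimal sketch with hypothetical names: the signals are duplicated (suffix 1 for begin-end, suffix 2 for fork-join) so the two blocks do not interfere.

```verilog
// begin-end (sequential delays) versus fork-join (parallel delays).
module timing_demo;
  reg c1, b1, d1;   // driven from the begin-end block
  reg c2, b2, d2;   // driven from the fork-join block

  initial begin
    #5 c1 = 1;      // t = 5
    #5 b1 = 0;      // t = 10
    #5 d1 = c1;     // t = 15
  end

  initial fork
    #5 c2 = 1;      // t = 5
    #5 b2 = 0;      // t = 5
    #5 d2 = c2;     // t = 5, races with the c2 assignment above
  join

  initial
    $monitor("t=%0t  seq: c=%b b=%b d=%b   par: c=%b b=%b d=%b",
             $time, c1, b1, d1, c2, b2, d2);
endmodule
```

In the fork-join block the value captured by d2 depends on simulator ordering, because c2 is written and read in the same time step; that kind of race is a reason to use fork-join with care.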
PROCEDURAL STATEMENTS: IF
if (expr1)
true_stmt1;
else if (expr2)
true_stmt2;
..
else
def_stmt;

E.g. 4-to-1 mux:
module mux4_1(out, in, sel);
output out;
input [3:0] in;
input [1:0] sel;
reg out;
wire [3:0] in;
wire [1:0] sel;
always @(in or sel)
if (sel == 0)
out = in[0];
else if (sel == 1)
out = in[1];
else if (sel == 2)
out = in[2];
else
out = in[3];
endmodule
reg [3:0] Y;
wire start;
integer i;
initial
Y = 0;
while (expr)
stmt;
E.g.
module count(Y, start);
output [3:0] Y;
input start;
reg [3:0] Y;
wire start;
integer i;
initial
Y = 0;
E.g.
module count(Y, start);
output [3:0] Y;
input start;
reg [3:0] Y; // Y is assigned procedurally, so it must be a reg
initial
Y = 0;
always @(posedge start)
repeat (4) #10 Y = Y + 1; // repeat count can be either an integer or a variable
endmodule
STRUCTURAL VS PROCEDURAL
Structural
◦ Textual description of circuit
◦ Order does not matter
◦ Starts with assign statements
wire c, d;
assign c = a & b;
assign d = c | b;
Procedural
◦ Think like C code
◦ Order of statements is important
◦ Starts with initial or always statement
reg c, d;
always @(a or b or c) begin
c = a & b;
d = c | b;
end
STRUCTURAL VS PROCEDURAL
Procedural
reg [3:0] Q;
wire [1:0] y;
always @(y)
begin
Q = 4'b0000;
case (y)
2'b00: Q[0] = 1;
2'b01: Q[1] = 1;
2'b10: Q[2] = 1;
2'b11: Q[3] = 1;
endcase
end
Structural
wire [3:0] Q;
wire [1:0] y;
assign
Q[0] = (~y[1]) & (~y[0]),
Q[1] = (~y[1]) & y[0],
Q[2] = y[1] & (~y[0]),
Q[3] = y[1] & y[0];
BLOCKING VS NON-BLOCKING
Blocking: <variable> = <statement>
Non-blocking: <variable> <= <statement>
BLOCKING VS NON-BLOCKING
initial
begin
#1 e=2;
#1 b=1;
#1 b<=0;
e<=b; // grabbed the old b
f=e; // used old e=2, did not wait e<=b
end
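The blocking versus non-blocking behavior of this sequence can be confirmed by wrapping it in a module and printing the final values. This is a minimal sketch: the module name and the 2-bit register widths are assumptions.

```verilog
// Blocking (=) versus non-blocking (<=) assignment demo.
module nb_demo;
  reg [1:0] b, e, f;
  initial begin
    #1 e = 2;
    #1 b = 1;
    #1 b <= 0;     // scheduled; b still reads as 1 in this time step
       e <= b;     // samples b = 1 now, updates e at the end of the time step
       f = e;      // e is still 2 here, so f = 2 immediately
    #1 $display("b=%0d e=%0d f=%0d", b, e, f);
    // prints: b=0 e=1 f=2
  end
endmodule
```

Non-blocking assignments evaluate their right-hand sides immediately but update their targets only after all active statements in the time step have run, which is why e ends up with the old value of b.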
Typical example:
clock generation in test modules
module test;
endmodule
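A clock generator of the kind mentioned for test modules can be sketched with one always block. The half-period of 10 time units and the 100-unit run length are hypothetical values chosen for illustration.

```verilog
// Free-running clock in a test module (half-period is an assumed value).
module test;
  reg clk;
  initial clk = 0;                // give clk a known starting value
  always #10 clk = ~clk;          // toggle every 10 t.u., period 20 t.u.
  initial begin
    $monitor("t=%0t clk=%b", $time, clk);
    #100 $finish;                 // stop, since the always block never ends
  end
endmodule
```

Without the initial assignment clk would start as x, and ~x is still x, so the clock would never toggle.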
MIXED MODEL
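[Figure: input c drives an inverter whose output n feeds a flip-flop with clock clk, reset res and output Y]
module mixed(Y, c, clk, res); // module name and port order assumed
output Y;
input c, clk, res;
reg Y;
wire c, clk, res;
wire n;
not(n, c); // gate-level
always @(posedge res or posedge clk) // behavioral
if (res)
Y = 0;
else
Y = n;
endmodule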
SYSTEM TASKS
Always written inside procedures
COMPILER DIRECTIVES
50ns
PARAMETERS
A. Implementation without parameters
[Figure: in[3:0] is registered to p_in[3:0]; signals wu and wd are derived from p_in and registered to out; both registers are clocked by clk]
(II)
A. Implementation without parameters (cont.)
module top(out, in, clk);
output [1:0] out;
input [3:0] in;
input clk;
wire [1:0] out;
wire [3:0] in;
wire clk;
endmodule
(III)
B. Implementation with parameters
module dff(Q, D, clk);
parameter WIDTH = 4;
output [WIDTH-1:0] Q;
input [WIDTH-1:0] D;
input clk;
reg [WIDTH-1:0] Q;
wire [WIDTH-1:0] D;
wire clk;
always @(posedge clk)
Q <= D;
endmodule

module top(out, in, clk);
output [1:0] out;
input [3:0] in;
input clk;
wire [1:0] out;
wire [3:0] in;
wire clk;
wire [3:0] p_in;
wire wu, wd;
assign wu = p_in[3] & p_in[2];
assign wd = p_in[1] & p_in[0];
dff instA(p_in, in, clk);
// WIDTH = 4, from declaration
dff instB(out, {wu, wd}, clk);
defparam instB.WIDTH = 2;
// We changed WIDTH for instB only
endmodule
LOGIC
Combinational logic function can be expressed as:
logic_outputs(t) = f(logic_inputs(t))
[Figure: combinational logic block mapping logic_inputs(t) to logic_outputs(t)]
Rules
Avoid technology dependent modeling; i.e. implement
functionality, not timing.
The combinational logic must not have feedback.
Specify the output of a combinational behavior for all
possible cases of its inputs.
Logic that is not combinational will be synthesized as
sequential.
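The rule about specifying the output for all input cases can be illustrated with two small modules. This is a sketch with hypothetical module and signal names; the exact hardware produced depends on the synthesis tool.

```verilog
// Incomplete: y is not assigned when enable = 0, so y must hold its old
// value and a synthesis tool would typically infer a latch.
module latch_inferred(y, enable, x);
  output y;
  input  enable, x;
  reg    y;
  always @(enable or x)
    if (enable) y = x;
endmodule

// Complete: every input combination assigns y, so this stays combinational.
module no_latch(y, enable, x);
  output y;
  input  enable, x;
  reg    y;
  always @(enable or x)
    if (enable) y = x;
    else        y = 1'b0;
endmodule
```

A common alternative is to assign a default value to y at the top of the always block, so that any branch left out later still produces a defined output.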
LOGIC
NETLIST
Synthesis tools further optimize a gate netlist specified
in terms of Verilog primitives
Example:
General Steps:
Logic gates are translated to Boolean equations.
The Boolean equations are optimized.
Optimized Boolean equations are covered by library
gates.
Complex behavior that is modeled by gates is not
mapped to complex library cells (e.g. adder,
multiplier)
The user interface allows gate-level models to be
preserved in synthesis.
CONTINUOUS ASSIGNMENTS
Example:
module or_nand_2 (enable, x1, x2, x3, x4, y);
input enable, x1, x2, x3, x4;
output y;
assign y = !(enable & (x1 | x2) & (x3 | x4));
endmodule
Example:
module or_nand_3 (enable, x1, x2, x3, x4, y);
input enable, x1, x2, x3, x4;
output y;
reg y;
always @ (enable or x1 or x2 or x3 or x4)
if (enable)
y = !((x1 | x2) & (x3 | x4));
else
y = 1; // operand is a constant.
endmodule
Example:
module or_nand_4 (enable, x1, x2, x3, x4, y);
input enable, x1, x2, x3, x4;
output y;
assign y = or_nand(enable, x1, x2, x3, x4);
function or_nand;
input enable, x1, x2, x3, x4;
begin
or_nand = ~(enable & (x1 | x2) & (x3 | x4));
end
endfunction
endmodule
task or_nand;
input enable, x1, x2, x3, x4;
output y;
begin
y = !(enable & (x1 | x2) & (x3 | x4));
end
endtask
endmodule
SYNTHESIS OF MULTIPLEXORS
Conditional Operator
Note: case and if/else statements are the preferred and recommended styles for inferring a MUX
UNWANTED LATCHES
PRIORITY LOGIC
Functional Specs.
Data-path Controller
etc.
All state machines have the general feedback structure consisting of:
Combinational logic implements the next state logic
• Next state (ns) of the machine is formed from the current
state (cs) and the current inputs
State register holds the value of current state
[Figure: Inputs and Current State feed the Next-State Logic, whose Next State output is stored in the State Memory]
[Figure (Moore machine): Inputs → Next-State Logic → ns → State Register → cs → Output Logic → Outputs]
Next state depends on the current state and the inputs but the output depends only on the present state
[Figure (Mealy machine): Inputs → Next-State Logic → ns → State Register → cs → Output Logic → Outputs, with Inputs also feeding the Output Logic]
Next state and the outputs depend on the current state and the inputs
//Output assignments
endmodule
FSM Flow-Chart
[Figure: flow chart with states reset_state (out_bit = 0), read_1_zero (out_bit = 0), read_1_one (out_bit = 0), read_2_zero (out_bit = 1) and read_2_one (out_bit = 1); transitions are labeled with the input value 0 or 1]
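A Moore-style coding of a five-state machine with these state names might look like the sketch below. The transitions are an assumption (the machine is taken to raise out_bit after two identical input bits in a row), since the arrows of the flow chart are not recoverable here; the module name, port names and state encoding are hypothetical as well.

```verilog
// Moore FSM sketch. State names follow the flow chart; the transitions,
// encoding and ports are ASSUMED for illustration.
module seq_detect(out_bit, in_bit, clk, reset);
  output out_bit;
  input  in_bit, clk, reset;
  reg    out_bit;
  reg [2:0] cs, ns;   // current state, next state
  parameter reset_state = 3'd0, read_1_zero = 3'd1, read_1_one = 3'd2,
            read_2_zero = 3'd3, read_2_one  = 3'd4;

  // State register
  always @(posedge clk)
    if (reset) cs <= reset_state;
    else       cs <= ns;

  // Next-state logic: depends on current state and input
  always @(cs or in_bit)
    case (cs)
      reset_state: ns = in_bit ? read_1_one : read_1_zero;
      read_1_zero: ns = in_bit ? read_1_one : read_2_zero;
      read_1_one:  ns = in_bit ? read_2_one : read_1_zero;
      read_2_zero: ns = in_bit ? read_1_one : read_2_zero;
      read_2_one:  ns = in_bit ? read_2_one : read_1_zero;
      default:     ns = reset_state;
    endcase

  // Output logic: depends on current state only (Moore)
  always @(cs)
    out_bit = (cs == read_2_zero) || (cs == read_2_one);
endmodule
```

Keeping the state register, next-state logic and output logic in three separate always blocks mirrors the block diagram of the machine, and the default branch keeps the next-state logic fully specified so no latch is inferred.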