
Vlsi


BiCMOS Technology

Combines bipolar and CMOS transistors in a single integrated circuit.
By retaining the benefits of both bipolar and CMOS, BiCMOS achieves VLSI circuits with speed-power-density performance previously unattainable with either technology individually.
BiCMOS technology enables high-performance integrated circuits (ICs) but increases process complexity.

Characteristics of CMOS Technology

Advantages of CMOS over bipolar
Lower static power dissipation
Higher noise margins
Higher packing density, hence lower manufacturing cost per device
High yield with large integrated complex functions

Other CMOS Advantages
High input impedance (low drive current)
Scalable threshold voltage
High delay sensitivity to load (fan-out limitations)
Low output drive current (an issue when driving large capacitive loads)
Low transconductance (gm ∝ Vin)
Bi-directional capability (drain and source are interchangeable)
A near-ideal switching device

Characteristics of Bipolar Technology

Advantages of Bipolar over CMOS
Higher switching speed
Higher current drive per unit area, higher gain
Generally better noise performance and better high-frequency characteristics
Better analogue capability
Improved I/O speed (particularly significant with the growing importance of package limitations in high-speed systems)

Other Bipolar Advantages
High power dissipation
Lower input impedance (high drive current)
Low voltage swing logic
Low packing density
Low delay sensitivity to load
High gm (gm ∝ e^Vin)
High unity-gain bandwidth (ft) at low currents
Essentially unidirectional

BJT & CMOS Advantages
BJT over CMOS
Switching speed
Current drive per unit area
Noise performance (no 1/f noise)
Analog capability
I/O speed
High transconductance
CMOS over BJT
Power dissipation
Noise margin
Packing density
Ability to integrate large and complex circuits and functions with high yield
Good switch

BiCMOS Circuit Advantages
Improved speed over CMOS
Lower power dissipation than BJT (simplifies packaging and board requirements)
Flexible I/O (ECL, CMOS, or TTL)
High performance analog
Latch-up immunity
High impedance input (FET)
High gain (BJT)
Low 1/f noise, >1 GHz toggle frequency
Zero offset analog switches
Gain-bandwidth product extended
Good voltage reference
Low input offset voltage for differential pair
The simplified BiCMOS Inverter
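Two bipolar transistors (T3 and T4), one nMOS and one pMOS transistor (both enhancement-type devices, OFF at Vgs = 0 V)
The MOS switches perform the logic function and the bipolar transistors drive the output loads

[Figure: simplified BiCMOS inverter. nMOS T1 and pMOS T2 share the input Vin; bipolar transistors T3 (pull-down) and T4 (pull-up) drive the output Vout and the load CL from the supply Vdd.]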


Vin = 0 :
T1 is off. Therefore T3 is non-conducting.
T2 ON - supplies current to base of T4.
T4 base voltage set to Vdd.
T4 conducts & acts as current source to charge load CL
towards Vdd.
Vout rises to Vdd - Vbe (of T4)
Note : Vbe (of T4) is base-emitter voltage of T4.
(pullup bipolar transistor turns off as the output
approaches 5V - Vbe (of T4))

Vin = Vdd :
T2 is off. Therefore T4 is non-conducting.
T1 is on and supplies current to the base of T3
T3 conducts & acts as a current sink to discharge load CL
towards 0V.
Vout falls to 0V+ VCEsat (of T3)
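
The two cases above fix the output levels of the simplified inverter. A minimal sketch, assuming Vbe = 0.7 V (the value quoted in these notes) and a hypothetical VCEsat of 0.2 V, which is only an illustrative figure:

```python
def bicmos_output_levels(vdd, vbe=0.7, vce_sat=0.2):
    """Return (V_OH, V_OL, swing) for the simplified BiCMOS inverter.

    vbe and vce_sat are illustrative defaults, not process data.
    """
    v_oh = vdd - vbe   # pull-up T4 stops conducting one Vbe below Vdd
    v_ol = vce_sat     # pull-down T3 saturates at VCEsat above 0 V
    return v_oh, v_ol, v_oh - v_ol


v_oh, v_ol, swing = bicmos_output_levels(5.0)
print(f"V_OH = {v_oh:.2f} V, V_OL = {v_ol:.2f} V, swing = {swing:.2f} V")
```

With a 5 V supply the output swing is therefore somewhat less than rail to rail.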



T3 & T4 present low impedances when turned on into
saturation & load CL will be charged or discharged rapidly.
Output logic levels will be good & will be close to rail
voltages since VCEsat is quite small & VBE ~ 0.7V.
Therefore, inverter has high noise margins
Inverter has high input impedance, i.e., MOS gate input
Inverter has low output impedance
Inverter has high drive capability but occupies a relatively
small area
However, this is not a good arrangement to implement
since no discharge path exists for current from the base of
either bipolar transistor when it is being turned off, i.e.,
when Vin=Vdd, T2 is off and no conducting path to the base of T4
exists
when Vin=0,T1 is off and no conducting path to the base of T3
exists
This will slow down the action of the circuit.


Further advantages of BiCMOS
Technology
High impedance CMOS transistors may be used for the
input circuitry while the remaining stages and output
drivers are realized using bipolar transistors.
BiCMOS devices offer many advantages where high load
current sinking and sourcing is required.
The high current gain of the NPN transistor greatly
improves the output drive capability of a conventional
CMOS device.
MOS speed depends on device parameters such as
saturation current and capacitance.
These in turn depend on oxide thickness, substrate
doping and channel length.
BiCMOS can take advantage of any advances in CMOS
and/or bipolar technology, greatly accelerating the
learning curve normally associated with new technologies.



Are there disadvantages with
BiCMOS technology ?
Main disadvantage: greater process complexity when compared to CMOS
Results in a 1.25 to 1.4 times increase in die costs over conventional CMOS.
Taking into account packaging costs, the total manufacturing cost of supplying a BiCMOS chip ranges from 1.1 to 1.3 times that of CMOS.
Since BiCMOS does not have to be scaled as
aggressively as CMOS, existing fabs can be utilized
resulting in lower capital costs.
Extra costs incurred in developing a BiCMOS technology are more than offset by the fact that the enhanced chip performance obtained extends the usefulness of manufacturing equipment and clean rooms by at least one technology generation.

Signal Strengths
Signals such as 1 and 0 have strengths, which measure the ability to sink or source current.
The VDD and GND rails are the strongest 1 and 0.
Under the switch abstraction, G has complete control
and S and D have no effect.
In reality, the gate can turn the switch on only if a
potential difference of at least Vt exists between the G
and S.
We will look at Vt in detail later on in the course.
Thus signal strengths are related to Vt and therefore p
and n transistors produce signals with different strengths
Tristate Inverter
C = 1, C' = 0 => inverter function, Z = A' (a prime denotes the complement)
C = 0, C' = 1 => high-impedance output, Z floating
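
The behaviour of the tristate inverter can be sketched as a small truth function. This is a logic-level illustration only; the function name and the use of the string "Z" for the floating state are choices made for this example.

```python
def tristate_inverter(a, c):
    """Return the inverted input when enabled (c == 1), else 'Z' (floating)."""
    if c == 1:
        return 1 - a   # enabled: behaves as an ordinary inverter
    return "Z"         # disabled: output is high impedance


for c in (1, 0):
    for a in (0, 1):
        print(f"C={c} A={a} -> Z={tristate_inverter(a, c)}")
```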
Building CMOS Gates (n-side)
Building CMOS Gates (p-side)
Building CMOS Gates (Final CMOS gate)
Pseudo-NMOS Logic
In CMOS, use a PMOS transistor that is always ON to make pseudo-NMOS gates.
The PMOS transistor is always on and thus fights the pull-down NMOS network.
The ratio of NMOS to PMOS strength is a very important design parameter.
Uses a p-type transistor as a resistive pullup and an n-type network for pulldowns.
Consumes static power.
Has a much smaller pullup network than a static gate.
Pulldown time is longer because the pullup is fighting.
For a logic 0 output, pullup and pulldown form a voltage divider.
Must choose n and p transistor sizes to create effective resistances of the required ratio.
Effective resistance of the pulldown network must be computed in the worst case; series n-types mean larger transistors.
Make the PMOS about 1/4 the effective strength of the pulldown network.
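
The voltage-divider view of the logic 0 output can be sketched numerically. This treats both networks as linear resistors, which is a simplification, and the resistance values are hypothetical; a PMOS of 1/4 the pulldown strength is modelled as 4 times the pulldown resistance.

```python
def pseudo_nmos_vol(vdd, r_pullup, r_pulldown):
    """Output low voltage of the pullup/pulldown resistive divider."""
    return vdd * r_pulldown / (r_pullup + r_pulldown)


def static_current(vdd, r_pullup, r_pulldown):
    """DC current drawn from Vdd while the output is held low."""
    return vdd / (r_pullup + r_pulldown)


R_PULLDOWN = 10e3            # hypothetical worst-case pulldown resistance (ohm)
R_PULLUP = 4 * R_PULLDOWN    # PMOS at 1/4 of the pulldown strength

vol = pseudo_nmos_vol(5.0, R_PULLUP, R_PULLDOWN)
idc = static_current(5.0, R_PULLUP, R_PULLDOWN)
print(f"V_OL = {vol:.2f} V, static current = {idc * 1e3:.2f} mA")
```

Under these assumptions the output low level sits at Vdd/5, and the gate draws DC current for as long as the pulldown conducts.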

Pseudo-NMOS Example
Pseudo-NMOS is a ratio circuit where dc current flows when the N pull-down
tree is conducting.
Must design the ratio of N devices W/L to P load device W/L so that
when the N pull down leg with max resistance is conducting, the output
is at a sufficiently low VOL.
Dynamic CMOS Logic
General Dynamic CMOS Logic:
IDD Path is turned off when clock-disabled and/or the
output is evaluated when clock enabled.
In general:
Basic Dynamic CMOS n-Logic.
Basic Dynamic CMOS p-Logic.
Clocked CMOS Logic.
Domino Logic
Basic Dynamic CMOS Logic with n-Logic
Example
Erroneous Evaluation in Cascaded Dynamic n-Logic Stages:
They are evaluated at the same clock edges.
Even slow evaluation may be impossible if the output has been discharged by pre-charged inputs.
Cascading Problem in Dynamic CMOS Logic
If several stages of the previous circuit are cascaded together using the same clock φ, a problem in evaluation involving a built-in race condition will exist
During pre-charge, both Vout1 and Vout2 are pre-charged to Vdd
When φ goes high to begin evaluation, all inputs at stage 1 require some finite time to resolve, but during this time charge may erroneously be discharged from Vout2
e.g. assume that eventually the 1st stage NMOS logic tree conducts and fully discharges Vout1, but since the inputs to the N-tree are not all immediately resolved, it takes some time for the N-tree to finally discharge Vout1 to GND.
If, during this time delay, the 2nd stage has the input condition shown with the bottom NMOS transistor gate at a logic 1, then Vout2 will start to fall and discharge its load capacitance until Vout1 finally evaluates and turns off the top series NMOS transistor in stage 2
The result is an error in the output of the 2nd stage, Vout2

Cascaded Dynamic CMOS Logic Gates:
Evaluate Problem
With simple cascading of dynamic CMOS logic
stages, a problem arises in the evaluate cycle:
The pre-charged high voltage on Node N2 in stage 2
may be inadvertently (partially) discharged by logic
inputs to stage 2 which have not yet reached final
correct (low) values from the stage 1 evaluation
operation.
Can not simply cascade dynamic CMOS logic gates
without preventing unwanted bleeding of charge from
pre-charged nodes
Possible Solutions:
Two phase clocks
Use of inverters to create Domino Logic
NP Domino Logic
Zipper/NORA logic

Single Phase Dynamic CMOS Gates
When the gates are precharged, N1 and N2 are charged
to Vdd.
During evaluate, n-logic-1 will conditionally discharge
with some delay due to t.
N1 must be fully evaluated before N2 is evaluated; otherwise the output N2 will be in error.
This cannot be assured with single-phase clocks.
These problems can be overcome by using multi-phase
clocks and a sample and hold circuit to isolate cascaded
stages and control evaluate timing.

Example:

During precharge, N1 is automatically high and S is automatically ON
Suppose that n-logic-1 is ON and n-logic-2 is ON
Immediately after precharge ends, N2 will tend to be pulled down until N1 is pulled down by n-logic-1, thereby turning S OFF
To make cascaded stages work (single-phase clock), S should be OFF until it is conditionally turned ON (by N1).
Can do this by inserting an inverter between N1 and S => domino logic.

CMOS Domino Logic
Problem with faulty discharge of precharged nodes in
CMOS dynamic logic circuits can be solved by placing
an inverter in series with the output of each gate:
All inputs to N logic blocks (which are derived from
inverted outputs of previous stages) therefore will be at
zero volts during precharge and will remain at zero until
the evaluation stage has logic inputs to discharge the
precharged node PZ.
However, all circuits only provide non-inverted outputs
In (b) a weak P device compensates for charge loss
due to charge sharing and leakage at low frequency
clock operation
In (c) the weak P device can be used to latch the
output high

NP Domino Logic (NORA Logic)
An elegant solution to the dynamic CMOS logic erroneous
evaluation problem is to use NP Domino Logic (also called
NORA logic) as shown below.
Alternate stages of N logic with stages of P logic:
N logic stages use the true clock, with normal precharge and evaluation phases and the N logic tree in the pull-down leg. P logic stages use a complement clock, with the P logic stage tied above the output node.
During precharge clk is low (-clk is high) and the P-logic output precharges to ground while N-logic outputs precharge to Vdd.
During evaluate clk is high (-clk is low) and both types of stage go through evaluation; the N-logic tree logically evaluates to ground while the P-logic tree logically evaluates to Vdd.
Inverter outputs can be used to feed other N-blocks from N-blocks, or to feed other P-blocks from P-blocks.

NORA CMOS Logic Circuit Example
An example of NP or NORA (No Race) logic is shown
below:
During φ low (complement clock high), each stage pre-charges
N logic stages pre-charge to Vdd; P logic stages pre-charge to GND
When φ goes high (complement clock low), each stage enters the evaluation phase
N logic evaluates to GND; P logic stages evaluate to Vdd
All NMOS and PMOS stages evaluate one after another in succession, as in Domino logic
Logic is:
Stage 1 is X = (A B)
Stage 2 is G = X + Y
Stage 3 is Z = (F G + H)

Zipper CMOS Dynamic Logic
Zipper CMOS logic is a scheme for reducing charge leakage and charge sharing problems
Pre-charge transistors receive a slightly modified clock where the clock pulse (during pre-charge off time) holds the pre-charge transistor at weak conduction in order to provide a trickle pre-charge current during the evaluation phase
PMOS pre-charge transistor gates are held at Vdd - |Vtp|
NMOS pre-charge transistor gates are held at Vtn above GND

Clocked CMOS Logic (C²MOS)
Clock Distribution
On a small chip, the clock distribution network is just a wire
And possibly an inverter for clkb
On practical chips, the RC delay of the wire resistance and gate load is very long
Variations in this delay cause the clock to get to different elements at different times
This is called clock skew
Most chips use repeaters to buffer the clock and equalize the delay
Reduces but doesn't eliminate skew
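
Skew as described above is simply the spread in clock arrival times. A minimal sketch, with hypothetical element names and arrival times chosen only for illustration:

```python
def clock_skew(arrival_times):
    """Return the difference between the latest and earliest clock arrival."""
    times = list(arrival_times.values())
    return max(times) - min(times)


# Hypothetical clock arrival times (ns) at three clocked elements.
arrivals_ns = {"ff_a": 1.20, "ff_b": 1.35, "ff_c": 1.28}
print(f"skew = {clock_skew(arrivals_ns):.2f} ns")
```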
Solutions
Reduce clock skew
Careful clock distribution network design
Plenty of metal wiring resources
Analyze clock skew
Only budget actual, not worst case skews
Local vs. global skew budgets
Tolerate clock skew
Choose circuit structures insensitive to skew
Clock Skew
Clock Skew Sources are:
systematic,
random,
drift,
jitter

Clock System Architecture
Synchronous chip interface with PLL
Phase-locked loop block diagram
Global Clock Distribution Networks
Grids
H-trees
Spines
Ad-hoc
Hybrid

Clock Grids
A clock grid is a mesh of horizontal and vertical wires
driven from the middle or edges.
Use grid on two or more levels to carry clock.
Make wires wide to reduce RC delay.
Ensures low skew between nearby points.
But possibly large skew across die.
Grids compensate for random skew.
Grids consume a large amount of metal resources and
hence a high switching capacitance and power
consumption.
Alpha Clock Grids
[Figure: gclk grid clock distribution, driven from the PLL, on the Alpha 21064, Alpha 21164 and Alpha 21264]
H-Trees
Fractal structure
Gets clock arbitrarily close to any point
Matched delay along all paths
Delay variations cause skew
Buffers are added to serve as repeaters
Nearby points A and B (marked in the figure) might see big skew
High random skew, drift and jitter between two nearby points
Ad-hoc
The clock is routed haphazardly with some
attempt to equalize wire lengths or add buffers to
equalize delay.
Have low systematic skews because the buffer
sizes can be adjusted until nominal delays are
nearly equal.
Subject to random skew.
Clock Spine
If loads are uniform, the spine avoids the systematic skew of the grid by matching the length of the clock wires.
Save power by not switching certain wires.
A system with many clocked elements may require a large number of serpentine routes, leading to high area and capacitance for the clock network.
Clock spines have large skews between nearby elements driven by different serpentines.
Hybrid Networks
Use H-tree to distribute clock to many points.
Tie these points together with a grid.
Hybrid combination of H-tree and grid offers
lower skew.
Hybrid approach has lower systematic skew,
less susceptible to skew from non-uniform load
distribution.
Hybrid approach is regular, making layout of
well-controlled transmission line structures
easier.

Clock Tree Distribution
To prevent clock skew problems on a chip, clock
distribution networks are designed very carefully
Example shown: linear (E-W) clock tree distribution
network
Clock is buffered several times before driving FO=3
Each FO3 buffer drives another high FO (FO=4 shown) buffer
Finally another single buffer is used for each linear clock line to
drive across chip or functional island on a chip
H tree distribution network often used on chips with area
pads (solder bumps)
Master clock is brought on board chip near central part of chip
and driven outward with large H interconnection arrangement
Clock Generation
[Figure: clock generation circuit with inputs clk and en, producing clock phases φ1, φ2, φ3 and φ4]
Scaling of MOS Circuits
VLSI technology is constantly evolving towards
smaller line widths.
Reduced feature size generally leads to:
Better / faster performance
More gates per chip
A more accurate description of modern technology is ULSI (ultra large scale integration).

Scaling Factors
In our discussions we will consider two scaling factors, α and β
1/β is the scaling factor for VDD and oxide thickness D
1/α is the scaling factor for all other linear dimensions
Scaling models are:
Constant electric field
Constant voltage
Combined E & V
Scaling Factors for Device Parameters
Simple derivations showing the effects of scaling are given in Pucknell and Eshraghian, pages 125-129.
It is important that you understand how the following parameters are affected by scaling:
Gate Area
Gate Capacitance per unit area
Gate Capacitance
Charge in Channel
Channel Resistance
Transistor Delay
Maximum Operating Frequency
Transistor Current
Switching Energy
Power Dissipation Per Gate (Static and Dynamic)
Power Dissipation Per Unit Area
Power - Speed Product
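
The first few entries of this list follow directly from the definitions of the scaling factors. The sketch below writes the factors as alpha (all linear dimensions scale by 1/alpha) and beta (VDD and oxide thickness D scale by 1/beta), following Pucknell and Eshraghian's treatment; it assumes the constant-field model corresponds to beta = alpha and the constant-voltage model to beta = 1. Function and key names are illustrative.

```python
def scaled_parameters(alpha, beta):
    """Scaling multipliers for a few device parameters.

    Linear dimensions scale by 1/alpha; VDD and oxide thickness by 1/beta.
    """
    gate_area = 1 / alpha**2               # L and W each shrink by 1/alpha
    cap_per_area = beta                    # C0 = eps/D, and D shrinks by 1/beta
    gate_cap = cap_per_area * gate_area    # Cg = C0 * L * W
    channel_charge = gate_cap * (1 / beta) # Q = Cg * Vgs, with Vgs scaled by 1/beta
    return {
        "gate_area": gate_area,
        "cap_per_area": cap_per_area,
        "gate_cap": gate_cap,
        "channel_charge": channel_charge,
    }


for name, (alpha, beta) in {
    "constant field (beta = alpha)": (2.0, 2.0),
    "constant voltage (beta = 1)": (2.0, 1.0),
}.items():
    print(name, scaled_parameters(alpha, beta))
```

The remaining parameters in the list can be added in the same way once their derivations have been worked through.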

Introduction
Wiring-Up of chip devices takes place through
various conductors produced during processing
Today, interconnects constitute the main source
of delay in MOS circuits
We will examine:
Sheet Resistance (resistance per square)
Area Capacitance
Delay Units
CMOS Inverter Delay
Rise and Fall Time Estimation
Sheet Resistance
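Resistance of a square slab of material

[Figure: slab of length L, width W and thickness t, with current flowing between faces A and B]

$R_{AB} = \rho L / A$
=> $R_{AB} = \rho L / (t W)$
Let L = W (square slab)
=> $R_{AB} = \rho / t = R_s$ ohm/square
In general, $R_{AB} = Z R_s$, where $Z = L / W$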
Typical sheet resistance values for materials are very well characterized

Typical Sheet Resistances for 5 µm Technology
Layer                   Rs (ohm/square)
Aluminium               0.03
N diffusion             10 - 50
Silicide                2 - 4
Polysilicon             15 - 100
N-transistor channel    10^4
P-transistor channel    2.5 x 10^4
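[Figure: n-type minimum feature device, polysilicon gate crossing n-diffusion, L = 2λ, W = 2λ]
N-type Minimum Feature Device
R = 1 square x Rs = Rs = 10^4 ohm

[Figure: polysilicon gate crossing n-diffusion, L = 8λ, W = 2λ]
R = Z Rs
R = (L/W) x Rs
R = 4 x 10^4 ohm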
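
The two channel resistances above follow from R = (L/W) x Rs. A minimal sketch using the n-channel sheet resistance of 10^4 ohm/square from the table; the dimensions are in units of the feature size, and the second device is taken to be four squares long (L/W = 4):

```python
def layer_resistance(r_sheet, length, width):
    """Resistance of a rectangular layer: number of squares times Rs."""
    return r_sheet * (length / width)


RS_N_CHANNEL = 1e4  # ohm/square, n-transistor channel

print(layer_resistance(RS_N_CHANNEL, 2, 2))  # minimum feature device, 1 square
print(layer_resistance(RS_N_CHANNEL, 8, 2))  # channel 4 squares long
```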

Area Capacitance of Layers
Conducting layers are separated from each other by insulators (typically SiO2)
This may constitute a parallel plate capacitor, $C = \varepsilon_0 \varepsilon_{ox} A / D$ (farads)
D = thickness of oxide, A = area, $\varepsilon_{ox} \approx 4$ (relative permittivity of SiO2)
Area capacitance is given in pF/µm²
The standard unit for a technology node is the gate-to-channel capacitance of the minimum-sized transistor (2λ x 2λ), given as Cg
This is a technology-specific value



Delay Unit
For a feature-size square gate, τ = Rs x Cg
i.e. for 5 µm technology, τ = 10^4 ohm/sq x 0.01 pF = 0.1 ns
Because of the effects of parasitics, which we have not considered in our model, delay is typically of the order of 0.2 - 0.3 ns
Note that τ is very similar to the channel transit time τsd

CMOS Inverter Delay
Pull-down delay = Rpd x 2Cg
Pull-up delay = Rpu x 2Cg
Asymmetry in rise and fall is due to the resistance difference between pull-up and pull-down (a factor of 2.5, due to the mobilities of the carriers)
Delay through a pair of inverters is 2τ (fall time) + 5τ (rise time)
Delay through a pair of CMOS inverters is therefore 7τ
Asymmetry can be improved by reducing the resistance of the pull-up
Reduce the resistance of the pull-up by increasing channel width (typically by a factor of 2.5)
Note that increasing channel width also increases the capacitance
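
The delay figures above can be checked with a few lines of arithmetic. This sketch uses the 5 µm values quoted in these notes (Rs = 10^4 ohm/square, Cg = 0.01 pF) and assumes minimum-size devices, so that the pull-up resistance is 2.5 times the pull-down resistance:

```python
RS_N = 1e4       # ohm/square, n-channel sheet resistance
CG = 0.01e-12    # farads, gate capacitance of a minimum-size device

tau = RS_N * CG                    # basic delay unit
fall = RS_N * 2 * CG               # pull-down driving a load of 2 Cg
rise = 2.5 * RS_N * 2 * CG         # pull-up is 2.5 times more resistive
pair = fall + rise                 # delay through a pair of inverters

print(f"tau = {tau * 1e9:.2f} ns")
print(f"fall = {fall / tau:.1f} tau, rise = {rise / tau:.1f} tau, pair = {pair / tau:.1f} tau")
```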

CMOS Inverter Rise and Fall Time Estimation
$\tau_f \approx 3 C_L / (\beta_n V_{DD})$
$\tau_r \approx 3 C_L / (\beta_p V_{DD})$
(Derivations for the above are in Pucknell and Eshraghian, pages 105-107)
So, $\tau_r / \tau_f = \beta_n / \beta_p$
Given that (due to mobilities) $\beta_n = 2.5\,\beta_p$, rise time is slower by a factor of 2.5 when using minimum dimensions of n and p transistors

Large Capacitive Loads
When trying to drive off-chip loads, large capacitances
are often encountered
Off-chip capacitances can be of the order of 10^4 Cg
Inverters intended to drive large capacitive loads must therefore present low pull-up and pull-down resistances
The technique employed is to cascade a chain of inverters, each one of which is larger than the preceding one by a width factor f

Cascaded Inverters
If N is the number of stages required to drive the load, f is the width scaling factor, $C_L$ is the load capacitance, and $C_{min}$ is the minimum capacitance,
$y = C_L / C_{min} = f^N$
The total delay is minimized if f = e (the base of natural logarithms)
Details of this derivation are given in Pucknell and Eshraghian, pages 107-109
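
The trade-off between the number of stages and the width factor can be explored numerically. This sketch assumes each stage contributes a delay proportional to f (in units of the minimum-inverter delay), so the total is N x f with N = ln(y) / ln(f); N is left as a real number here, whereas a real design rounds it to an integer.

```python
import math


def stages_needed(y, f):
    """Number of stages N such that f**N equals y = C_L / C_min."""
    return math.log(y) / math.log(f)


def total_delay(y, f):
    """Total delay in units of the minimum-inverter delay (N stages, f each)."""
    return stages_needed(y, f) * f


y = 1e4  # off-chip load of the order of 10^4 Cg
for f in (2.0, math.e, 4.0, 10.0):
    print(f"f = {f:5.2f}: N = {stages_needed(y, f):5.2f}, delay = {total_delay(y, f):6.2f}")
```

Under this model the delay is smallest at f = e, while larger values of f need fewer stages at the cost of a longer total delay.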

Power and Energy
Power is drawn from a voltage source attached to the VDD pin(s) of a chip.

Instantaneous Power: $P(t) = i_{DD}(t)\, V_{DD}$

Energy: $E = \int_0^T P(t)\, dt = \int_0^T i_{DD}(t)\, V_{DD}\, dt$

Average Power: $P_{avg} = \frac{E}{T} = \frac{1}{T} \int_0^T i_{DD}(t)\, V_{DD}\, dt$
CMOS Power Dissipation
Power dissipation in CMOS circuits comes from
two components i.e. Ptotal = Pstatic + Pdynamic
Static dissipation due to:
subthreshold conduction through OFF transistors
tunneling current through gate oxide
leakage through reverse-biased diodes
contention current in ratioed circuits
Dynamic dissipation due to:
charging and discharging of load capacitances
short circuit current while both PMOS and NMOS
networks are partially ON
Ratioed circuits (e.g. pseudo NMOS) have more
static dissipation.
Dynamic Power Dissipation
Dynamic power is required to charge and discharge load
capacitances when transistors switch.
One cycle involves a rising and falling output.
On rising output, charge $Q = C V_{DD}$ is required
On falling output, charge is dumped to GND
This repeats $T f_{sw}$ times over an interval of T
[Figure: supply VDD delivering current $i_{DD}(t)$ to a gate driving load C, switching at frequency $f_{sw}$]
Dynamic Power Cont.
$P_{dynamic} = \frac{1}{T} \int_0^T i_{DD}(t)\, V_{DD}\, dt$
$= \frac{V_{DD}}{T} \int_0^T i_{DD}(t)\, dt$
$= \frac{V_{DD}}{T} \left[ T f_{sw} C V_{DD} \right]$
$= C V_{DD}^2 f_{sw}$
Activity Factor
Suppose the system clock frequency = f
Let $f_{sw} = \alpha f$, where α = activity factor
If the signal is a clock, α = 1
If the signal switches once per cycle, α = 1/2
Dynamic gates:
Switch either 0 or 2 times per cycle, α = 1/2
Static gates:
Depends on design, but typically α = 0.1

Dynamic power: $P_{dynamic} = \alpha C V_{DD}^2 f$
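
The dynamic power expression (activity factor times C times VDD squared times f) is easy to evaluate. A minimal sketch; the capacitance, supply voltage and frequency below are hypothetical numbers chosen only to illustrate the formula:

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Dynamic power in watts: activity * C * VDD^2 * f."""
    return activity * capacitance * vdd**2 * frequency


# Hypothetical chip: 1 nF switched capacitance, 1.2 V supply, 1 GHz clock,
# static gates with a typical activity factor of 0.1.
p = dynamic_power(0.1, 1e-9, 1.2, 1e9)
print(f"P_dynamic = {p:.3f} W")

# Halving VDD cuts dynamic power to a quarter.
print(dynamic_power(0.1, 1e-9, 0.6, 1e9) / p)
```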
Short Circuit Current
When transistors switch, both nMOS and pMOS
networks may be momentarily ON at once
Leads to a blip of short circuit current.
< 10% of dynamic power if rise/fall times are comparable
for input and output
Static Power
Static power is consumed even when chip is
quiescent:
Ratioed circuits burn power in fight between ON
transistors
Leakage draws power from nominally OFF devices
$I_{ds} = I_{ds0}\, e^{\frac{V_{gs} - V_t}{n v_T}} \left[ 1 - e^{\frac{-V_{ds}}{v_T}} \right]$
$V_t = V_{t0} - \eta V_{ds} + \gamma \left( \sqrt{\phi_s + V_{sb}} - \sqrt{\phi_s} \right)$
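
Leakage through a nominally OFF device can be sketched with the usual subthreshold model: the current is I_ds0 x exp((V_gs - V_t) / (n v_T)) x (1 - exp(-V_ds / v_T)), and the threshold is V_t0 - eta V_ds + gamma (sqrt(phi_s + V_sb) - sqrt(phi_s)). All device parameters below are hypothetical and serve only to show the trends; v_T is taken as about 26 mV at room temperature.

```python
import math


def threshold_voltage(vt0, vds, vsb, eta, gamma, phi_s):
    """Threshold with drain-induced barrier lowering (eta) and body effect (gamma)."""
    return vt0 - eta * vds + gamma * (math.sqrt(phi_s + vsb) - math.sqrt(phi_s))


def subthreshold_current(ids0, vgs, vt, vds, n, v_thermal=0.026):
    """Subthreshold drain current for the given bias point."""
    return ids0 * math.exp((vgs - vt) / (n * v_thermal)) * (1 - math.exp(-vds / v_thermal))


# Hypothetical parameters for an OFF transistor (Vgs = 0) with Vds = 1.2 V.
vt_no_bias = threshold_voltage(0.4, 1.2, 0.0, eta=0.1, gamma=0.4, phi_s=0.8)
vt_body_bias = threshold_voltage(0.4, 1.2, 0.5, eta=0.1, gamma=0.4, phi_s=0.8)

for label, vt in (("no body bias", vt_no_bias), ("Vsb = 0.5 V", vt_body_bias)):
    i_leak = subthreshold_current(1e-6, 0.0, vt, 1.2, n=1.5)
    print(f"{label}: Vt = {vt:.3f} V, leakage = {i_leak:.3e} A")
```

With these numbers a reverse body bias raises the threshold and reduces the leakage, which is the mechanism behind body-bias leakage reduction.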
Low Power Design
Reduce dynamic power
α: clock gating, sleep mode
C: small transistors (esp. on clock), short wires
VDD: lowest suitable voltage
f: lowest suitable frequency
Reduce static power
Selectively use ratioed circuits
Selectively use low Vt devices
Leakage reduction:
stacked devices, body bias, low temperature
THANK YOU
