EE-382M VLSI-II: A Brief Summary of Trends, Device Limitations, Scaling, and Device Performance in CMOS Technologies

EE-382M VLSI-II
A brief summary of trends, device limitations, scaling, and device performance in CMOS technologies
Gian Gerosa, Intel
Fall 2008
The University of Texas at Austin
EE382M VLSI-II Class Notes
Transistor count on a chip doubles every ~2 years
[Figure: transistors per chip vs. year; labeled points: P4 in 90 nm, Core Duo in 65 nm, Core 2 Duo in 65 nm, Core 2 Duo in 45 nm]

Die Size Growth in Desktop/Mobile Processors
[Figure: die size per side (mm, log scale, 1 to 100) vs. year (1970 to 2010); labeled points: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, PPro, P4, P4 in 90 nm, Core Duo in 65 nm, Core 2 Duo in 65 nm, Core 2 Duo in 45 nm]
~7% growth per year, ~2X growth in 10 years
Die size used to grow ~14% every two years
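The growth figures above are compound rates, and a short Python sketch can check that they agree with each other. The function name `compound_growth` is an illustrative choice, not something from the notes.

```python
def compound_growth(rate_per_year: float, years: float) -> float:
    """Return the total growth factor after compounding an annual rate."""
    return (1.0 + rate_per_year) ** years


# ~7% per year compounded over 10 years and over 2 years.
ten_year = compound_growth(0.07, 10)
two_year = compound_growth(0.07, 2)

print(f"10-year growth factor: {ten_year:.2f}x")
print(f"2-year growth: {100 * (two_year - 1):.1f}%")
```

Under this reading, 7% per year compounds to just under 2X per decade and roughly 14% every two years, consistent with the slide.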
Logic Transistor Density
[Figure: logic transistor density for the 90 nm, 65 nm, and 45 nm generations; P4 data points labeled 250K, 450K, and 1000]
Shrinks & compactions meet density goals
New u-architectures drop density
Courtesy: Shekhar Borkar, Intel
Each processor has several revisions
[Figure: lead designs and proliferations over time for the 80386, 80486, Pentium, Pentium II/III, and Pentium 4 across process generations P646 (1.5), P648 (1.0), P650 (0.8), P852 (0.6), P854 (0.35), P856 (0.25), P858 (0.18), P860 (0.13)]
Frequency used to double every ~2 years
POWER WALL at ~100 W stopped this.
[Figure: frequency (MHz, log scale, 1 to 10000) and power (W, 0 to 120) vs. process generation (1.0 um, 0.8 um, 0.6 um, 0.35 um, 0.25 um, 0.18 um, 0.13 um); labeled points: Klamath, Deschutes, Katmai, CuMine, Willamette, Northwood, Banias]
Power Dissipation of Compactions
[Figure: power (Watts, 0 to 70) vs. process generation (1.5u, 1.0u, 0.7u, 0.5u, 0.35u, 0.25u, 0.18u, 130n, 90n) for the 386, 486, Pentium, P2&3, and P4]
Lead processor power increases
Compactions provide higher performance at lower power
Power Dissipation of Lead uP
[Figure: power (Watts, log scale, 0.1 to 100) vs. year (1971 onward) for lead microprocessors: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, PPro, P4, P4 (90n), Core-duo (65n), Core2-duo (65n), Core2-duo (45n); the power wall is marked]
Power increases exponentially
Source: Mark Bohr, Intel Corporation

MEROM Core 2 Duo in 65 nm
~180 mm^2
~450 million transistors
PENRYN Core 2 Duo in 45 nm
~105 mm^2
~510 million transistors
SILVERTHORNE (ATOM Processor) in 45 nm
~25 mm^2
~47 million transistors
Deep Sub-micron CMOS Device Cross Section
[Figure: NFET in a P-well and PFET in an N-well on P-epi, separated by shallow trench isolation; labels: CoSi2, Si3N4, halo implant, S/D extension, N+ and P+ regions]
Deep Sub-Micron Transistors
Characteristics in the linear, saturation, and sub-threshold regions
Leakage
Parasitic Elements
Performance / Leakage tradeoff
[Figure: transistor cross section, 130nm generation, 70 nm gate; labels: Si3N4, CoSi2]
Courtesy: Mark Bohr, Intel
Strained silicon increases electron/hole mobility.

High-K, Metal Gate 45 nm CMOS (Intel)
K. Mistry, et al., A 45nm Logic Technology with High-k+ Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-free Packaging, Tech. Digest IEDM, Dec 2007.
SOURCES of TRANSISTOR LEAKAGE
[Figure: three transistor diagrams (gate, source, drain) showing subthreshold leakage (Ioff), junction leakage (Ijctn), and gate leakage (Igate)]
Ilkg indicator = 0.5(ON state) + 0.5(OFF state)
               = 0.5(Igate(ON)) + 0.5(Ioff + Ijctn + Igate(OFF))
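As a small illustration, the leakage indicator can be written as a Python function. This is only a sketch of the formula as stated. The function name and the example currents are assumptions, except the 60 nA/um Ioff, which matches the 90 nm NFET value listed later in these notes.

```python
def leakage_indicator(igate_on: float, ioff: float,
                      ijctn: float, igate_off: float) -> float:
    """Average leakage, assuming the device spends half its time ON and half OFF.

    ON state:  only gate leakage flows.
    OFF state: subthreshold, junction, and gate leakage all flow.
    All currents must share one unit (for example nA/um).
    """
    on_state = igate_on
    off_state = ioff + ijctn + igate_off
    return 0.5 * on_state + 0.5 * off_state


# Hypothetical values in nA/um (only ioff=60 comes from the 90 nm table).
ilkg = leakage_indicator(igate_on=10.0, ioff=60.0, ijctn=5.0, igate_off=2.0)
print(f"Ilkg indicator = {ilkg} nA/um")
```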
Transistor Leakage Components
[Figure: transistor leakage components, including subthreshold leakage at Vgs = 0]
Source: Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001
Performance vs. Leakage Tradeoff
65 nm Transistors
Ioff vs. Idsat
P. Bai, et al., A 65nm Logic Technology Featuring 35nm Gate Length, Enhanced Channel Strain, 8 Cu Interconnect Layers, Low-k ILD and 0.57 um^2 SRAM Cell, IEDM 2004.
Deep Submicron Device PARASITICS

Source-Drain Resistance
Gate Resistance
Parasitic Capacitances
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Constant Field Scaling
Source: Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001
Some numbers: Constant Electric Field Scaling
Width $W = 0.7$, length $L = 0.7$, $t_{ox} = 0.7$
Lateral and vertical dimensions reduce 30%
Area cap: $C_a \sim \dfrac{\epsilon W L}{t_{ox}} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$
Fringing cap: $C_f \sim W = 0.7$
Total cap: $C = 0.7$
Capacitance (area and fringing) reduces 30%
Die area $= X \times Y = 0.7 \times 0.7 = 0.7^2 \approx 0.5$
Die area reduces 50%

Constant Electric Field Scaling (cont'd)
$\dfrac{\text{Cap}}{\text{Transistor}} = \dfrac{0.7}{1} = 0.7$
Capacitance per transistor reduces 30%
$\dfrac{\text{Cap}}{\text{Area}} = \dfrac{0.7}{0.7 \times 0.7} = \dfrac{1}{0.7} \approx 1.43$
Capacitance per unit area increases 43%
$V_{dd} = 0.7$, $V_t = 0.7$, $I \sim \dfrac{W}{t_{ox}} (V_{dd} - V_t) = \dfrac{0.7 \times 0.7}{0.7} = 0.7$ (velocity-saturated device)
$T = \dfrac{C\,V_{dd}}{I} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$, Power $= C V^2 f = \dfrac{0.7 \times 0.7^2}{0.7} = 0.7^2 \approx 0.5$
Delay reduces 30%, power reduces 50%
Fmax scales by 1/0.7
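The constant-field numbers can be reproduced with a few lines of Python. This is a minimal sketch that normalizes every pre-shrink quantity to 1.0 and assumes the velocity-saturated current relation I ~ (W/tox)(Vdd - Vt) used in these notes. The function and key names are illustrative.

```python
def constant_field_scaling(s: float = 0.7) -> dict:
    """Normalized post-shrink values (pre-shrink = 1.0) when all
    dimensions and voltages are multiplied by the same factor s."""
    width = length = tox = s
    vdd = s
    overdrive = s                        # (Vdd - Vt): both voltages scale by s

    cap_total = width * length / tox     # area cap ~ (e*W*L)/tox; fringing ~ W also gives s
    die_area = s * s                     # X * Y
    cap_per_area = cap_total / die_area
    current = (width / tox) * overdrive  # velocity-saturated device
    delay = cap_total * vdd / current    # T = C*Vdd/I
    fmax = 1.0 / delay
    power = cap_total * vdd ** 2 * fmax  # P = C*V^2*f

    return {
        "capacitance": cap_total,
        "die_area": die_area,
        "cap_per_area": cap_per_area,
        "current": current,
        "delay": delay,
        "fmax": fmax,
        "power": power,
    }


for name, value in constant_field_scaling(0.7).items():
    print(f"{name:13s} {value:5.2f}  ({100 * (value - 1):+.0f}%)")
```

Passing a different `s` (for example 0.72 for a 90 nm to 65 nm shrink) gives the corresponding per-generation factors under the same assumptions.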
What About Constant Voltage Scaling?
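A comparison
Constant voltage scaling:
$C = 0.7$, $V = 1$, $E = \dfrac{V}{t_{ox}} = \dfrac{1}{0.7}$
$I \sim \dfrac{W}{t_{ox}} (V - V_t) = \dfrac{0.7}{0.7} \times 1 = 1$
$D = \dfrac{CV}{I} = \dfrac{0.7 \times 1}{1} = 0.7$
Power $= C V^2 F = \dfrac{0.7 \times 1^2}{0.7} = 1$
Power density $= 1 / 0.7^2 \approx 2$
Constant electric field scaling:
$C = 0.7$, $V = 0.7$, $E = \dfrac{V}{L} = \dfrac{0.7}{0.7} = 1$
$I \sim \dfrac{W}{t_{ox}} (V - V_t) = \dfrac{0.7}{0.7} \times 0.7 = 0.7$
$D = \dfrac{CV}{I} = \dfrac{0.7 \times 0.7}{0.7} = 0.7$
Power $= C V^2 F = \dfrac{0.7 \times 0.7^2}{0.7} \approx 0.5$
Power density $= 0.5 / 0.7^2 \approx 1$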
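The two columns of this comparison differ only in how the voltage scales, so they can be generated by one function with separate dimension and voltage factors. The sketch below is an illustration under the same normalized assumptions as the slide. The names `ScalingResult` and `scale` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingResult:
    capacitance: float
    field: float
    current: float
    delay: float
    power: float
    power_density: float


def scale(s: float, v: float) -> ScalingResult:
    """Scale all dimensions by s and all voltages by v (pre-shrink = 1.0)."""
    capacitance = s                      # C ~ (e*W*L)/tox
    field = v / s                        # E ~ V/tox (or V/L)
    current = (s / s) * v                # I ~ (W/tox)(V - Vt)
    delay = capacitance * v / current    # D = CV/I
    power = capacitance * v ** 2 / delay # P = C*V^2*F with F = 1/D
    power_density = power / (s * s)      # power per unit die area
    return ScalingResult(capacitance, field, current, delay, power, power_density)


constant_voltage = scale(s=0.7, v=1.0)
constant_field = scale(s=0.7, v=0.7)
print("constant voltage:", constant_voltage)
print("constant field:  ", constant_field)
```

An intermediate voltage factor, such as `scale(0.7, 0.85)`, gives a rough picture of a mixed scaling strategy under these assumptions.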
Page # 30/44
Issues with Constant Voltage Scaling

Practical (from the systems integration point of view), since
the power supply and signal voltages are unchanged but,
Electric field increases by factor k (1/S). Can cause transistor
failures such as oxide breakdown, punch-through, and hot
electron charging of the oxide.
Current density will also increase in transistors as well as
metals causing self-heating and metal migration in
interconnects.
Power density (P/area) is increasing causing localized
heating and heat dissipation problems.
In reality, CMOS technology evolution has followed a mixture
of both constant field and constant voltage scaling.
Page # 31/44
Non-Scaling Effects
Subthreshold current: kT/q and Eg do not scale, so the subthreshold swing does not scale either.
Polysilicon gate depletion contributes a capacitance in series with the oxide capacitance Cox. Thus the total gate capacitance does not scale exactly by 1/K unless metal gates are used.
The full benefit of scaling cannot be realized unless process tolerances (Leff, Tox, Vt, etc.) scale along with 1/K.
Transistor Performance Trends
$Q = CV$
$dQ/dt = C\,dV/dt$
$i = C\,dV/dt$
$dt = C\,dV/i$
Delay $= CV/I_{dsat}$
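To get a feel for the CV/Idsat metric, the sketch below evaluates it per micron of gate width, that is, for a transistor driving a gate load of its own width. The inputs Cgate ~1.4 fF/um and Idsat ~1000 uA/um are the 90 nm NFET values listed in the technology overview of these notes, and Vdd = 1.0 V is the supply named in the homework. Treating them together this way is an assumption made for illustration.

```python
def cv_over_i_delay_ps(cgate_ff_per_um: float, vdd_volts: float,
                       idsat_ua_per_um: float) -> float:
    """Delay = C*V/Idsat per micron of width, returned in picoseconds.

    fF * V / uA = 1e-15 / 1e-6 s = 1e-9 s, so multiply by 1e3 to get ps.
    """
    return cgate_ff_per_um * vdd_volts / idsat_ua_per_um * 1e3


delay_90nm = cv_over_i_delay_ps(cgate_ff_per_um=1.4, vdd_volts=1.0,
                                idsat_ua_per_um=1000.0)
print(f"90 nm NFET CV/I ~ {delay_90nm:.2f} ps")
```

This is an intrinsic figure of merit only. A real gate delay also includes junction, wire, and fan-out loading.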
CMOS Inverter Delay
$i = dQ/dt$
$i = d(CV)/dt$
$i = C\,dV/dt$
Inverter Delay (cont'd)
Wn = NFET width, Idsatn in mA/um
Wp = PFET width, Idsatp in mA/um
0.13 micron Cross Section (Copper)
[Figure: Cu interconnect stack, 130nm generation, from substrate through POLY, LI, and M1 to M6 in SiO2]
Courtesy: IBM
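Interconnect
[Figure: interconnect cross section showing wire length, width, and thickness, metal-to-metal space, and inter-dielectric thickness]
Assuming K = 1/0.7 ~ 1.43:
RC delay: $\left[ (1/0.7)^2 \times R_w \times 0.7 \right] \times \left[ C_w \times 0.7 \right] = R_w C_w$
Current density: $\dfrac{I}{W_w T_w} \rightarrow \dfrac{0.7\,I}{(0.7\,W_w)(0.7\,T_w)} = \dfrac{I}{0.7\,W_w T_w}$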
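The interconnect scaling result can be checked numerically. The sketch below assumes the wire resistance follows R ~ length / (width × thickness), that capacitance per unit length stays constant so total capacitance tracks length, and that current scales with the shrink factor. The function name is illustrative.

```python
def scaled_interconnect(s: float = 0.7) -> dict:
    """Normalized wire values after scaling length, width, and thickness by s."""
    resistance = s / (s * s)        # R ~ L/(W*T): (1/s)^2 from the cross section, s from length
    capacitance = s                 # constant C per unit length, shorter wire
    rc_delay = resistance * capacitance
    current_density = s / (s * s)   # current scales by s, cross section by s^2
    return {
        "resistance": resistance,
        "capacitance": capacitance,
        "rc_delay": rc_delay,
        "current_density": current_density,
    }


for name, value in scaled_interconnect(0.7).items():
    print(f"{name:16s} {value:.2f}")
```

The RC delay of a scaled wire stays constant while gate delay improves by 0.7 per generation, and current density rises by about 43%.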
Interconnect Delay Curves
[Figure: unrepeated wire delay (picoseconds) vs. length (microns) for 0.18 um aluminum (M1 to M5) and 0.13 um copper (M4, M6)]
A 5 mm M4 line in 0.18 um (1.8 ns unrepeated) would scale to 3.5 mm in 0.13 um. Assuming fF/um remains constant but ohms/um doubles, the same wire would take 3.6 ns. Copper takes this to 1.0 ns.
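The 3.6 ns figure follows from the intrinsic wire delay being proportional to (R/um) × (C/um) × length squared. The sketch below applies that proportionality to the 1.8 ns baseline. The quadratic length dependence is the distributed-RC assumption, and the shortened-wire number is a derived estimate, not a value read from the curves. The 1.0 ns copper value comes from the plotted data and is not derived here.

```python
def scaled_wire_delay_ns(base_ns: float, length_ratio: float = 1.0,
                         r_ratio: float = 1.0, c_ratio: float = 1.0) -> float:
    """Scale an unrepeated wire delay, assuming delay ~ (R/um)*(C/um)*length^2."""
    return base_ns * r_ratio * c_ratio * length_ratio ** 2


# Same 5 mm wire with ohms/um doubled and fF/um unchanged.
same_length = scaled_wire_delay_ns(1.8, r_ratio=2.0)

# Wire shortened from 5 mm to 3.5 mm with the same doubled resistance.
shortened = scaled_wire_delay_ns(1.8, length_ratio=3.5 / 5.0, r_ratio=2.0)

print(f"5 mm wire, doubled R/um:   {same_length:.2f} ns")
print(f"3.5 mm wire, doubled R/um: {shortened:.2f} ns")
```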
Repeated Interconnect
[Figure: repeated wire delay (picoseconds) vs. length (microns) for 0.13 um copper, M3 to M6]
The 1 ns delay of the 3.5 mm M4 line can be further reduced to 0.52 ns by adding repeaters.
90 nm & 65 nm Technology Overview

Parameter     90 nm      65 nm      units
Lphysical     60/65      38/44      nm
Wmin          90         65         nm
Tox N/P       2.0/2.0    1.4/1.4    nm
Xj N/P        32/32      24/24      nm

Layer         90 nm      65 nm      units
CONTACT       90         65         nm
VIA1          130        95         nm
VIA2          130        95         nm
VIA3          130        95         nm
VIA4          220        175        nm
VIA5          240        175        nm
VIA6          340        300        nm
VIA7                     300        nm

Layer         90 nm      65 nm      units
POLY w/s      90/130     65/90      nm
M1 w/s        140/140    105/105    nm
M2 w/s        170/170    130/130    nm
M3 w/s        170/170    130/130    nm
M4 w/s        240/240    180/180    nm
M5 w/s        360/360    180/180    nm
M6 w/s        540/540    400/400    nm
M7 w/s        810/810    400/400    nm
90 nm & 65 nm Technology Overview (cont'd)

Parameter     90 nm (Cu)   65 nm        units
Vt0 N/P       0.3/-0.3     0.29/-0.33   volts
Rdsw N/P      300/600      378/748      ohms-um
Cj N/P        2.9/2.9      4.1/4.2      fF/um^2
Cjsw N/P      0.4/0.4      0.34/0.36    fF/um
Cgate N/P     ~1.4         ~1.8         fF/um
Idsat N/P     ~1000/500    ~1380/630    uA/um
Ioff N/P      60/-60       190/-175     nA/um @100C

Layer         90 nm (Cu)   65 nm        units
CONTACT       4.0          8.0          ohms/con
VIA1          3.0          6.0          ohms/con
VIA2          2.4          6.0          ohms/con
VIA3          2.4          6.0          ohms/con
VIA4          1.4          4.5          ohms/con
VIA5          1.0          3.4          ohms/con
VIA6          0.6          2.0          ohms/con
VIA7                       2.0          ohms/con

Wire R & C @ 100C, 65 nm
Layer         R & C            units
M1            1570 & 0.23      mohms/um & fF/um
M2            930 & 0.23       mohms/um & fF/um
M3            930 & 0.22       mohms/um & fF/um
M4            330 & 0.22       mohms/um & fF/um
M5            330 & 0.23       mohms/um & fF/um
M6            330 & 0.23       mohms/um & fF/um
M7            100 & 0.25       mohms/um & fF/um

Wire R & C @ 100C, 90 nm (Cu), six entries in the order listed for M1 to M7:
R: 700, 400, 400, 150, 150, 150 mohms/um
C: 0.23, 0.23, 0.23, 0.22, 0.23, 0.25 fF/um
References
1. A. Chandrakasan, W.J. Bowhill, F. Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, New York, 2001.
2. Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
3. K. Bernstein, et al., High Speed CMOS Design Styles, Kluwer Academic Publishers, Boston, 1999.
4. R.J. Baker, H.W. Li, D.E. Boyce, CMOS Circuit Design, Layout, and Simulation, IEEE Press, New York, 1998.
5. J.M. Rabaey, Digital Integrated Circuits, Prentice-Hall, New Jersey, 1996.
6. S. Thompson, et al., An Enhanced 130nm Generation Logic Technology Featuring 60nm Transistors Optimized for High Performance and Low Power at 0.7-1.4V, 2001 IEDM Technical Digest, pp. 257-260.
7. S. Thompson, et al., 90nm Technology, 2002 IEDM Technical Digest, pp. 61-64.
8. P. Bai, et al., A 65nm Logic Technology Featuring 35nm Gate Length, Enhanced Channel Strain, 8 Cu Interconnect Layers, Low-k ILD and 0.57 um^2 SRAM Cell, 2004 IEDM.
9. Summary of a gazillion processors: http://www-vlsi.stanford.edu/group/chips_micropro.html
10. M. Bohr, Intel's 90nm Process Starting High Volume Manufacturing, http://www.intel.com/research/silicon
11. For latest information on Intel's silicon technology, please visit: http://www.intel.com/technology/silicon/
12. K. Mistry, et al., A 45nm Logic Technology with High-k+ Metal Gate Transistors, Strained Silicon, 9 Cu Interconnect Layers, 193nm Dry Patterning, and 100% Pb-free Packaging, Tech. Digest IEDM, Dec 2007.
BACKUP
Drain Current Models
Ids: Characteristics in the Linear Region
Ids: Characteristics in the Subthreshold Region
Characteristics in the Saturation Region (Long Channel)
Figures from: Y. Taur, T.H. Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, UK, 1998.
Characteristics in the Saturation Region with Velocity Saturation
Interconnect Delay
Intrinsic wire delay ~ $0.5 \times L^2 \times (R/\mu m) \times (C/\mu m)$
Wire Delay Example
In 0.18 um CMOS, assume a 2 mm M2 wire of minimum width (0.34 um) with a 120 fF load at the end.
What is the intrinsic wire delay?
What size driver should you use?
[Figure: FO=4 driver (PFET 93 um, NFET 47 um) driving the wire, modeled as 360 ohms with 220 fF at each end, into the 120 fF load]
Intrinsic delay:
Using a worst case 0.18 ohms/um and 0.22 fF/um:
C = 2000*0.22 = 440 fF and R = 2000*0.18 = 360 ohms.
Intrinsic wire delay = 0.5*RC ~ 79 ps
Driver size (use FO=4 and P:N ratio ~2):
input cap ~ (Cwire+Cload)/4 ~ 560 fF/4 = 140 fF
Using Cox ~1 fF/um, PFET is 93 um and NFET is 47 um.
If the M2 wire is 4 mm long, the intrinsic delay is ~317 ps, 4X longer.
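The arithmetic in this example can be reproduced with a short Python sketch. It follows the numbers in the example (0.18 ohms/um, 0.22 fF/um, FO=4, P:N ~2, ~1 fF/um of gate capacitance). The function names and the way the total width is split between PFET and NFET are illustrative assumptions.

```python
def intrinsic_wire_delay_ps(length_um: float, r_ohm_per_um: float,
                            c_ff_per_um: float) -> float:
    """Intrinsic wire delay = 0.5 * R * C, returned in picoseconds."""
    r_ohms = length_um * r_ohm_per_um
    c_ff = length_um * c_ff_per_um
    return 0.5 * r_ohms * c_ff * 1e-3    # ohm * fF = 1e-15 s = 1e-3 ps


def size_driver_um(c_wire_ff: float, c_load_ff: float, fanout: float = 4.0,
                   pn_ratio: float = 2.0, cgate_ff_per_um: float = 1.0):
    """Return (PFET width, NFET width) in microns for the given fan-out."""
    input_cap_ff = (c_wire_ff + c_load_ff) / fanout
    total_width = input_cap_ff / cgate_ff_per_um
    nfet = total_width / (1.0 + pn_ratio)
    pfet = total_width - nfet
    return pfet, nfet


delay_2mm = intrinsic_wire_delay_ps(2000, 0.18, 0.22)
delay_4mm = intrinsic_wire_delay_ps(4000, 0.18, 0.22)
pfet_um, nfet_um = size_driver_um(c_wire_ff=440.0, c_load_ff=120.0)

print(f"2 mm wire: {delay_2mm:.1f} ps, 4 mm wire: {delay_4mm:.1f} ps")
print(f"Driver: PFET {pfet_um:.0f} um, NFET {nfet_um:.0f} um")
```

Doubling the wire length doubles both R and C, which is why the intrinsic delay grows by 4X.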
Homework Assignment #1.2 & #1.4
A 2-cycle static circuit will be analyzed in a copper, 1.0 V, 90 nm CMOS technology. The maximum frequency of operation (Fmax) and the power dissipation as a function of VDD, with ZERO clock skew, will be established via circuit simulation. In addition, the impact of realistic clock skew on Fmax will be determined. Finally, Fmax will be predicted for a 65 nm CMOS circuit using the constant field scaling law.
EE382M HMK #1.2
[Figure: 90nm CMOS circuit schematic with inputs A0_in and B0_in, clocked master-slave flip-flops (MSFF), logic stages, and output OUT; all dimensions in microns, R in ohms, C in fF]
Clock-Skew Impact to Fmax (HMK #1.4)
[Figure: A0_in and B0_in pass through master and slave latches and logic stages to Out; the GLOBAL clock (GCLK) and GLOBAL enable feed local clock buffers (LCB, with in, en, and out pins) that drive the local clocks]
INVERTING MSFF
[Figure: inverting master-slave flip-flop with Din, Dout, and clock]
