Mpuarch

Microprocessor Architecture

Featuring
8086 to Pentium
C R Sarma
Associate Professor, Dept of ECE
G.Narayanamma Institute of Technology and Science
Intel 8086/8088
Microprocessors

• Intel 8086 and 8088 Microprocessors
are the basis of all IBM-PC compatible
computers
(8086 introduced in 1978, first IBM-PC released in 1981)

• All Intel, AMD and other advanced
microprocessors are based on and
are compatible with the original
8086/8
• At Power Up and Reset time,
Pentiums, Athlons etc all look like
Intel 8086/8088
Microprocessors

• Intel 8086 is a 16b microprocessor:
– 16b data registers, 16b ALU
• Width of external data bus:
– 8086: 16b
– 8088: 8b
• Width of external address bus:
16b+4b=20b
• Some techniques to optimise the CPU
performance when it’s executing programs
• Segment: Offset memory model
• Little-Endian Data Format
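The little-endian format in the last bullet can be illustrated with a short sketch. It uses Python's standard `struct` module; the helper name `to_little_endian_16` and the sample value 0x1234 are illustrative assumptions, not taken from the slides.

```python
import struct


def to_little_endian_16(value: int) -> bytes:
    """Return the two bytes of a 16-bit word in little-endian order."""
    # "<H" means: little-endian, unsigned 16-bit integer.
    return struct.pack("<H", value)


word = 0x1234
stored = to_little_endian_16(word)

# The low byte (0x34) sits at the lower address, the high byte (0x12) follows.
print([hex(b) for b in stored])  # ['0x34', '0x12']

# Reading the bytes back in the same order recovers the original word.
restored = struct.unpack("<H", stored)[0]
print(hex(restored))  # 0x1234
```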
8086/8088 (1)

• Original IBM PC used 8088 microprocessor
• 8088 is similar to the 8086, but it has an
external 8b data bus & only 4B-deep
queue
– For cost reduction reasons
• We can consider 8086 and 8088 together
• PC clones often used 8086 for better
performance
• 8-bit bus reduces performance, but meant
cheaper computers
8086/8088 Functional
Units

[Diagram: the 8086/8088 MPU contains two units]
Bus Interface Unit (BIU): fetches opcodes, reads operands, writes data
Execution Unit (EU)
8086/8088 (3)

• 8086/8088 consists of two internal
units
– The execution unit (EU) - executes the
instructions
– The bus interface unit (BIU) - fetches
instructions, reads operands and writes
results
• The 8086 has a 6B prefetch queue
• The 8088 has a 4B prefetch queue
8086 Block Diagram
BIU Elements
• Instruction Queue: the next instructions or data
can be fetched from memory while the processor
is executing the current instruction
– The memory interface is slower than the processor
execution time so this speeds up overall performance 
• Segment Registers:
– CS, DS, SS and ES are 16b registers
– Combined with a 16b offset (from IP or a base, index or
pointer register) to generate the 20b address
– Allow the 8086/8088 to address 1MB of memory
– Changed under program control to point to different
segments as a program executes
• Instruction Pointer (IP) contains the Offset
Address of the next instruction, the distance in
bytes from the address given by the current CS
8086/8088 20-bit
Addresses
CS: 16-bit Segment Base Address, with 0000 (four zero bits) appended
IP: 16-bit Offset Address
Sum of the two: 20-bit Physical Address
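
The address formation shown above (segment shifted left by four bits, then added to the offset) can be sketched as follows. The function name `physical_address` and the sample register values are illustrative assumptions; the wrap to 20 bits reflects the 8086/8088 having only 20 address lines.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Appending four zero bits to the segment is a left shift by 4 (x16).
    # The mask keeps the result within the 20 address lines.
    return ((segment << 4) + offset) & 0xFFFFF


# Hypothetical register contents: CS = 0x1234, IP = 0x5678.
print(hex(physical_address(0x1234, 0x5678)))  # 0x179b8

# The 8086 reset address, CS = 0xFFFF and IP = 0x0000.
print(hex(physical_address(0xFFFF, 0x0000)))  # 0xffff0
```

Different segment:offset pairs can name the same physical byte: 0x1234:0x5678 and 0x179B:0x0008 both give 0x179B8.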


8086/8 In Circuit (1)

• 8086/8 microprocessors need support circuits
in a microcomputer system
• 8086/8 multiplex the address and data buses
on the same pins
• This saves pins but at a price:
– Demultiplexing logic is needed to build up separate
address and data buses to interface with RAMs and
ROMs
– Affects the speed due to multiplexing
8086 and 8088 pin assignments (40-pin package)

Pins 1-20:
Pin   8086    8088
1     GND     GND
2     AD14    A14
3     AD13    A13
4     AD12    A12
5     AD11    A11
6     AD10    A10
7     AD9     A9
8     AD8     A8
9     AD7     AD7
10    AD6     AD6
11    AD5     AD5
12    AD4     AD4
13    AD3     AD3
14    AD2     AD2
15    AD1     AD1
16    AD0     AD0
17    NMI     NMI
18    INTR    INTR
19    CLK     CLK
20    GND     GND

Pins 40-21 (where a pin depends on the mode, the MAXIMUM MODE
function is listed first and the MINIMUM MODE function follows in
parentheses):
Pin   8086                8088
40    Vcc                 Vcc
39    AD15                A15
38    A16,S3              A16,S3
37    A17,S4              A17,S4
36    A18,S5              A18,S5
35    A19,S6              A19,S6
34    /BHE,S7             high (/SS0)
33    MN,/MX              MN,/MX
32    /RD                 /RD
31    /RQ,/GT0 (HOLD)     /RQ,/GT0 (HOLD)
30    /RQ,/GT1 (HLDA)     /RQ,/GT1 (HLDA)
29    /LOCK (/WR)         /LOCK (/WR)
28    /S2 (M/IO)          /S2 (IO/M)
27    /S1 (DT/R)          /S1 (DT/R)
26    /S0 (/DEN)          /S0 (/DEN)
25    QS0 (ALE)           QS0 (ALE)
24    QS1 (/INTA)         QS1 (/INTA)
23    /TEST               /TEST
22    READY               READY
21    RESET               RESET
i8086 Circuit - Maximum Mode

[Figure: 8086 maximum-mode circuit. Blocks shown: 8284A clock
generator (RDY in; CLK, READY, RESET out), 8086 CPU (MN/MX#,
S0#-S2#, BHE#, A19:A16, AD15:AD0, INTR), 8288 bus controller
(MRDC#, MWTC#, AMWC#, IORC#, IOWC#, AIOWC#, INTA#, DEN, DT/R#,
ALE), 74LS373 latches x3 (LE, OE#) giving A19:A0 and BHE#, and
74LS245 transceivers x2 (DIR, EN#) giving D15:D0.]
8086/8088 Summary

• First Generation (introduced June
1978)
• One of the first 16b processors on the
market
• 16b internal registers
• 16/8b external data bus
• 20b address bus (1MB addressable)
• Used in 1st generation IBM PCs (1981)
80186/80188

• Evolution of 8086/8088 
80186/80188
• Increased instruction set
• On-chip system components (Clock
generator, DMA, Interrupt, Timers…)
• Unsuccessful in PCs
• Popular in embedded systems…
2nd Generation Processor: 286
• P2 (286) = 2nd Generation Processor
• Introduced in 1982
• CPU behind IBM AT
• Throughput of original IBM AT (6MHz) was
about 500% of IBM PC (4.77MHz)
• Level of integration: 134k transistors (vs
29k in 8086)
• Still a 16b processor…
• Later available in higher clock frequencies:
up to 25MHz
2nd Generation Processors: 286
• Fully backwards compatible to 8086
80286 runs 8086 software without modification
• Improved instruction execution
Average instruction takes 4.5 cycles vs. 12 cycles (8086)
• Improved instruction set
• Real mode and Protected Mode
Multitasking-support. What happens in one area of memory doesn’t
affect other programs. Protected mode supported by Windows 3.0.
• 16MB addressable physical memory
• On-chip MMU (1GB virtual memory)
• Non-multiplexed address-bus and data-bus
Improving Computer
Performance

• We’ve seen how 16b computer
technology based on the 8086 and
80286 processors developed
• These computers are not powerful
enough for today’s applications
• How do you improve the
performance of your computer?
• Let’s start with the CPU
CPU Performance (1)

• MOST OBVIOUS: Processor Clock
Frequency
• Increased frequency – increased
execution rate
• State of the Art: >4GHz (03/2005)
• Memory and I/O access times can be
performance bottleneck – unless you
take some special measures
CPU Performance (2)

• ALU register width
– A processor is an N-bit processor, where N represents
the precision of the ALU; N can be 4, 8, 16, 32, or 64
– The wider the registers – the more processing per
clock
• Data bus width
– The wider the data bus the faster we can transfer data
– Since the memory and I/O device access times are
finite, the more bits transferred per cycle the better
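The effect of data bus width on transfer count can be sketched with a small calculation. It is a simplified model under stated assumptions: every bus cycle moves one full bus width of aligned data, and the function name `bus_transfers` and the 64-byte block size are chosen only for illustration.

```python
def bus_transfers(num_bytes: int, bus_width_bits: int) -> int:
    """Number of bus transfers needed to move num_bytes over the bus.

    Assumes aligned data and one full-width transfer per bus cycle.
    """
    bytes_per_transfer = bus_width_bits // 8
    # Ceiling division: a partly filled final transfer still costs a cycle.
    return -(-num_bytes // bytes_per_transfer)


block = 64  # bytes to move (illustrative value)
for width in (8, 16, 32, 64):
    print(f"{width:2d}-bit bus: {bus_transfers(block, width)} transfers")
```

With these assumptions, each doubling of the bus width halves the number of transfers for the same block.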
CPU Performance (3)

• Address bus width
• Increased address width doesn’t provide a
‘speed’ increase as such
• CPU can directly address more memory
• PCs use big programs, which would not fit in a
smaller address space
• Overcoming small address space takes time
– Impacts on overall system performance
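The link between address bus width and directly addressable memory is a power of two, which the following sketch makes explicit. The helper names are illustrative assumptions; the 24-bit case is inferred from the 16MB figure quoted for the 80286, and the 32-bit case is included as a further example.

```python
def addressable_bytes(address_bits: int) -> int:
    """Bytes a CPU can address directly with the given number of address lines."""
    return 2 ** address_bits


def format_size(num_bytes: int) -> str:
    """Format a power-of-two byte count using 1024-based units."""
    units = ["B", "KB", "MB", "GB", "TB"]
    index = 0
    while num_bytes >= 1024 and index < len(units) - 1:
        num_bytes //= 1024
        index += 1
    return f"{num_bytes}{units[index]}"


for bits in (20, 24, 32):
    print(f"{bits}-bit address bus: {format_size(addressable_bytes(bits))}")
# 20-bit address bus: 1MB
# 24-bit address bus: 16MB
# 32-bit address bus: 4GB
```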
3rd Generation Processor: 386
• P3 (386) = 3rd Generation Processor
• Introduced: 10/1985
• Full 32b processor
(32b registers. 32b internal and external databus. 32b address bus)
• 275k transistors. CMOS. 132-pin PGA
package.
(Supply current Icc=400mA. Roughly the same as 8086 !)
• Clock speeds: 16-33MHz
• P3 processors were far ahead of their time:
It took 10 years before 32b operating systems became mainstream!
• First 386 PCs early 1987
(COMPAQ)
3rd Generation Processor: 386
• Modes of operation:
– Real. Protected. Virtual Real.
• Protected mode of 386 is fully
compatible with 286
Protected mode=native mode of operation. Chips are
designed for advanced operating systems such as Windows
NT
• New virtual real mode
Processor can run with hardware memory protection while
simulating the 8086’s real-mode operation. Multiple copies of
e.g. DOS can run simultaneously, each in a protected area of
memory. If a program in one memory area crashes, the rest
of the system is protected.
80386 Features

• 32b general and offset registers
• 16B prefetch queue
• Memory management unit with segmentation unit and
paging unit
• 32b address and data bus
• 4GB physical address space
• 64TB virtual address space
• i387 numerical coprocessor
• Implementation of real, protected and virtual 8086 modes
80386 Operating Modes

• Protected Mode for Multitasking support
• Real Mode (native 8086 mode)
– Processor powers up in Real Mode
• System Management Mode
– Power management or system security
– Processor switches to separate address space, while
saving the entire context of the currently running
program or task
Intel 32-bit 80386
[Block diagram. Blocks shown: Bus Unit (BU) with Prefetch Queue,
Addressing Unit (AU), Instruction Unit (IU), and Execution Unit (EU)
containing the ALU, Control Unit (CU) and Registers. External
connections: Address, Data.]

The 80386 includes a Bus Interface Unit for reading and providing data and instructions,
with a Prefetch Queue, an IU for controlling the EU with its registers, as well as an AU for
generating memory and I/O addresses.
80386 Register Set

Instruction Pointer and EFLAG Register (bits 31-0):
32-bit   Bits 15-0
EIP      IP
EFLAG    FLAG

General-Purpose Registers (bits 31-0):
32-bit   Bits 15-8   Bits 7-0
EAX      AH          AL
EBX      BH          BL
ECX      CH          CL
EDX      DH          DL

32-bit   Bits 15-0
ESI      SI
EDI      DI
EBP      BP
ESP      SP

Segment Registers (bits 15-0):
CS, SS, DS, ES, FS, GS
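
How the 8-bit and 16-bit names overlay a 32-bit general-purpose register can be sketched with bit masks. This is only a model of the register layout: the function name `sub_registers` and the sample value are illustrative assumptions, and AX denotes the 16-bit register formed by AH and AL.

```python
def sub_registers(eax: int) -> dict:
    """Split a 32-bit EAX value into the AX, AH and AL views of it."""
    eax &= 0xFFFFFFFF          # keep 32 bits
    ax = eax & 0xFFFF          # bits 15-0
    ah = (ax >> 8) & 0xFF      # bits 15-8
    al = ax & 0xFF             # bits 7-0
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


regs = sub_registers(0x12345678)  # illustrative value
for name, value in regs.items():
    print(f"{name} = {value:#x}")
# EAX = 0x12345678
# AX = 0x5678
# AH = 0x56
# AL = 0x78
```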
Coprocessor: i387

• The hardware implementation of floating
point processing in the i387 means floating
point operations run at much higher speed.
• The i386 can execute all mathematical
expressions using software emulation of
the i387.
4th generation 80486
Block Diagram

[Block diagram of the i486 CPU. Blocks shown: Bus Interface (A31-A0,
D31-D0, Control and Status Signals), Cache (8K bytes), Paging Unit,
Segmentation Unit, Register and ALU, Control Unit, Floating Point
Unit (built-in co-processor), Prefetcher (32-byte queue), Decoding
Unit.]
5th Gen. Processor: Pentium
• Pentium = P5 (586) = 5th Generation
Processor
(trademarking a number designation not possible)
• Introduced: 03/1993
(Pentium-PCs followed a few months later)
• Superscalar technology
(2 instruction pipelines for execution of up to 2 instructions per
clock cycle)
• Branch prediction
(to avoid flushing the instruction queue and pipeline at branch-
taken event)
• Internal 8kB caches for code and data
(but external L2 cache)
• Address bus: 32b. External data bus: 64b
But not a 64-bit processor! Internal data paths up to 256b wide
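The idea behind the branch prediction bullet above can be illustrated with a two-bit saturating counter, a common textbook scheme. This is a generic sketch and is not claimed to match the Pentium's actual predictor; the class name, the starting state and the loop pattern are all illustrative assumptions.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 taken."""

    def __init__(self, state: int = 0):
        self.state = state  # 0 = strongly not taken ... 3 = strongly taken

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


# A loop branch taken 9 times and then not taken once, run twice.
outcomes = ([True] * 9 + [False]) * 2

predictor = TwoBitPredictor()
correct = 0
for taken in outcomes:
    if predictor.predict() == taken:
        correct += 1
    predictor.update(taken)

print(f"{correct} of {len(outcomes)} branches predicted correctly")
```

Each correct prediction lets the pipeline keep the instructions it has already fetched; a wrong one means they must be discarded and fetching restarts at the right address.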
5th Gen. Processor: Pentium
• Pipelined FPU
(2..10 times faster than 486 FPU. FDIV bug! Free
replacement…)
962,306,957,033 / 11,010,046 = 87,402.6282027341 (correct answer)
962,306,957,033 / 11,010,046 = 87,399.5805831329 (flawed Pentium)

• Burst-mode bus cycles
(fast data transfer from memory to cache)
• >3M transistors. BiCMOS. 0.8µm..0.35µm.
• Supply voltages: 5V..2.9V
• Packages: PGA273 and SPGA296
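The FDIV example quoted on this slide can be checked with ordinary double-precision arithmetic. The sketch below recomputes the quotient and measures how far the flawed value from the slide is from it; the variable names are illustrative.

```python
numerator = 962_306_957_033
denominator = 11_010_046

correct = numerator / denominator   # IEEE 754 double-precision division
flawed = 87_399.5805831329          # flawed Pentium result quoted on the slide

abs_error = correct - flawed
rel_error = abs_error / correct

print(f"correct quotient: {correct:.10f}")
print(f"absolute error:   {abs_error:.4f}")
print(f"relative error:   {rel_error:.2e}")
```

The flawed result is off by about 3 in the units place, a relative error of roughly 3.5 parts in 100,000, far larger than normal floating-point rounding error.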
5th Gen. Processor: Pentium
• Clock speeds: 60-266MHz
• Clock multiplier circuitry
Processor runs faster than the system bus. Motherboard bus
speeds 50, 60, 66MHz.

• System management mode (SMM)
(full control over power management features)
Pentium Block Diagram
[Block diagram of the Pentium. Blocks shown: Bus Interface (64-bit
Data bus, 32-bit Address bus), Data Cache (8 Kbytes) with TLB, Code
Cache (8 Kbytes) with TLB, BTB, Prefetch Buffer, Instruction Decode,
Microcode ROM, Control Unit, u pipeline and v pipeline, Floating
Point Pipeline, Register.]
