Unit 2
Processor Basics
CPU Organization
- Fundamentals
- External Communication
- User and Supervisor mode
- CPU Operation
- Accumulator Based CPU
[Figure: instruction cycle flowchart. Fetch Cycle -> Decode Cycle (Decode Instruction) -> if operand(s) required: Yes -> Operand Fetch Cycle (Fetch Operand(s)), No -> continue -> Execute Cycle (Execute Instruction) -> Stop]
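[Figure: single-bus CPU organization. PC, MAR, MDR, IR, Y, TEMP and general registers R0 to R(n-1) on the internal processor bus; MAR drives the memory bus address lines and MDR the data lines; the instruction decoder and control logic generate the control signals; a MUX (Select) chooses between a constant value and Y for ALU input A; the ALU has a Carry In input]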
Instruction Execution
The task performed by any microoperation falls in one of the following categories:
- Transfer data from one register to another;
- Transfer data from a register to an external interface (system bus);
- Transfer data from an external interface to a register;
- Perform an arithmetic or logic operation, using registers for input and output.
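[Figure: input and output gating on the internal processor bus. Register Ri with signals Riin and Riout; register Y with Yin; register Z with Zin and Zout; a MUX (Select) chooses between a constant and Y for ALU input A, with the bus on ALU input B]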
Control Signals
The CPU executes an instruction as a sequence of control steps. In each control step one or several microoperations are executed.
One clock pulse triggers the activities corresponding to one control step. For each clock pulse, the control unit generates the control signals corresponding to the microoperations to be executed in that control step.
Control Signals
In order to allow the execution of a microoperation, one or several control signals have to be issued; they allow the corresponding data transfer and/or computation to be performed.
Examples:
a) signals for transferring content of register R0 to R1: R0out, R1in
b) signals for adding content of Y to that of R0 (result in Z): R0out, Add, Zin
c) signals for reading a memory location; address in R3:
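Examples a) and b) can be traced with a small simulation. This is only a sketch: the register values are made up, and the rule "an Xout signal drives the bus, an Xin signal loads from it, Z loads from the ALU" is a simplifying assumption about the single-bus datapath.

```python
# Hypothetical register contents, chosen only for illustration.
registers = {"R0": 7, "R1": 0, "Y": 5, "Z": 0}

def control_step(signals):
    """Carry out one control step for a list of asserted control signals."""
    bus = None
    # A signal named 'Xout' gates register X onto the internal bus.
    for sig in signals:
        if sig.endswith("out"):
            bus = registers[sig[:-3]]
    # 'Add' makes the ALU add Y to the value on the bus; otherwise it passes the bus through.
    alu_result = registers["Y"] + bus if "Add" in signals else bus
    # A signal named 'Xin' loads register X; Z is fed by the ALU, other registers by the bus.
    for sig in signals:
        if sig.endswith("in"):
            name = sig[:-2]
            registers[name] = alu_result if name == "Z" else bus
    return registers

control_step(["R0out", "R1in"])        # example a): R1 <- R0
control_step(["R0out", "Add", "Zin"])  # example b): Z <- Y + R0
```

Each call corresponds to one clock pulse, that is, one control step.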
[Figure: three-bus organization of the datapath. Register file, PC, constant 4, MUX, ALU with inputs A and B and result R, instruction decoder, IR, MDR, MAR, memory bus data lines and address lines]
Example: ADD R4, R5, R6
Control sequence:
1. PCout, R=B, MARin, Read, IncPC
2. WMFC
3. MDRoutB, R=B, IRin
4. R4outA, R5outB, Select-A, Add, R6in, End
Description of each step:
1. Contents of PC are passed through the ALU and loaded into MAR to start a memory read. The PC is then incremented by 4 and the incremented value is loaded into PC.
2. The processor waits for MFC and loads the data received into MDR.
3. The contents of MDR are transferred to IR.
4. Execution phase.
External Communication
[Figure: the CPU communicates with main memory through a cache memory]
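[Figure: accumulator-based CPU. Program control unit (PCU) with instruction decoder, IR and PC, generating the control signals; address register AR and data register DR connected to the system bus (to M and IO devices); accumulator AC]
[Figure: ALU input selection. Two MUXes (the second selected by SELB) drive the A bus and the B bus into the ALU, which produces the output]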
ALU CONTROL
Encoding of ALU operations
OPR Select   Operation         Symbol
00000        Transfer A        TSFA
00001        Increment A       INCA
00010        ADD A + B         ADD
00101        Subtract A - B    SUB
00110        Decrement A       DECA
01000        AND A and B       AND
01010        OR A and B        OR
01100        XOR A and B       XOR
01110        Complement A      COMA
10000        Shift right A     SHRA
11000        Shift left A      SHLA
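The encoding above can be read as a lookup from the 5-bit OPR field to an operation. The sketch below assumes 8-bit operands (the slides do not state a width) and treats the shifts as plain one-bit shifts.

```python
MASK = 0xFF  # assumption: 8-bit data path

# OPR select code -> (symbol, operation on inputs A and B)
ALU_OPS = {
    0b00000: ("TSFA", lambda a, b: a),
    0b00001: ("INCA", lambda a, b: a + 1),
    0b00010: ("ADD",  lambda a, b: a + b),
    0b00101: ("SUB",  lambda a, b: a - b),
    0b00110: ("DECA", lambda a, b: a - 1),
    0b01000: ("AND",  lambda a, b: a & b),
    0b01010: ("OR",   lambda a, b: a | b),
    0b01100: ("XOR",  lambda a, b: a ^ b),
    0b01110: ("COMA", lambda a, b: ~a),
    0b10000: ("SHRA", lambda a, b: a >> 1),
    0b11000: ("SHLA", lambda a, b: a << 1),
}

def alu(opr, a, b=0):
    """Return (symbol, result) for the given OPR code, truncated to 8 bits."""
    symbol, operation = ALU_OPS[opr]
    return symbol, operation(a, b) & MASK
```

For instance, `alu(0b00010, 3, 4)` selects ADD, and `alu(0b01110, 0x0F)` complements A.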
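[Figure: register stack of 64 words (addresses 0 to 63) with a 6-bit stack pointer SP, data register DR, and items A, B, C on the stack; flags FULL and EMPTY]
PUSH:
SP <- SP + 1
M[SP] <- DR
If (SP = 0) then (FULL <- 1)
EMPTY <- 0
POP:
DR <- M[SP]
SP <- SP - 1
If (SP = 0) then (EMPTY <- 1)
FULL <- 0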
Stack Organization
[Figure: memory stack. PC, AR and SP address the memory; stack locations 3997 to 4001 are shown; the stack grows toward lower addresses]
- A portion of memory is used as a stack with a processor register as a stack pointer
- PUSH: SP <- SP - 1
        M[SP] <- DR
- POP:  DR <- M[SP]
        SP <- SP + 1
- Most computers do not provide hardware to check stack overflow (full stack) or underflow (empty stack), so these checks must be done in software
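A minimal sketch of the memory stack with the overflow and underflow checks done in software. The base address 4001 follows the figure; the lower limit 3000 and the class name `MemoryStack` are assumptions made for the example.

```python
class MemoryStack:
    """Stack kept in memory; SP moves toward lower addresses on PUSH."""

    def __init__(self, base=4001, limit=3000):
        self.memory = {}
        self.base = base      # SP value when the stack is empty
        self.limit = limit    # hypothetical lowest address the stack may use
        self.sp = base

    def push(self, dr):
        if self.sp - 1 < self.limit:          # software overflow check
            raise OverflowError("stack overflow")
        self.sp -= 1                          # SP <- SP - 1
        self.memory[self.sp] = dr             # M[SP] <- DR

    def pop(self):
        if self.sp >= self.base:              # software underflow check
            raise IndexError("stack underflow")
        dr = self.memory[self.sp]             # DR <- M[SP]
        self.sp += 1                          # SP <- SP + 1
        return dr
```

Pushing two items moves SP from 4001 to 3999; popping them returns the items in reverse order and restores SP.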
EXAMPLE PROGRAM
(A+B)(C+D)
Accumulator Based CPU
LOAD A
ADD B
STORE T
LOAD C
ADD D
MUL T
STORE X
Instruction Format
Two-Address Instructions
Program to evaluate X = (A + B) * (C + D):
MOV R1, A     /* R1 <- M[A]        */
ADD R1, B     /* R1 <- R1 + M[B]   */
MOV R2, C     /* R2 <- M[C]        */
ADD R2, D     /* R2 <- R2 + M[D]   */
MUL R1, R2    /* R1 <- R1 * R2     */
MOV X, R1     /* M[X] <- R1        */
Instruction Format
Zero-Address Instructions
SOLUTIONS
Expression: A*B+C*D+E*F
Postfix: AB*CD*EF*++
0 Address
PUSH A
PUSH B
MUL
PUSH C
PUSH D
MUL
PUSH E
PUSH F
MUL
ADD
ADD
POP X
1 Address
LOAD A
MUL B
STORE T
LOAD C
MUL D
ADD T
STORE T
LOAD E
MUL F
ADD T
STORE X
2 Address
MOV R1, A
MUL R1, B
MOV R2, C
MUL R2, D
ADD R1, R2
MOV R2, E
MUL R2, F
ADD R1, R2
MOV X, R1
3 Address
(DEST, SRC, SRC)
MUL R1, A, B
MUL R2, C, D
ADD R1, R1, R2
MUL R2, E, F
ADD X, R1, R2
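The zero-address program for A*B+C*D+E*F can be checked with a tiny stack-machine interpreter. This is a sketch only: the function name `run_zero_address` and the operand values are invented for the example.

```python
def run_zero_address(program, memory):
    """Run PUSH/POP/ADD/SUB/MUL/DIV instructions, one per line, on a stack."""
    operations = {
        "ADD": lambda x, y: x + y,
        "SUB": lambda x, y: x - y,
        "MUL": lambda x, y: x * y,
        "DIV": lambda x, y: x / y,
    }
    stack = []
    for line in program.strip().splitlines():
        parts = line.split()
        if parts[0] == "PUSH":
            stack.append(memory[parts[1]])
        elif parts[0] == "POP":
            memory[parts[1]] = stack.pop()
        else:
            y = stack.pop()   # top of stack is the second operand
            x = stack.pop()
            stack.append(operations[parts[0]](x, y))
    return memory

PROGRAM = """
PUSH A
PUSH B
MUL
PUSH C
PUSH D
MUL
PUSH E
PUSH F
MUL
ADD
ADD
POP X
"""

# Arbitrary sample values: 1*2 + 3*4 + 5*6
result = run_zero_address(PROGRAM, {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6})
```

The instruction order is exactly the postfix string read left to right: each operand becomes a PUSH and each operator becomes an operation on the two topmost stack entries.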
SOLUTIONS
Expression: A*B+A*(B*D+C*E)
Postfix: AB*ABD*CE*+*+
0 Address
PUSH A
PUSH B
MUL
PUSH A
PUSH B
PUSH D
MUL
PUSH C
PUSH E
MUL
ADD
MUL
ADD
POP X
1 Address
LOAD C
MUL E
STORE T
LOAD B
MUL D
ADD T
MUL A
STORE T
LOAD A
MUL B
ADD T
STORE X
2 Address
MOV R1, C
MUL R1, E
MOV R2, B
MUL R2, D
ADD R1, R2
MUL R1, A
MOV R2, A
MUL R2, B
ADD R1, R2
MOV X, R1
3 Address
(DEST, SRC, SRC)
MUL R1, C, E
MUL R2, B, D
ADD R1, R1, R2
MUL R1, R1, A
MUL R2, A, B
ADD X, R1, R2
SOLUTIONS
[A*[B+C*(D+E)]] / [F*(G+H)]
ABCDE+*+*FGH+*/
0 Address
PUSH A
PUSH B
PUSH C
PUSH D
PUSH E
ADD
MUL
ADD
MUL
PUSH F
PUSH G
PUSH H
ADD
MUL
DIV
POP X
1 Address
LOAD H
ADD G
MUL F
STORE T
LOAD E
ADD D
MUL C
ADD B
MUL A
DIV T
STORE X
2 Address
MOV R1, E
ADD R1, D
MUL R1, C
ADD R1, B
MUL R1, A
MOV R2, H
ADD R2, G
MUL R2, F
DIV R1, R2
MOV X, R1
3 Address
(DEST, SRC, SRC)
ADD R1, D, E
MUL R1, R1, C
ADD R1, R1, B
MUL R1, R1, A
ADD R2, G, H
MUL R2, R2, F
DIV X, R1, R2
SOLUTIONS
Expression: (A+B*C)/(D-E*F+G*H)
Postfix: ABC*+DEF*-GH*+/
0 Address
PUSH A
PUSH B
PUSH C
MUL
ADD
PUSH D
PUSH E
PUSH F
MUL
SUB
PUSH G
PUSH H
MUL
ADD
DIV
POP X
1 Address
LOAD E
MUL F
STORE T
LOAD D
SUB T
STORE T
LOAD G
MUL H
ADD T
STORE T
LOAD B
MUL C
ADD A
DIV T
STORE X
2 Address
MOV R2, E
MUL R2, F
MOV R1, D
SUB R1, R2
MOV R2, G
MUL R2, H
ADD R1, R2
MOV R2, B
MUL R2, C
ADD R2, A
DIV R2, R1
MOV X, R2
3 Address
(DEST, SRC, SRC)
MUL R1, E, F
SUB R2, D, R1
MUL R1, G, H
ADD R2, R1, R2
MUL R1, B, C
ADD R1, R1, A
DIV X, R1, R2
SOLUTIONS
Expression: A+B*[C*D+E*(F+G)]
Postfix: ABCD*EFG+*+*+
0 Address
PUSH A
PUSH B
PUSH C
PUSH D
MUL
PUSH E
PUSH F
PUSH G
ADD
MUL
ADD
MUL
ADD
POP X
1 Address
LOAD G
ADD F
MUL E
STORE T
LOAD C
MUL D
ADD T
MUL B
ADD A
STORE X
2 Address
MOV R1, F
ADD R1, G
MUL R1, E
MOV R2, C
MUL R2, D
ADD R1, R2
MUL R1, B
ADD R1, A
MOV X, R1
3 Address
(DEST, SRC, SRC)
ADD R1, F, G
MUL R1, R1, E
MUL R2, C, D
ADD R1, R1, R2
MUL R1, R1, B
ADD X, R1, A
Addressing Modes
An addressing mode specifies a rule for interpreting or modifying the address field of the instruction (before the operand is actually referenced).
A variety of addressing modes is provided:
- to give programming flexibility to the user (pointers, counters)
- to use the bits in the address field of the instruction efficiently
Instruction Cycle
- Fetch instruction from memory
- Decode instruction
- Execute instruction
Addressing Mode Specification
- Distinct binary code for the mode
- Single binary code for both mode and operation
Instruction format: OPCODE | MODE | ADDRESS (0, 1, 2 or 3 address fields)
Addressing Modes
Immediate Mode
Instead of specifying the address of the operand, the operand itself is specified
- No need to specify an address in the instruction
- However, the operand itself needs to be specified
- Sometimes requires more bits than the address
- Fast to acquire an operand
- Useful for initializing registers
Addressing Modes
Register Indirect Mode
The instruction specifies a register which contains the memory address of the operand
- Saves instruction bits, since a register address is shorter than a memory address
- EA = [IR(R)] ([x]: content of x)
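ADDRESSING MODES - EXAMPLES
PC = 200, R1 = 400, XR = 100. The address field of the instruction is 500 and the operand is loaded into AC.

Memory
Address   Content
399       450
400       700
500       800
600       900
702       325
800       300

Addressing Mode      Effective Address   Operation                Content of AC
Direct address       500                 /* AC <- (500)    */     800
Immediate operand    -                   /* AC <- 500      */     500
Indirect address     800                 /* AC <- ((500))  */     300
Relative address     702                 /* AC <- (PC+500) */     325
Indexed address      600                 /* AC <- (XR+500) */     900
Register             -                   /* AC <- R1       */     400
Register indirect    400                 /* AC <- (R1)     */     700
Autoincrement        400                 /* AC <- (R1)+    */     700
Autodecrement        399                 /* AC <- -(R1)    */     450

For the relative mode, PC has already advanced to 202 (past the two-word instruction at 200 and 201) when the effective address is computed, so PC + 500 = 702.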
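The effective-address rules in this example can be written out directly. The sketch assumes the memory contents pair up as address 399 holds 450, 400 holds 700, 500 holds 800, 600 holds 900, 702 holds 325 and 800 holds 300, and that PC has advanced to 202 by the time a relative address is formed (which is what makes PC + 500 equal 702).

```python
memory = {399: 450, 400: 700, 500: 800, 600: 900, 702: 325, 800: 300}
ADDRESS = 500   # address field of the instruction
PC = 202        # assumption: PC already points past the two-word instruction at 200
R1 = 400
XR = 100

def operand(mode):
    """Return (effective_address, value loaded into AC); EA is None if no memory operand is used."""
    if mode == "immediate":
        return None, ADDRESS              # the address field is the operand itself
    if mode == "register":
        return None, R1                   # the operand is in the register
    if mode == "direct":
        ea = ADDRESS
    elif mode == "indirect":
        ea = memory[ADDRESS]              # address field points to the address
    elif mode == "relative":
        ea = PC + ADDRESS
    elif mode == "indexed":
        ea = XR + ADDRESS
    elif mode in ("register indirect", "autoincrement"):
        ea = R1                           # autoincrement bumps R1 after the access
    elif mode == "autodecrement":
        ea = R1 - 1                       # R1 is decremented before the access
    else:
        raise ValueError(f"unknown mode: {mode}")
    return ea, memory[ea]
```

Calling `operand("indirect")` follows two memory references: the first fetches the address 800 from location 500, the second fetches the operand from location 800.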
Shift Instructions
Name                        Mnemonic
Logical shift right         SHR
Logical shift left          SHL
Arithmetic shift right      SHRA
Arithmetic shift left       SHLA
Rotate right                ROR
Rotate left                 ROL
Rotate right thru carry     RORC
Rotate left thru carry      ROLC
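The difference between these shifts is easiest to see on a concrete bit pattern. The sketch below assumes an 8-bit register and the usual textbook definitions (arithmetic right shift keeps the sign bit; arithmetic left shift reports overflow when the sign bit changes); the function names simply mirror the mnemonics.

```python
WIDTH = 8                      # assumption: 8-bit register
MASK = (1 << WIDTH) - 1
MSB = 1 << (WIDTH - 1)

def shr(x):
    return (x & MASK) >> 1                          # 0 enters at the left

def shl(x):
    return (x << 1) & MASK                          # 0 enters at the right

def shra(x):
    return ((x & MASK) >> 1) | (x & MSB)            # sign bit is preserved

def shla(x):
    result = (x << 1) & MASK
    overflow = bool((x ^ result) & MSB)             # sign bit changed
    return result, overflow

def ror(x):
    return ((x & MASK) >> 1) | ((x & 1) << (WIDTH - 1))

def rol(x):
    return ((x << 1) & MASK) | ((x & MSB) >> (WIDTH - 1))

def rorc(x, carry):
    # The old carry enters at the left; the bit shifted out becomes the new carry.
    return ((x & MASK) >> 1) | (carry << (WIDTH - 1)), x & 1

def rolc(x, carry):
    # The old carry enters at the right; the bit shifted out becomes the new carry.
    return ((x << 1) & MASK) | carry, (x >> (WIDTH - 1)) & 1
```

With x = 1001 0011, `shr` gives 0100 1001 while `shra` and `ror` both give 1100 1001, for different reasons: one copies the sign bit, the other wraps the low bit around.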
CISC
68020 Microprocessor
RISC
CHARACTERISTICS OF RISC
- Relatively few instructions
- Relatively few addressing modes
- Memory access limited to load and store instructions
- All operations done within the registers of the CPU
- Fixed-length, easily decoded instruction format
- Single-cycle instruction execution
- Hardwired rather than microprogrammed control
Advantages of RISC
- VLSI Realization
- Computing Speed
- Design Costs and Reliability
- High Level Language Support
VLSI Realization
Control area is considerably reduced.
Example (share of chip area taken by control):
- RISC I: 6%
- RISC II: 10%
- MC68020: 68%
- general CISCs: ~50%
Computing Speed
- Simpler, smaller control unit -> faster
- Simpler instruction set, addressing modes and instruction format -> faster decoding
- Register operation -> faster than memory operation
- Register window -> enhances the overall speed of execution
- Identical instruction length, one-cycle instruction execution -> suitable for pipelining -> faster
Design Costs and Reliability
- Shorter time to design -> reduction in the overall design cost, and less risk that the end product will be obsolete by the time the design is completed
- Simpler, smaller control unit -> higher reliability
- Simple instruction format (of fixed length) -> ease of virtual memory management
FLOATING POINT
IEEE 754
Floating-Point
What can be represented in N bits?
Representation     Range in N bits                      Example (8 bits)
Unsigned           0 to 2^N - 1                         0 to 255
2s Complement      -2^(N-1) to 2^(N-1) - 1              -128 to 127
1s Complement      -2^(N-1) + 1 to 2^(N-1) - 1          -127 to 127
BCD                0 to 10^(N/4) - 1                    0 to 99
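The ranges for N bits follow from the standard formulas for each representation; a short sketch (the function name `ranges` is made up, and BCD is assumed to use one decimal digit per 4 bits):

```python
def ranges(n):
    """Smallest and largest value representable in n bits for each scheme."""
    return {
        "unsigned":        (0, 2**n - 1),
        "twos_complement": (-(2**(n - 1)), 2**(n - 1) - 1),
        "ones_complement": (-(2**(n - 1)) + 1, 2**(n - 1) - 1),
        "bcd":             (0, 10**(n // 4) - 1),   # n // 4 decimal digits
    }

eight_bit = ranges(8)
```

None of these schemes can hold a 28-digit integer or the fraction 2/3 in a fixed number of bits, which is the motivation for floating point.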
Floating-Point
What about?
Very large numbers?
9,349,398,989,787,762,244,859,087,678
Rational numbers
2/3
...
We need a system to represent numbers in which the range of expressible numbers is independent of the number of significant digits
Floating-Point
Examples of real numbers:
The fixed-point mantissa may be a fraction or an integer. Example: the decimal number +6132.789 can be represented as
Fraction        Exponent    Scientific Notation
+0.6132789      +04         +0.6132789E+04
The value of the exponent indicates the actual position of the decimal point: 4 positions to the right.
- The range is effectively determined by the number of digits in the exponent.
- The precision is determined by the number of digits in the fraction.
- More bits for the significand give more accuracy.
- More bits for the exponent increase the range.
For Binary
Overflow / Underflow
Overflow and underflow regions:
Because the representation in a computer is finite, numbers in the overflow regions (regions 1 and 7) and in the underflow regions (regions 3 and 5) cannot be expressed.
Underflow:
Underflow errors are less serious than overflow errors, because an underflowed result can be approximated by zero.
Rounding
It is quite possible for the result of a calculation to be a number that cannot be expressed exactly, even though it lies in region 2 or 6.
For example, +0.100 x 10^3 divided by 3 cannot be expressed exactly.
The obvious thing to do is to use the nearest number that can be expressed. This process is called rounding.
Relative Error
The space between adjacent expressible numbers in regions 2 and 6 is not constant. The separation between +0.998 x 10^99 and +0.999 x 10^99 is very different from that between +0.998 x 10^0 and +0.999 x 10^0.
However, when separation between a number and its successor is expressed as a percentage of that number, there is no systematic variation throughout region 2 or 6. The relative error introduced by rounding is approximately the same for small numbers as large numbers.
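The two separations can be compared exactly with rational arithmetic. The sketch below uses a three-digit decimal fraction, as in the example; the helper name `gap` is made up.

```python
from fractions import Fraction

def gap(mantissa_thousandths, exponent):
    """Absolute and relative distance from 0.ddd x 10^exponent to its successor."""
    scale = Fraction(10) ** exponent
    value = Fraction(mantissa_thousandths, 1000) * scale
    successor = Fraction(mantissa_thousandths + 1, 1000) * scale
    absolute = successor - value
    return absolute, absolute / value

abs_large, rel_large = gap(998, 99)   # 0.998 x 10^99 -> 0.999 x 10^99
abs_small, rel_small = gap(998, 0)    # 0.998 x 10^0  -> 0.999 x 10^0
```

The absolute gaps differ by a factor of 10^99, while both relative gaps are exactly 1/998, about 0.1%.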
Normalization
A floating-point number is said to be normalized if the most significant digit of the mantissa is nonzero. For example, the decimal number 350 is normalized but 00035 is not. Regardless of where the radix point is assumed to be in the mantissa, the number is normalized only if its leftmost digit is nonzero.
Example:
(2.0) x 10^-9       Normalized
(0.2) x 10^-8       Not normalized
(20.0) x 10^-10     Not normalized
Computers support floating-point arithmetic. In binary, the fractional point is called the binary point.
Format:
Normalized numbers are generally preferable to unnormalized numbers, because there is only one normalized form, whereas there are many unnormalized forms.
Example: +1001.11 (binary) (= 9.75)
Make a fractional number, counting the number of shifts: +.100111 ==> 4 shifts
Exponent: sign 0, value 100
Mantissa: sign 0, value 1001110
Or for a 16-bit number with a sign, 5-bit exponent, 10-bit mantissa: 0 00100 1001110000
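The shift-counting step can be reproduced with `math.frexp`, which returns a fraction in [0.5, 1) and a power-of-two exponent, the same "fraction times 2^exponent" form used here. The 10-bit mantissa width matches the 16-bit layout in the example; the function name is made up.

```python
import math

def normalize(value, mantissa_bits=10):
    """Return (sign, exponent, mantissa bit string) with value = +/- 0.mantissa x 2^exponent."""
    sign = 0 if value >= 0 else 1
    fraction, exponent = math.frexp(abs(value))   # abs(value) == fraction * 2**exponent
    bits = ""
    for _ in range(mantissa_bits):
        fraction *= 2
        bit = int(fraction)       # next binary digit after the binary point
        bits += str(bit)
        fraction -= bit
    return sign, exponent, bits

sign, exponent, mantissa = normalize(9.75)
print(sign, format(exponent, "05b"), mantissa)
```

The printed fields are the sign, the 5-bit exponent and the 10-bit mantissa, in the order of the 16-bit layout.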
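Single precision (32 bits): s is the sign of the number, e represents the biased exponent (8 bits) and m represents the mantissa or significand (23 bits). 32-bit values range in magnitude from 10^-38 to 10^38.
Bit 31: s | Bits 30 to 23: exponent | Bits 22 to 0: significand (mantissa/fraction)
Double precision (64 bits): values range in magnitude from 10^-308 to 10^308.
First word: Bit 31: s | Bits 30 to 20: exponent | Bits 19 to 0: significand (mantissa/fraction); the second word holds the remaining 32 bits of the significand.
The growth of significand and exponent is a compromise between accuracy and range.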
Biased Exponent
Exponents can be both positive and negative, giving rise to the need for a sign bit in the exponent.
E.g. exponents ranging from -50 to 49 need 2 digits for the value and one bit for the sign.
A biased exponent eliminates the need for a sign by adding a positive quantity to the exponent so that it is always non-negative.
Adding 50 to our example exponent makes the range 0 to 99, a value requiring 2 digits with no sign bit needed.
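The bias of 50 in this example maps the signed exponent range -50 to 49 onto the unsigned range 0 to 99. A short sketch (function names are made up):

```python
BIAS = 50  # bias from the two-digit decimal example

def to_biased(exponent):
    """Stored (biased) form of a true exponent in the range -50..49."""
    if not -50 <= exponent <= 49:
        raise ValueError("exponent out of range")
    return exponent + BIAS

def from_biased(stored):
    """Recover the true exponent from its stored form."""
    return stored - BIAS
```

A side benefit of biasing is that stored exponents compare in the same order as the true exponents, so larger magnitudes have larger stored values.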
Hidden bit:
The leading 1 bit does not have to be stored (it is assumed to be present).
Zero is represented as 00...00 (two).
There is a symbol (NaN = Not a Number) for invalid operations (e.g. 0/0 or subtracting infinity from infinity).
This allows programmers to postpone some tests and decisions to a later time in the program.
All other numbers are represented using the following formula: (-1)^s x (1 + Fraction) x 2^E, where E is the unbiased exponent (stored exponent minus bias).
Example: -0.75 (ten)
Unbiased representation: -0.75 (ten) = -0.11 (two) = -1.1 (two) x 2^-1
Biased single precision representation: the exponent -1 will be represented as (-1 + 127) = 126 (ten) = 0111 1110 (two)
(-1)^1 x (1 + .1000 0000 0000 0000 0000 000 (two)) x 2^(126-127)
Bit 31 (sign): 1 | Bits 30 to 23 (exponent): 0111 1110 | Bits 22 to 0 (fraction): 1000 0000 0000 0000 0000 000
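The fields for -0.75 can be checked against the machine's own single-precision encoding using Python's standard `struct` module. The helper name `float32_fields` is made up.

```python
import struct

def float32_fields(x):
    """Split the IEEE 754 single-precision encoding of x into (sign, exponent, fraction)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))   # 32-bit pattern as an integer
    sign = word >> 31
    exponent = (word >> 23) & 0xFF       # 8-bit biased exponent
    fraction = word & 0x7FFFFF           # 23-bit fraction (hidden 1 not stored)
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(-0.75)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
```

The printed fields correspond to the sign bit, the biased exponent 126 and the fraction .1 followed by zeros.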
Example:
Converting the following binary representation into decimal floating point:
Bit 31 (sign): 1 | Bits 30 to 23 (exponent): 1000 0001 | Bits 22 to 0 (fraction): 0100 0000 0000 0000 0000 000
(-1)^1 x (1 + 0.25) x 2^(129-127)
= -1.25 x 4 = -5.0
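The conversion can be sketched as a direct application of (-1)^s x (1 + Fraction) x 2^(exponent - 127). The bit string used here (sign 1, exponent 1000 0001, fraction .01 followed by zeros) is an assumption: it is the pattern that yields -1.25 x 4. The function handles normalized numbers only.

```python
import struct

def decode_float32(bits):
    """Decode a normalized single-precision bit string (spaces allowed) to a Python float."""
    bits = bits.replace(" ", "")
    sign = int(bits[0])
    exponent = int(bits[1:9], 2)              # biased exponent
    fraction = int(bits[9:], 2) / 2**23       # value of the 23 fraction bits
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent - 127)

value = decode_float32("1 10000001 01000000000000000000000")

# Cross-check against the hardware encoding of the same 32-bit word (0xC0A00000).
(check,) = struct.unpack(">f", bytes.fromhex("c0a00000"))
```

Here the exponent field is 129, so the true exponent is 2, and the fraction field contributes 0.25 to the significand.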
Double precision:
- Exponent bias for normalized numbers is 1023
- The denormalized biased exponent of 0 corresponds to an unbiased exponent of -1022
- Infinity and NaNs have a biased exponent of 2047
Assignment
- Design an ARM6 based CPU
- Design the organization of 68020
- Write a short note on Pipelining