Lecture 03
1. Instruction count
2. Speed of execution
Example contd..
1. Instruction count IC.
IC for program 1= 2+2+2=6
IC for program 2= 1+5+1=7
2. For execution time we can use the following SRC
specifications.

Instruction Type    CPI
Control             2
ALSU                3
Data Transfer       4

ET = IC x CPI x T
ET1 = (2x2)+(2x3)+(2x4) = 18
ET2 = (5x2)+(1x3)+(1x4) = 17
Note: Since both programs are executing on the same machine, the T factor can
be ignored while calculating ET.
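The calculation above can be sketched in Python. This is only an illustration: the names CPI and total_cycles are made up for this example, and the per-type counts are read off the ET1 and ET2 expressions above.

```python
# CPI values from the SRC specification table.
CPI = {"control": 2, "alsu": 3, "data_transfer": 4}

def total_cycles(counts):
    """Return the sum of (count x CPI) over all types; multiply by T to get ET."""
    return sum(n * CPI[kind] for kind, n in counts.items())

# Per-type instruction counts, as used in the ET1 and ET2 expressions.
program1 = {"control": 2, "alsu": 2, "data_transfer": 2}
program2 = {"control": 5, "alsu": 1, "data_transfer": 1}

for name, counts in (("program 1", program1), ("program 2", program2)):
    print(name, "IC =", sum(counts.values()), "cycles =", total_cycles(counts))
```

Program 2 has the larger instruction count (7 against 6) but needs fewer clock cycles (17 against 18), because most of its instructions are of the cheapest type.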
Problem: Consider the following two SRC code segments for
implementing the operation a=b+5c. Find which one is more
efficient in terms of instruction count and execution time.

Program 1: Multiplication by using
repeated addition in a for loop

        .org 0
a:      .dw 1
b:      .dw 1
c:      .dw 1
        .org 80
        la r5, 5        ; load value of loop
        lar r6, mpy     ; load address of mpy
        lar r7, next    ; load address of next
        ld r2, b        ; load contents of b
        ld r3, c        ; load contents of c
        la r4, 0        ; load 0 in r4
mpy:
        brzr r7, r5     ; jump to next after 5 iterations
        add r4, r4, r3  ; r4 contains r4+c
        addi r5, r5, -1 ; decrement index
        br r6           ; loop again
next:
        add r4, r4, r2  ; r4 contains sum of b and 5c
        st r4, a        ; store at address a
        stop
Program 2: Multiplication by using
repeated addition in a subroutine (mpy)

        .org 0
a:      .dw 1
b:      .dw 1
c:      .dw 1
        .org 80
        lar r1, mpy     ; load address of mpy in r1
        ld r2, b        ; load contents of b in r2
        la r3, 5        ; load index in r3
        ld r4, c        ; load contents of c in r4
        brl r5, r1      ; r5 contains PC
        add r2, r2, r7  ; r2 contains sum b+5c
        st r2, a
        stop
mpy:
        la r7, 0        ; r7 contains zero
        lar r8, again   ; r8 contains address of again
again:
        brzr r5, r3     ; exit loop when index is 0
        add r7, r7, r4  ; r7 contains r7+c
        addi r3, r3, -1 ; decrement index
        br r8
Solution
The instructions in both programs can be divided into 3
types and the respective count of each type is

Instruction type              Program 1    Program 2
Data transfer instructions    7            7
Control instructions          3            4
ALSU instructions             3            3

IC for program 1 = 7 + 3 + 3 = 13
IC for program 2 = 7 + 4 + 3 = 14
Solution contd..
For execution time, consider the following SRC
specifications.

Instruction Type    CPI
Control             2
ALSU                3
Data Transfer       4

ET = IC x CPI x T
ET1 = (7x4)+(3x2)+(3x3) = 43T
ET2 = (7x4)+(4x2)+(3x3) = 45T
Conclusion:
Program 1 runs faster than program 2, as is evident from the
execution times of both (43T versus 45T).
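The counting in this solution can be reproduced with a short Python sketch. The grouping of SRC mnemonics into the three types (TYPE_OF) is an assumption chosen because it matches the counts given in the solution, with stop counted as a control instruction. The mnemonic lists are transcribed from the two listings, and each instruction is counted once (a static count), as the solution does.

```python
from collections import Counter

# Assumed grouping of SRC mnemonics into the three instruction types.
TYPE_OF = {
    "la": "data_transfer", "lar": "data_transfer",
    "ld": "data_transfer", "st": "data_transfer",
    "br": "control", "brzr": "control", "brl": "control", "stop": "control",
    "add": "alsu", "addi": "alsu",
}
CPI = {"control": 2, "alsu": 3, "data_transfer": 4}

# Mnemonics of the two listings, in program order.
PROGRAM1 = "la lar lar ld ld la brzr add addi br add st stop".split()
PROGRAM2 = "lar ld la ld brl add st stop la lar brzr add addi br".split()

def analyse(mnemonics):
    """Return (per-type counts, total IC, total cycles) for one listing."""
    counts = Counter(TYPE_OF[m] for m in mnemonics)
    cycles = sum(n * CPI[kind] for kind, n in counts.items())
    return counts, sum(counts.values()), cycles

for name, prog in (("program 1", PROGRAM1), ("program 2", PROGRAM2)):
    counts, ic, cycles = analyse(prog)
    print(name, dict(counts), "IC =", ic, "ET =", f"{cycles}T")
```

Because the loop body executes five times, a dynamic count of executed instructions would give larger totals. The sketch follows the static count used in the solution.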
MIPS
• Millions of Instructions Per Second
  MIPS = IC / (ET x 10^6)
• Capability of different instructions varies from
machine to machine, e.g. RISC machines have
simpler instructions, so the same job will require
more instructions
• Was popular when the VAX 11/780 was treated
as a reference – late 70s and early 80s
MIPS as a performance metric
• For a given instruction count, MIPS is inversely
proportional to execution time,
ET = IC / (MIPS x 10^6)
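The two forms of the MIPS formula can be checked against each other with a small Python sketch. The function names and the sample figures (50 million instructions in 2 seconds) are hypothetical and chosen only for illustration.

```python
def mips(ic, et_seconds):
    """MIPS = IC / (ET x 10^6)."""
    return ic / (et_seconds * 1e6)

def execution_time(ic, mips_rating):
    """ET = IC / (MIPS x 10^6), the same formula solved for ET."""
    return ic / (mips_rating * 1e6)

# Hypothetical run: 50 million instructions completed in 2 seconds.
rating = mips(50e6, 2.0)
print(rating)                         # 25.0
print(execution_time(50e6, rating))   # 2.0
```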
Example
Consider a machine having a 100 MHz clock and three
instruction types with the following parameters.

Instruction Type    CPI
Control             2
ALSU                3
Data Transfer       4

Now suppose that two different compilers generate
code for the same program.
The instruction count for each is given as follows

IC in millions    Code from compiler 1    Code from compiler 2
Control           5                       10
ALSU              1                       1
Data Transfer     1                       1

Compare the two codes according to MIPS and
according to execution time.
Solution:
First we find the average CPI for both code sequences
Since CPI = total clock cycles over all instruction types / IC
CPI1= (5x2 + 1x3 + 1x4)/ 7 = 2.43
CPI2= (10x2 +1x3 + 1x4)/12 = 2.25
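The remaining steps of the comparison follow from ET = IC x CPI x T with T = 1 / (clock rate), and MIPS = IC / (ET x 10^6). The Python sketch below carries them out under those formulas. The function name summarize and the dictionary layout are made up for this example.

```python
CLOCK_HZ = 100e6  # 100 MHz clock from the example
CPI = {"control": 2, "alsu": 3, "data_transfer": 4}

# Instruction counts in millions, from the table in the example.
compiler1 = {"control": 5, "alsu": 1, "data_transfer": 1}
compiler2 = {"control": 10, "alsu": 1, "data_transfer": 1}

def summarize(counts_millions):
    """Return (average CPI, MIPS, ET in seconds) for one code sequence."""
    ic = sum(counts_millions.values()) * 1e6
    cycles = sum(n * CPI[k] for k, n in counts_millions.items()) * 1e6
    cpi = cycles / ic
    et = cycles / CLOCK_HZ      # ET = IC x CPI x T, with T = 1 / CLOCK_HZ
    mips = ic / (et * 1e6)      # MIPS = IC / (ET x 10^6)
    return cpi, mips, et

for name, counts in (("compiler 1", compiler1), ("compiler 2", compiler2)):
    cpi, mips, et = summarize(counts)
    print(f"{name}: CPI = {cpi:.2f}, MIPS = {mips:.1f}, ET = {et:.2f} s")
```

Under these formulas, compiler 1 comes out at about 41.2 MIPS and 0.17 s, and compiler 2 at about 44.4 MIPS and 0.27 s. The code with the higher MIPS rating is therefore the slower one, because it executes more instructions.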
One memory "word" (each memory location holds one byte, bits 7 to 0)

Byte address    Memory location    Bits of the 32-bit word
a               M[8]               31-24 (MS Byte)
a+1             M[9]               23-16
a+2             M[10]              15-8
a+3             M[11]              7-0 (LS Byte)
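The byte ordering in the figure can be sketched in Python: the byte at the lowest address becomes the most significant byte of the word. The byte values stored in memory below are hypothetical. Only the layout comes from the figure.

```python
# Hypothetical byte values stored at M[8]..M[11].
memory = {8: 0x12, 9: 0x34, 10: 0x56, 11: 0x78}

def read_word(mem, a):
    """Assemble the 32-bit word at address a, with M[a] as the MS byte."""
    return (mem[a] << 24) | (mem[a + 1] << 16) | (mem[a + 2] << 8) | mem[a + 3]

word = read_word(memory, 8)
print(hex(word))  # 0x12345678
```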
SRC: instruction formats

Type    Fields (bit positions)
A       Op-code [31-27]   unused [26-0]
B       Op-code [31-27]   ra [26-22]   c1 [21-0]
C       Op-code [31-27]   ra [26-22]   rb [21-17]   c2 [16-0]
D       Op-code [31-27]   ra [26-22]   rb [21-17]   rc [16-12]   c3 [11-0]
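The field boundaries above translate directly into shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word, leaving the caller to pick the fields relevant to its format type. The helper names and the op-code value used in the example are hypothetical, since op-code assignments are not listed in this section.

```python
def decode(instr):
    """Split a 32-bit instruction word into all possible format fields."""
    return {
        "op": (instr >> 27) & 0x1F,   # bits 31..27 (all types)
        "ra": (instr >> 22) & 0x1F,   # bits 26..22 (types B, C, D)
        "rb": (instr >> 17) & 0x1F,   # bits 21..17 (types C, D)
        "rc": (instr >> 12) & 0x1F,   # bits 16..12 (type D)
        "c1": instr & 0x3FFFFF,       # bits 21..0  (type B)
        "c2": instr & 0x1FFFF,        # bits 16..0  (type C)
        "c3": instr & 0xFFF,          # bits 11..0  (type D)
    }

def encode_type_d(op, ra, rb, rc, c3=0):
    """Pack Type D fields into a 32-bit word."""
    return (op << 27) | (ra << 22) | (rb << 17) | (rc << 12) | (c3 & 0xFFF)

# Hypothetical op-code 12 with ra=4, rb=4, rc=3.
word = encode_type_d(12, 4, 4, 3)
fields = decode(word)
print(fields["op"], fields["ra"], fields["rb"], fields["rc"], fields["c3"])
```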
Instruction type              Program 1    Program 2
Data transfer instructions    7            6
Control instructions          2            3
ALSU instructions             3            3

IC for program 1 = 7 + 2 + 3 = 12
IC for program 2 = 6 + 3 + 3 = 12
Solution contd..
For execution time, consider the following SRC
specifications.

Instruction Type    CPI
Control             2
ALSU                3
Data Transfer       4

ET = IC x CPI x T
ET1 = (7x4)+(2x2)+(3x3) = 41
ET2 = (6x4)+(3x2)+(3x3) = 39
Conclusion:
Although the instruction count for both programs is the same,
program 2 runs faster than program 1 because it requires
fewer clock cycles (39 versus 41).