Lecture 8: Intermediate Code: CS 540 Spring 2009
Compiler Architecture
[Figure: compiler pipeline. Source language -> tokens -> syntactic structure -> intermediate code -> code optimizer -> intermediate code -> code generator -> target language, with the symbol table alongside the phases.]
CS 540 GMU Spring 2009 2
Intermediate Code
Similar terms: intermediate representation, intermediate language
Ties the front and back ends together
Language and machine neutral
Many forms
Level depends on how it is being processed
More than one intermediate language may be used by a compiler
Graphical IRs
Abstract Syntax Trees (AST): retain the essential structure of the parse tree, eliminating unneeded nodes
Directed Acyclic Graphs (DAG): a compacted AST that avoids duplication, with a smaller footprint as well
Control flow graphs (CFG): explicitly model control flow
[Figure: AST and DAG built from the nodes :=, +, *, unary -, b and c]
Linearized IC
Stack based (one address): compact
  push 2
  push y
  multiply
  push x
  subtract
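To make the one-address form concrete, here is a minimal Python sketch of an interpreter for the five instructions above. The function name `run_stack_code` and the operand order for `subtract` (second-from-top minus top) are assumptions made for illustration, not something the slide specifies.

```python
def run_stack_code(program, variables):
    """Interpret one-address stack instructions and return the final top of stack."""
    stack = []
    for instruction in program:
        parts = instruction.split()
        op = parts[0]
        if op == "push":
            operand = parts[1]
            # A numeric literal is pushed as-is; anything else is a variable name.
            if operand.lstrip("-").isdigit():
                stack.append(int(operand))
            else:
                stack.append(variables[operand])
        elif op == "multiply":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif op == "subtract":
            # Assumed convention: second-from-top minus top.
            right = stack.pop()
            left = stack.pop()
            stack.append(left - right)
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack.pop()


program = ["push 2", "push y", "multiply", "push x", "subtract"]
result = run_stack_code(program, {"x": 1, "y": 3})
print(result)
```

Under that convention the sequence computes 2 * y - x; no operand names a destination, which is why the form is so compact.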
SPIM
Three address code
We are going to use a subset as a mid-level intermediate code
Loading/Storing
  lw register,addr - moves value into register
  li register,num - moves constant into register
  la register,addr - moves address of variable into register
  sw register,addr - stores value from register
Address =
  contents of register
  immediate
  immediate + contents of register
  address of symbol
  address of symbol + or - immediate
Examples
  li $t2,5        load the value 5 into register t2
  lw $t3,x        load value stored at location labeled x into register t3
  la $t3,x        load address of location labeled x into register t3
  lw $t0,($t2)    load value stored at address stored in register t2 into register t0
  lw $t1,8($t2)   load value stored at address (contents of register t2) + 8 into register t1
Lots of registers: we will primarily use 8 ($t0 - $t7) for intermediate code generation
Binary arithmetic operators: work done in registers (reg1 = reg2 op reg3); reg3 can be a constant
  add reg1,reg2,reg3
  sub reg1,reg2,reg3
  mul reg1,reg2,reg3
  div reg1,reg2,reg3
a := b *-c + b*-c
Code generated from the AST, one instruction per tree node visited. The comment on each line names the node and the register that holds its value:
  lw $t0,b          # b        -> t0
  lw $t1,c          # c        -> t1
  neg $t1,$t1       # -c       -> t1
  mul $t1,$t1,$t0   # b * -c   -> t1
  lw $t0,b          # b        -> t0
  lw $t2,c          # c        -> t2
  neg $t2,$t2       # -c       -> t2
  mul $t0,$t0,$t2   # b * -c   -> t0
  add $t1,$t0,$t1   # +        -> t1
  sw $t1,a          # assign a
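a := b * -c + b * -c
Code generated from the DAG, where the shared node b * -c is computed once:
  lw $t0,b
  lw $t1,c
  neg $t1,$t1       # -c       -> t1
  mul $t1,$t1,$t0   # b * -c   -> t1
  add $t0,$t1,$t1   # +        -> t0 (t1 supplies both operands)
  sw $t0,a          # assign a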
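One common way to get a DAG instead of a tree is to look up each (operator, operands) combination in a table before creating a node. The sketch below does that for a := b * -c + b * -c. `DagBuilder`, `build` and the tuple encoding of expressions are hypothetical names chosen for this example.

```python
class DagBuilder:
    """Creates each distinct (op, operands) node once and reuses it afterwards."""

    def __init__(self):
        self.nodes = []   # node id -> (op, operands...)
        self.index = {}   # (op, operands...) -> node id

    def node(self, op, *operands):
        key = (op,) + operands
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]


def build(builder, expr):
    """expr is a leaf name (str) or a tuple (op, child, ...). Returns the node id."""
    if isinstance(expr, str):
        return builder.node("id", expr)
    op, *children = expr
    return builder.node(op, *(build(builder, child) for child in children))


def count_tree_nodes(expr):
    """Number of nodes the plain AST would contain."""
    if isinstance(expr, str):
        return 1
    return 1 + sum(count_tree_nodes(child) for child in expr[1:])


product = ("*", "b", ("neg", "c"))
statement = (":=", "a", ("+", product, product))

dag = DagBuilder()
root = build(dag, statement)
ast_size = count_tree_nodes(statement)
dag_size = len(dag.nodes)
print(ast_size, dag_size)
```

For this statement the AST has 11 nodes and the DAG has 7, because the second b * -c maps to the node already created for the first.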
Comparison operators
set condition: temp1 = temp2 xxx temp3, where xxx is a condition (gt, ge, lt, le, eq)
temp1 is 0 for false, nonzero for true.
More Spim
Jumps
  b label - unconditional branch to label
  bxxx temp, label - conditional branch to label, xxx = condition such as eqz, neq
Procedure statement
  jal label - jump and save return address
  jr register - jump to address stored in register
Control Flow
while x <= 100 do
  x := x + 1
end while

      lw $t0,x
      li $t1,100
L25:  sle $t2,$t0,$t1
      beqz $t2,L26      # branch if false
      addi $t0,$t0,1    # loop body
      sw $t0,x          # loop body
      b L25
L26:
Loops
print 2
print blank
for i = 3 to 100
  divides = 0
  for j = 2 to i/2
    if j divides i evenly then divides = 1
  end for
  if divides = 0 then
    print i
    print blank
end for
exit

[Figure: SPIM code for the outer loop on i (labels l1, l2) and the inner loop on j (label l4), with the increment steps marked]
Conditional Statements
The program above contains two conditionals:
  if j divides i evenly then divides = 1
  if divides = 0 then print i, print blank
Exiting
  li $v0,10
  syscall
Read(i)
  li $v0,5
  syscall
  sw $v0,i
        .data
blank:  .asciiz " "
        .text
        li $v0,1
        li $a0,2
        syscall             # print 2
        li $v0,4
        la $a0,blank
        syscall             # print blank
        li $v0,1
        lw $a0,i
        syscall             # print i
        li $v0,10
        syscall             # exit
Entire program
        .data
blank:  .asciiz " "
        .text
main:   li $v0,1
        li $a0,2
        syscall             # print 2
        li $v0,4
        la $a0,blank
        syscall             # print blank
        li $t0,3            # i in t0
        li $t1,100          # max in t1
l1:     sle $t7,$t0,$t1
        beqz $t7,l2
        li $t4,0            # divides in t4
        li $t2,2            # j in t2
        div $t3,$t0,2       # max in t3
l3:     sle $t7,$t2,$t3     # inner loop
        beqz $t7,l4
        rem $t7,$t0,$t2
        bnez $t7,l5
        li $t4,1
l5:     addi $t2,$t2,1
        b l3                # end of inner loop
l4:     bnez $t4,l6
        li $v0,1
        move $a0,$t0
        syscall             # print i
        li $v0,4
        la $a0,blank
        syscall
l6:     addi $t0,$t0,1
        b l1                # end of outer loop
l2:     li $v0,10
        syscall
PC SPIM
Notes
Spim requires a main: label as starting location
Data must be prefixed by .data
Executable code must be prefixed by .text
Data and code can be interspersed
You can't have variable names (i.e. labels) that are the same as opcodes; in particular, b and j are not good names (branch and jump)
Next week
Processing Declarations
Global variables vs. local variables
Binding name to storage location
Basic types: integer, boolean
Composite types: records, arrays
Tied to expression code generation
In SPIM
Declarations generate code in .data sections
  var_name1: .word 0
  var_name2: .word 29,10
  var_name3: .space 40
Can also allocate a large space
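As a rough illustration of how a compiler might turn declarations into a .data section, the sketch below walks a list of declarations and prints the directives shown above. The tuple format and the name `emit_data_section` are assumptions for this example only.

```python
def emit_data_section(declarations):
    """declarations: list of (name, kind, value) tuples.

    kind "word"  -> value is a list of initial integers
    kind "space" -> value is a byte count
    """
    lines = [".data"]
    for name, kind, value in declarations:
        if kind == "word":
            initial = ",".join(str(v) for v in value)
            lines.append(f"{name}: .word {initial}")
        elif kind == "space":
            lines.append(f"{name}: .space {value}")
        else:
            raise ValueError(f"unknown declaration kind: {kind}")
    return lines


data_lines = emit_data_section([
    ("var_name1", "word", [0]),
    ("var_name2", "word", [29, 10]),
    ("var_name3", "space", 40),
])
print("\n".join(data_lines))
```

In a real front end the list would come from the symbol table, which is where each name is bound to its storage location.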
Expressions
Grammar:
  S -> id := E
  E -> E + E
  E -> id
As we parse, generate IC for the given input. Use attributes to pass information about temporary variables up the tree.
Input: a := b + c + d + e
[Figure: parse tree for the input, built up one reduction at a time; each E node is annotated with the temporary (0 or 1) that holds its value]
Generate:
  lw $t0,b
  lw $t1,c
  add $t0,$t0,$t1
  lw $t1,d
  add $t0,$t0,$t1
  lw $t1,e
  add $t0,$t0,$t1
  sw $t0,a
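The attribute passed up the tree can be as simple as the number of the temporary holding each E. The following Python sketch mimics the three grammar actions by hand, without a real parser. `ExprCodeGen` and `compile_sum` are hypothetical names, and the register reuse assumes temporaries are released in stack order.

```python
class ExprCodeGen:
    def __init__(self):
        self.code = []
        self.next_temp = 0   # index of the next free $t register

    def gen_id(self, name):
        """E -> id: load the variable into a fresh temporary and return its number."""
        temp = self.next_temp
        self.next_temp += 1
        self.code.append(f"lw $t{temp},{name}")
        return temp

    def gen_add(self, left, right):
        """E -> E + E: add into the left temporary and release the right one."""
        self.code.append(f"add $t{left},$t{left},$t{right}")
        self.next_temp = right   # assumes right was allocated after left
        return left

    def gen_assign(self, target, temp):
        """S -> id := E: store the temporary that holds the expression."""
        self.code.append(f"sw $t{temp},{target}")
        self.next_temp = 0


def compile_sum(target, operands):
    """Translate target := op1 + op2 + ... with left-associative +."""
    gen = ExprCodeGen()
    result = gen.gen_id(operands[0])
    for name in operands[1:]:
        result = gen.gen_add(result, gen.gen_id(name))
    gen.gen_assign(target, result)
    return gen.code


code = compile_sum("a", ["b", "c", "d", "e"])
print("\n".join(code))
```

For a := b + c + d + e this produces the same eight instructions as the slide, using only $t0 and $t1.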
Records
Typical implementation: allocate a block large enough to hold all record fields
  struct s {
    type1 field-1;
    ...
    typen field-n;
  } data_object;
Boundary issues
A field name's address is an offset from the record address
Records in Spim
Allocate enough space to hold all of the elements. Multiple ways to do this.
Record holding 3 (uninitialized) four-byte integers named a,b,c:
  record: .space 12
OR (convert to scalar)
  record_a: .word 0
  record_b: .word 0
  record_c: .word 0
Records in Spim
Address calculations:
Version 1: base address + offset
Ex: to get contents of record.b:
  la $t0,record
  add $t0,$t0,4
  lw $t1,($t0)
Version 2: similar to scalars
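The offsets used in Version 1 can be computed once, when the record type is declared. Below is a minimal sketch that assumes fields are packed in declaration order with no alignment padding (the slides mention boundary issues but give no rule). `layout_record` and `load_field` are illustrative names.

```python
def layout_record(fields):
    """fields: list of (name, size_in_bytes). Returns ({name: offset}, total_size)."""
    offsets = {}
    offset = 0
    for name, size in fields:
        offsets[name] = offset
        offset += size   # assumption: no padding between fields
    return offsets, offset


def load_field(record_label, offsets, field):
    """Emit the 'base address + offset' sequence for reading one field."""
    return [
        f"la $t0,{record_label}",
        f"add $t0,$t0,{offsets[field]}",
        "lw $t1,($t0)",
    ]


offsets, total = layout_record([("a", 4), ("b", 4), ("c", 4)])
field_code = load_field("record", offsets, "b")
print(total, offsets)
print("\n".join(field_code))
```

The total size is what the `.space` directive would reserve, and the offset of b reproduces the `add $t0,$t0,4` in the example above.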
1-D arrays
a[l..h] with element size s
Number of elements: e = h - l + 1
Size of array: e * s
Address of element a[i], assuming a starts at address b and l <= i <= h:
  b + (i - l) * s
[Figure: storage layout a[l], a[l+1], a[l+2], ..., a[h]]
Example
a[3..100] with element size 4
Number of elements: 100 - 3 + 1 = 98
Size of array: 98 * 4 = 392
Address of element a[50], assuming a starts at address 100:
  100 + (50 - 3) * 4 = 288
[Figure: storage layout with a[3] at address 100, a[4] at address 104, then a[5], ..., a[100]]
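The 1-D formulas translate directly into code. This small Python sketch checks the a[3..100] example; the function names are chosen for illustration.

```python
def element_count(low, high):
    """Number of elements in a[low..high]."""
    return high - low + 1


def array_size(low, high, size):
    """Total bytes occupied by a[low..high] with the given element size."""
    return element_count(low, high) * size


def element_address(base, low, high, size, i):
    """Address of a[i] for an array a[low..high] starting at address base."""
    if not low <= i <= high:
        raise IndexError(f"index {i} outside {low}..{high}")
    return base + (i - low) * size


count = element_count(3, 100)
total_bytes = array_size(3, 100, 4)
address = element_address(100, 3, 100, 4, 50)
print(count, total_bytes, address)
```

The bounds check is an addition for safety; the formula itself only assumes l <= i <= h.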
Address calculation:
#calculate the address of a[y], word size elements
  la $t0,a
  lw $t2,y
  mul $t2,$t2,4     # multiply by word size
  add $t0,$t0,$t2   # t0 holds address of a[y]
  lw $t2,($t0)      # t2 holds a[y]
Arrays
Typical implementation: large block of storage of appropriate size
Row major vs. column major
Consider a[4..6,3..4]

  Address   Row major   Column major
  b + 0s    a[4,3]      a[4,3]
  b + 1s    a[4,4]      a[5,3]
  b + 2s    a[5,3]      a[6,3]
  b + 3s    a[5,4]      a[4,4]
  b + 4s    a[6,3]      a[5,4]
  b + 5s    a[6,4]      a[6,4]
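The two orderings differ only in which index varies fastest. The sketch below enumerates both for a[4..6,3..4]. `storage_order` is a hypothetical helper written for this example.

```python
from itertools import product


def storage_order(bounds, row_major=True):
    """bounds: [(low, high), ...] per dimension. Returns index tuples in storage order."""
    ranges = [range(low, high + 1) for low, high in bounds]
    if row_major:
        # Rightmost index varies fastest.
        return list(product(*ranges))
    # Column major: leftmost index varies fastest.
    return [tuple(reversed(idx)) for idx in product(*reversed(ranges))]


row_order = storage_order([(4, 6), (3, 4)])
col_order = storage_order([(4, 6), (3, 4)], row_major=False)
for offset, (r, c) in enumerate(zip(row_order, col_order)):
    print(f"b + {offset}s", r, c)
```

Position k in either list is the element stored at address b + k * s.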
Address of element a[i,j], assuming a starts at address b and l1 <= i <= h1 and l2 <= j <= h2:
  b + (i - l1) * d1 + (j - l2) * s
where d1 = (h2 - l2 + 1) * s is the size of one row (row-major order).
Example
A[3..100,4..50] with element size 4
98 * 47 = 4606 elements
4606 * 4 = 18424 bytes long
d2 = 4 and d1 = 47 * 4 = 188
If a starts at 100, a[5,5] is:
  100 + (5 - 3) * 188 + (5 - 4) * 4 = 480
Address calculation:
#calculate the address of a[x,y], word size elements
  la $t0,a
  lw $t1,x
  mul $t1,$t1,20    # stride = 5 * 4 = 20
  add $t0,$t0,$t1   # start of a[x,]
  lw $t1,y
  mul $t1,$t1,4     # multiply by word size
  add $t0,$t0,$t1   # t0 holds address of a[x,y]
  lw $t1,($t0)      # t1 holds a[x,y]
3-D Arrays
a[4..7,3..4,8..9]
Size of third (rightmost) dimension = s
Size of second dimension = s * 2

  Address    Element     a[i,j,x] slice   a[i,x] slice
  b + 0s     a[4,3,8]    a[4,3,x]         a[4,x]
  b + 1s     a[4,3,9]
  b + 2s     a[4,4,8]    a[4,4,x]
  b + 3s     a[4,4,9]
  b + 4s     a[5,3,8]    a[5,3,x]         a[5,x]
  b + 5s     a[5,3,9]
  b + 6s     a[5,4,8]    a[5,4,x]
  b + 7s     a[5,4,9]
  b + 8s     a[6,3,8]    a[6,3,x]         a[6,x]
  b + 9s     a[6,3,9]
  b + 10s    a[6,4,8]    a[6,4,x]
  b + 11s    a[6,4,9]
  b + 12s    a[7,3,8]    a[7,3,x]         a[7,x]
  b + 13s    a[7,3,9]
  b + 14s    a[7,4,8]    a[7,4,x]
  b + 15s    a[7,4,9]
Address of element a[i,j,k], assuming a starts at address b and l1 <= i <= h1, l2 <= j <= h2 and l3 <= k <= h3:
  b + (i - l1) * d1 + (j - l2) * d2 + (k - l3) * s
Example
A[3..100,4..50,1..4] with element size 4
98 * 47 * 4 = 18424 elements
18424 * 4 = 73696 bytes long
d3 = 4, d2 = 4 * 4 = 16 and d1 = 16 * 47 = 752
If a starts at 100, a[5,5,2] is:
  100 + (5 - 3) * 752 + (5 - 4) * 16 + (2 - 1) * 4 = 1624
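The same pattern extends to any number of dimensions: each stride is the element size times the extents of all dimensions to its right. The Python sketch below assumes row-major layout. `strides` and `element_address` are names introduced here, not taken from the slides.

```python
def strides(bounds, size):
    """Row-major strides d1..dn in bytes; the last stride is the element size."""
    result = [0] * len(bounds)
    stride = size
    for dim in range(len(bounds) - 1, -1, -1):
        result[dim] = stride
        low, high = bounds[dim]
        stride *= high - low + 1
    return result


def element_address(base, bounds, size, index):
    """Address of a[index] for an array with the given per-dimension bounds."""
    ds = strides(bounds, size)
    return base + sum((i - low) * d for i, (low, _), d in zip(index, bounds, ds))


bounds_3d = [(3, 100), (4, 50), (1, 4)]
d1, d2, d3 = strides(bounds_3d, 4)
address_3d = element_address(100, bounds_3d, 4, (5, 5, 2))
print(d1, d2, d3, address_3d)
```

A compiler would normally compute the strides at declaration time and emit only the multiply and add instructions at each array reference.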
The Concept: An object is an abstract data type that encapsulates data, operations and internal state behind a simple, consistent interface.
[Figure: objects x, y and z, each drawn as a unit containing Data and Code]
Elaborating the concepts:
Each object needs local storage for its attributes
Attributes are static (lifetime of object)
Access is through methods
Some methods are public, others are private
Object's internal state leads to complex behavior
Objects
Each object needs local storage for its attributes
Access is through methods
Heap allocate object records or instances
Need consistent, fast access: use known, constant offsets in objects
Provision for initialization
Class variables
Inheritance
[Figure: two object records, each with fields b:, c:, z and its own f1 and f2 entries pointing to separate copies of f1 code and f2 code]
Better Representation
For object x of type A:
[Figure: two object records with fields b:, c:, z and entries f1, f2, all pointing to a single shared copy of f1 code and f2 code]
More typically:
For object x of type A:
[Figure: a class record for Class A holding N: 2, d: and the method entries f1, f2 that point to f1 code and f2 code; two object records holding only b:, c:, z]
Objects share methods (and static attributes) via a shared class object (can keep a counter of objects, N)
Multiple inheritance??
Label generation: all labels must be unique
Nested control structures: need a stack
Conditional Examples
if (y > 0) then begin
  body
end

      lw $t0,y
      li $t1,0
      sgt $t2,$t0,$t1   # = 1 if true
      beqz $t2,L2
      body
L2:
Conditional Examples
if (y > 0) then begin
  body-1
end else
  body-2
end

      lw $t0,y
      li $t1,0
      sgt $t2,$t0,$t1   # = 1 if true
      beqz $t2,L2
      body-1
      b L3
L2:   body-2
L3:
Looping constructs
while x < 100 do
  body
end

L25:  lw $t0,x
      li $t1,100
      slt $t2,$t0,$t1   # = 1 if x < 100
      beqz $t2,L26      # exit the loop when the condition is false
      body
      b L25
L26:
Generating Conditionals
if_stmt -> IF expr THEN
    { code to eval expr ($2) already done
      get two new label names
      output conditional ($2=false) branch to first label }
  stmts ELSE
    { output unconditional branch to second label
      output first label }
  stmts ENDIF
    { output second label }
Generating Loops
for_stmt -> FOR id = start TO stop
    { code to eval start ($4) and stop ($6) done
      get two new label names
      output code to initialize id = start
      output label1
      output code to compare id to stop
      output conditional branch to label2 }
  stmts END
    { increment id (and save)
      unconditional branch to label1
      output label2 }
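The loop actions can be sketched as a function that returns the instruction list. This Python example assumes start and stop are integer constants and that the loop variable lives in memory. `LabelGenerator` and `gen_for` are illustrative names, and the register choices are arbitrary.

```python
class LabelGenerator:
    """Hands out unique labels L1, L2, ..."""

    def __init__(self):
        self.count = 0

    def new_label(self):
        self.count += 1
        return f"L{self.count}"


def gen_for(labels, var, start, stop, body):
    """Emit SPIM-style code for: for var = start to stop ... end."""
    top, done = labels.new_label(), labels.new_label()
    code = [
        f"li $t0,{start}",
        f"sw $t0,{var}",        # initialize id = start
        f"{top}:",              # label1
        f"lw $t0,{var}",
        f"li $t1,{stop}",
        "sle $t2,$t0,$t1",      # compare id to stop
        f"beqz $t2,{done}",     # conditional branch to label2
    ]
    code += body
    code += [
        f"lw $t0,{var}",
        "addi $t0,$t0,1",       # increment id (and save)
        f"sw $t0,{var}",
        f"b {top}",             # unconditional branch to label1
        f"{done}:",             # label2
    ]
    return code


loop_code = gen_for(LabelGenerator(), "i", 3, 100, ["# loop body goes here"])
print("\n".join(loop_code))
```

In a parser-driven generator the first half would run after `stop` is parsed and the second half at END, with the two labels kept on a stack in between.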
Nested conditionals
Need a stack to keep track of correct labels
Can implement own stack
  push two new labels at start of statement
  pop two labels when end statement
  while generating code, use the two labels on the top of the stack
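The push/pop discipline can be illustrated with a small generator for nested if-then-else statements. `IfCodeGen` and its method names are hypothetical; the condition is assumed to be already evaluated into a register, as in the grammar actions.

```python
class IfCodeGen:
    def __init__(self):
        self.code = []
        self.label_stack = []
        self.label_count = 0

    def new_label(self):
        self.label_count += 1
        return f"L{self.label_count}"

    def emit(self, instruction):
        self.code.append(instruction)

    def begin_if(self, cond_register):
        """After IF expr THEN: push two new labels, branch to the first if false."""
        else_label, end_label = self.new_label(), self.new_label()
        self.label_stack.append((else_label, end_label))
        self.emit(f"beqz {cond_register},{else_label}")

    def begin_else(self):
        """At ELSE: use the labels on top of the stack without popping them."""
        else_label, end_label = self.label_stack[-1]
        self.emit(f"b {end_label}")
        self.emit(f"{else_label}:")

    def end_if(self):
        """At ENDIF: pop the two labels and output the second one."""
        _, end_label = self.label_stack.pop()
        self.emit(f"{end_label}:")


gen = IfCodeGen()
gen.begin_if("$t2")          # outer if
gen.emit("# outer then")
gen.begin_if("$t3")          # nested if inside the then-part
gen.emit("# inner then")
gen.begin_else()
gen.emit("# inner else")
gen.end_if()
gen.begin_else()
gen.emit("# outer else")
gen.end_if()
nested_code = gen.code
print("\n".join(nested_code))
```

Because the inner statement pops its labels before the outer ELSE is reached, the outer statement always finds its own pair on top of the stack.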