Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
0% found this document useful (0 votes)
5 views

15_AssemblyFunctions

Uploaded by

Vishal Sharma
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views

15_AssemblyFunctions

Uploaded by

Vishal Sharma
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 43

Princeton University

Computer Science 217: Introduction to Programming Systems

Assembly Language:
Function Calls

1
Goals of this Lecture

Help you learn:


• Function call problems
• x86-64 solutions
• Pertinent instructions and conventions

2
Function Call Problems
(1) Calling and returning
• How does caller function jump to callee function?
• How does callee function jump back to the right place in caller
function?

(2) Passing arguments


• How does caller function pass arguments to callee function?

(3) Storing local variables


• Where does callee function store its local variables?

(4) Returning a value


• How does callee function send return value back to caller function?
• How does caller function access the return value?

(5) Optimization
• How do caller and callee function minimize memory access? 3
Running Example

long absadd(long a, long b)


{
long absA, absB, sum;
absA = labs(a);
absB = labs(b);
sum = absA + absB;
return sum;
}

Calls standard C labs() function


• Returns absolute value of given long

4
Agenda
Calling and returning
Passing arguments
Storing local variables
Returning a value
Optimization

5
Problem 1: Calling and Returning
How does caller jump to callee?
• i.e., Jump to the address of the callee’s first instruction
How does the callee jump back to the right place in caller?
• i.e., Jump to the instruction immediately following the
most-recently-executed call instruction

… absadd(3L, -4L);
… 1 long absadd(long a, long b)
{
long absA, absB, sum;
absA = labs(a);
absB = labs(b);
2 sum = absA + absB;
return sum;
}

6
Attempted Solution: jmp Instruction
Attempted solution: caller and callee use jmp instruction
f: g:
… …
jmp g # Call g jmp fReturnPoint # Return
fReturnPoint:

7
Attempted Solution: jmp Instruction
Problem: callee may be called by multiple callers
f1: g:
… …
jmp g # Call g jmp ??? # Return
f1ReturnPoint:

f2:

jmp g # Call g
f2ReturnPoint:

8
Attempted Solution: Use Register
Attempted solution: Store return address in register

f1: g:
movq $f1ReturnPoint, %rax …
jmp g # Call g jmp *%rax # Return
f1ReturnPoint:

f2:
movq $f2ReturnPoint, %rax
Special form of
jmp g # Call g
jmp instruction
f2ReturnPoint:

9
Attempted Solution: Use Register
Problem: Cannot handle nested function calls
f:
Problem if f() calls g(),
movq $fReturnPoint, %rax
and g() calls h()
jmp g # Call g
fReturnPoint: Return address g() -> f()

is lost

g:
movq $gReturnPoint, %rax
h:
jmp h # Call h

gReturnPoint:
… jmp *%rax # Return
jmp *%rax # Return
10
x86-64 Solution: Use the Stack
Observations:
• May need to store many return addresses
• The number of nested function calls is not known in advance
• A return address must be saved for as long as the invocation of
this function is live, and discarded thereafter
• Stored return addresses are destroyed in
reverse order of creation
• f() calls g() ⇒ return addr for g is stored RIP for h
• g() calls h() ⇒ return addr for h is stored RIP for g

• h() returns to g() ⇒ return addr for h is destroyed RIP for f


• g() returns to f() ⇒ return addr for g is destroyed
• LIFO data structure (stack) is appropriate

x86-64 solution:
• Use the STACK section of memory, usually accsesed via RSP
• Via call and ret instructions
11
call and ret Instructions
ret instruction “knows” the return address
f: h:

1

call h 2 ret

call g
4
… 3
5
g:

6 call h

ret
12
Stack operations
RSP (stack pointer) register
0
points to top of stack

Instruction Equivalent to
pushq src subq $8, %rsp
movq src, (%rsp)

popq dest movq (%rsp), dest


addq $8, %rsp

RSP

13
Implementation of call
RIP (instruction pointer) register
0
points to next instruction
to be executed
Instruction Equivalent to
Note: Can’t really access
pushq src subq $8, %rsp RIP directly, but this is
movq src, (%rsp) implicitly what call does
popq dest movq (%rsp), dest
addq $8, %rsp

call addr pushq %rip RSP


jmp addr before
call

call instruction pushes return addr


(old RIP) onto stack, then jumps 14
Implementation of call
RIP (instruction pointer) register
0
points to next instruction
to be executed
Instruction Equivalent to
pushq src subq $8, %rsp
movq src, (%rsp)

popq dest movq (%rsp), dest


addq $8, %rsp
RSP
Old RIP
call addr pushq %rip after
jmp addr call

15
Implementation of ret
RIP (instruction pointer) register
0
points to next instruction
to be executed
Instruction Equivalent to
pushq src subq $8, %rsp
movq src, (%rsp)

popq dest movq (%rsp), dest


addq $8, %rsp
RSP
Old RIP
call addr pushq %rip before
jmp addr ret
ret popq %rip
ret instruction pops stack, thus
placing return addr (old RIP) into RIP 16
Implementation of ret
RIP (instruction pointer) register
0
points to next instruction
to be executed
Instruction Equivalent to
pushq src subq $8, %rsp
movq src, (%rsp)

popq dest movq (%rsp), dest


addq $8, %rsp

call addr pushq %rip RSP


jmp addr after
ret popq %rip ret

17
Running Example

# long absadd(long a, long b)


absadd:
# long absA, absB, sum

# absA = labs(a)

call labs

# absB = labs(b)

call labs

# sum = absA + absB

# return sum

ret
18
Agenda
Calling and returning
Passing arguments
Storing local variables
Returning a value
Optimization

19
Problem 2: Passing Arguments
Problem:
• How does caller pass arguments to callee?
• How does callee accept parameters from caller?

long absadd(long a, long b)


{
long absA, absB, sum;
absA = labs(a);
absB = labs(b);
sum = absA + absB;
return sum;
}

20
x86-64 Solution 1: Use the Stack

Observations (déjà vu):


• May need to store many arg sets
• The number of arg sets is not known in advance
• Arg set must be saved for as long as the invocation of this
function is live, and discarded thereafter
• Stored arg sets are destroyed in reverse order of creation
• LIFO data structure (stack) is appropriate

21
x86-64 Solution 2: Use Registers
x86-64 solution:
• Pass first 6 (integer or address) arguments in registers for efficiency
• RDI, RSI, RDX, RCX, R8, R9
• More than 6 arguments ⇒
• Pass arguments 7, 8, … on the stack
• (Beyond scope of COS 217)
• Arguments are structures ⇒
• Pass arguments on the stack
• (Beyond scope of COS 217)

Callee function then saves arguments to stack


• Or maybe not!
• See “optimization” later this lecture
• Callee accesses arguments as positive offsets vs. RSP

22
Running Example
# long absadd(long a, long b) 0
absadd:
pushq %rdi # Push a
pushq %rsi # Push b

# long absA, absB, sum



# absA = labs(a)
movq 8(%rsp), %rdi
call labs
… RSP
# absB = labs(b) b
movq 0(%rsp), %rdi RSP+8
call labs a

# sum = absA + absB
Old RIP

# return sum

addq $16, %rsp
ret 23
Agenda
Calling and returning
Passing arguments
Storing local variables
Returning a value
Optimization

24
Problem 3: Storing Local Variables

Where does callee function store its local variables?

long absadd(long a, long b)


{
long absA, absB, sum;
absA = labs(a);
absB = labs(b);
sum = absA + absB;
return sum;
}

25
x86-64 Solution: Use the Stack
Observations (déjà vu again!):
• May need to store many local var sets
• The number of local var sets is not known in advance
• Local var set must be saved for as long as the invocation of this
function is live, and discarded thereafter
• Stored local var sets are destroyed in reverse order of creation
• LIFO data structure (stack) is appropriate

x86-64 solution:
• Use the STACK section of memory
• Or maybe not!
• See later this lecture

26
Running Example
# long absadd(long a, long b)
absadd: 0
pushq %rdi # Push a
pushq %rsi # Push b

# long absA, absB, sum


subq $24, %rsp RSP
sum
# absA = labs(a) RSP+8
movq 32(%rsp), %rdi absB
call labs RSP+16
… absA
# absB = labs(b)
RSP+24
movq 24(%rsp), %rdi b
call labs
… RSP+32
a
# sum = absA + absB
movq 16(%rsp), %rax
addq 8(%rsp), %rax
Old RIP
movq %rax, 0(%rsp)

# return sum

addq $40, %rsp
ret 27
Agenda
Calling and returning
Passing arguments
Storing local variables
Returning a value
Optimization

28
Problem 4: Return Values
Problem:
• How does callee function send return value back to caller function?
• How does caller function access return value?

long absadd(long a, long b)


{
long absA, absB, sum;
absA = labs(a);
absB = labs(b);
sum = absA + absB;
return sum;
}

29
x86-64 Solution: Use RAX
In principle
• Store return value in stack frame of caller

Or, for efficiency


• Known small size ⇒ store return value in register
• Other ⇒ store return value in stack

x86-64 convention
• Integer or address:
• Store return value in RAX
• Floating-point number:
• Store return value in floating-point register
• (Beyond scope of COS 217)
• Structure:
• Store return value on stack
• (Beyond scope of COS 217)
30
Running Example
# long absadd(long a, long b)
absadd:
pushq %rdi # Push a 0
pushq %rsi # Push b

# long absA, absB, sum


subq $24, %rsp
RSP
# absA = labs(a) sum
movq 32(%rsp), %rdi
call labs RSP+8
movq %rax, 16(%rsp)
absB
RSP+16
# absB = labs(b) absA
movq 24(%rsp), %rdi
call labs RSP+24
movq %rax, 8(%rsp)
b
RSP+32
# sum = absA + absB a
movq 16(%rsp), %rax
addq 8(%rsp), %rax
movq %rax, 0(%rsp)
Old RIP
# return sum
movq 0(%rsp), %rax
addq $40, %rsp Complete!
ret

31
Agenda
Calling and returning
Passing arguments
Storing local variables
Returning a value
Optimization

32
Problem 5: Optimization
Observation: Accessing memory is expensive
• More expensive than accessing registers
• For efficiency, want to store parameters and local variables in
registers (and not in memory) when possible

Observation: Registers are a finite resource


• In principle: Each function should have its own registers
• In reality: All functions share same small set of registers

Problem: How do caller and callee use same set of registers


without interference?
• Callee may use register that the caller also is using
• When callee returns control to caller, old register contents may have
been lost
• Caller function cannot continue where it left off

33
x86-64 Solution: Register Conventions
Callee-save registers
• RBX, RBP, R12, R13, R14, R15
• Callee function must preserve contents
• If necessary…
• Callee saves to stack near beginning
• Callee restores from stack near end
Caller-save registers
• RDI, RSI, RDX, RCX, R8, R9, RAX, R10, R11
• Callee function can change contents
• If necessary…
• Caller saves to stack before call
• Caller restores from stack after call
34
Running Example
Local variable handling in unoptimized version:
• At beginning, absadd() allocates space for local variables
(absA, absB, sum) on stack
• Body of absadd() uses stack
• At end, absadd() pops local variables from stack

Local variable handling in optimized version:


• absadd() keeps local variables in R13, R14, R15
• Body of absadd() uses R13, R14, R15
• Must be careful:
• absadd()cannot change contents of R13, R14, or R15
• So absadd() must save R13, R14, and R15 near beginning,
and restore near end

35
Running Example
# long absadd(long a, long b)
absadd:
pushq %r13 # Save R13, use for absA
pushq %r14 # Save R14, use for absB
absadd() stores local
pushq %r15 # Save R15, use for sum
vars in R13, R14, R15, not
# absA = labs(a)
pushq %rsi # Save RSI
in memory
call labs
movq %rax, %r13
popq %rsi # Restore RSI
absadd() cannot destroy
# absB += labs(b)
contents of R13, R14, R15
movq %rsi, %rdi
call labs
movq %rax, %r14
So absadd() must save
# sum = absA + absB
R13, R14, R15 near
movq %r13, %r15
addq %r14, %r15
beginning and restore near
# return sum
end
movq %r15, %rax
popq %r15 # Restore R15
popq %r14 # Restore R14
popq %r13 # Restore R13
ret

36
Running Example
Parameter handling in unoptimized version:
• absadd() accepts parameters (a and b) in RDI and RSI
• At beginning, absadd() copies contents of RDI and RSI to stack
• Body of absadd() uses stack
• At end, absadd() pops parameters from stack

Parameter handling in optimized version:


• absadd() accepts parameters (a and b) in RDI and RSI
• Body of absadd() uses RDI and RSI
• Must be careful:
• Call of labs() could change contents of RDI and/or RSI
• absadd() must save contents of RDI and/or RSI before call of
labs(), and restore contents after call

37
Running Example
# long absadd(long a, long b)
absadd:
pushq %r13 # Save R13, use for absA
pushq %r14 # Save R14, use for absB
absadd() keeps a and b
pushq %r15 # Save R15, use for sum
in RDI and RSI, not in
# absA = labs(a)
pushq %rsi # Save RSI
memory
call labs
movq %rax, %r13
popq %rsi # Restore RSI
labs() can change RDI
# absB += labs(b)
and/or RSI
movq %rsi, %rdi
call labs
movq %rax, %r14
absadd() must retain
# sum = absA + absB
contents of RSI (value of b)
movq %r13, %r15
addq %r14, %r15
across 1st call of labs()
# return sum
movq %r15, %rax
So absadd() must save
popq %r15 # Restore R15
popq %r14 # Restore R14
RSI before call and restore
popq %r13 # Restore R13
ret
RSI after call
38
Non-Optimized vs. Optimized Patterns
Unoptimized pattern
• Parameters and local variables strictly in memory (stack) during
function execution
• Pro: Always possible
• Con: Inefficient
• gcc compiler uses when invoked without –O option

Optimized pattern
• Parameters and local variables mostly in registers during function
execution
• Pro: Efficient
• Con: Sometimes impossible
• More than 6 local variables
• Local variable is a structure or array
• Function computes address of parameter or local variable
• gcc compiler uses when invoked with –O option, when it can! 39
Hybrid Patterns
Hybrids are possible
• Example
• Parameters in registers
• Local variables in memory (stack)

Hybrids are error prone for humans


• Example (continued from previous)
• Step 1: Access local variable  local var is at stack offset X
• Step 2: Push caller-save register
• Step 3: Access local variable  local var is at stack offset X+8!!!
• Step 4: Call labs()
• Step 6: Access local variable  local var is at stack offset X+8!!!
• Step 7: Pop caller-save register
• Step 8: Access local variable  local var is at stack offset X

Avoid hybrids for Assignment 4


40
Summary
Function calls in x86-64 assembly language
Calling and returning
• call instruction pushes RIP onto stack and jumps
• ret instruction pops from stack to RIP

Passing arguments
• Caller copies args to caller-saved registers (in prescribed order)
• Unoptimized pattern:
• Callee pushes args to stack
• Callee uses args as positive offsets from RSP
• Callee pops args from stack
• Optimized pattern:
• Callee keeps args in caller-saved registers
• Be careful!

41
Summary (cont.)

Storing local variables


• Unoptimized pattern:
• Callee pushes local vars onto stack
• Callee uses local vars as positive offsets from RSP
• Callee pops local vars from stack
• Optimized pattern:
• Callee keeps local vars in callee-saved registers
• Be careful!

Returning values
• Callee places return value in RAX
• Caller accesses return value in RAX

42
Putting it all together
Add up the keys of a tree .text
struct tree { .globl sum
int key; sum:
struct tree *left; # LOCAL VARIABLES:
struct tree *right; # %r12=t, %r13d=partial sum
}; pushq %r12
pushq %r13
int sum (struct tree *t) { movq %rdi, %r12
if (t==NULL) cmpq $0, %r12
return 0; jne .L2
else return t->key + movl $0, %eax
sum(t->left) + jmp .L3
sum(t->right); .L2:
} movl 0(%r12), %r13d
movq 8(%r12), %rdi
call sum
t addl
movq
%eax, %r13d
16(%r12), %rdi
3 ////
4 //// call sum
addl %eax, %r13d
0 movl %r13d, %eax
0 .L3:
popq %r13
5 //// popq %r12
3 0
ret

0
4 5 43

You might also like