Offensive Security & Reverse Engineering (OSRE) : Ali Hadi

This document provides an introduction to x86 assembly language. It discusses some key points about x86 assembly including:
- The most commonly used 32-bit instructions and architecture components
- How a simple "Hello World" C program compares when compiled to different assembly outputs
- Important concepts like data types, number bases, endianness, and CPU registers
- Differences between CISC and RISC architectures
The intent is to expose the reader to fundamental x86 assembly concepts in a crash course style format.

Uploaded by

oscar tebar

Offensive Security & Reverse Engineering (OSRE)

Ali Hadi
Intro. to x86 Assembly
“Crash Course”

Today’s lecture has been re-formatted from Xeno Kovah’s “Intro. to X86” course found at Open Security Training …

Module #2
About this Lecture
The intent of this lecture is to expose you to the most
commonly generated assembly instructions, and the most
frequently dealt with architecture hardware.

Covers
• The most commonly generated assembly instructions, and the most frequently dealt with architecture hardware
– 32 bit instructions/hardware
– Implementation of a Stack

Doesn’t Cover
• Floating point instructions/hardware
• 16/64 bit instructions/hardware
• Complicated or rare 32 bit instructions
• Instruction pipeline, caching hierarchy, alternate modes of operation, HW virtualization, etc

@OpenSecurityTraining 4
What you're going to learn
#include <stdio.h>
int main()
{
    printf("Hello World!\n");
return 0x1234;
}

Is the same as…

.text:00401730 main
.text:00401730 push ebp
.text:00401731 mov ebp, esp
.text:00401733 push offset aHelloWorld ; "Hello world\n"
.text:00401738 call ds:__imp__printf
.text:0040173E add esp, 4
.text:00401741 mov eax, 1234h
.text:00401746 pop ebp
.text:00401747 retn

Windows Visual C++ 2005, /GS (buffer overflow protection) option turned off
Disassembled with IDA Pro 4.9 Free Version

Is the same as…
08048374 <main>:
8048374: 8d 4c 24 04 lea 0x4(%esp),%ecx
8048378: 83 e4 f0 and $0xfffffff0,%esp
804837b: ff 71 fc pushl -0x4(%ecx)
804837e: 55 push %ebp
804837f: 89 e5 mov %esp,%ebp
8048381: 51 push %ecx
8048382: 83 ec 04 sub $0x4,%esp
8048385: c7 04 24 60 84 04 08 movl $0x8048460,(%esp)
804838c: e8 43 ff ff ff call 80482d4 <puts@plt>
8048391: b8 34 12 00 00 mov $0x1234,%eax
8048396: 83 c4 04 add $0x4,%esp
8048399: 59 pop %ecx
804839a: 5d pop %ebp
804839b: 8d 61 fc lea -0x4(%ecx),%esp
804839e: c3 ret
804839f: 90 nop

Ubuntu 8.04, GCC 4.2.4


Disassembled with “objdump -d”
Is the same as…
_main:
00001fca pushl %ebp
00001fcb movl %esp,%ebp
00001fcd pushl %ebx
00001fce subl $0x14,%esp
00001fd1 calll 0x00001fd6
00001fd6 popl %ebx
00001fd7 leal 0x0000001a(%ebx),%eax
00001fdd movl %eax,(%esp)
00001fe0 calll 0x00003005 ; symbol stub for: _puts
00001fe5 movl $0x00001234,%eax
00001fea addl $0x14,%esp
00001fed popl %ebx
00001fee leave
00001fef ret

Mac OS 10.5.6, GCC 4.0.1


Disassembled from command line with “otool -tV”
But it all boils down to…
.text:00401000 main
.text:00401000 push offset aHelloWorld ; "Hello world\n"
.text:00401005 call ds:__imp__printf
.text:0040100B pop ecx
.text:0040100C mov eax, 1234h
.text:00401011 retn

Windows Visual C++ 2005, /GS (buffer overflow protection) option turned off
Optimize for minimum size (/O1) turned on
Disassembled with IDA Pro 4.9 Free Version

Instructions Needed
• By one measure, only 14 assembly instructions account for 90% of code!
– http://www.blackhat.com/presentations/bh-usa-06/BH-US-06-Bilar.pdf
• Knowing about 20-30 (not counting variations) is good enough that you will have to check the manual very infrequently
• You've already seen 11 instructions, just in the hello world variations!

Refresher(s)

let’s remember some basics …


Data Types

Size                         In C
Byte (8 bits)                char
Word (16 bits)               short
Doubleword (32 bits)         int/long
Quadword (64 bits)           double/long long
Double quadword (128 bits)   ?-> long double?

Decimal, Binary, Hexadecimal
Decimal (base 10) Binary (base 2) Hex (base 16)
00 0000b 0x00
01 0001b 0x01
02 0010b 0x02
03 0011b 0x03
04 0100b 0x04
05 0101b 0x05
06 0110b 0x06
07 0111b 0x07
08 1000b 0x08
09 1001b 0x09
10 1010b 0x0A
11 1011b 0x0B
12 1100b 0x0C
13 1101b 0x0D
14 1110b 0x0E
15 1111b 0x0F

Negative Numbers
• “one's complement” = flip all bits. 0->1, 1->0
• “two's complement” = one's complement + 1
• Negative numbers are defined as the “two's complement” of the
positive number

Number              One's Comp.         Two's Comp. (negative)
00000001b : 0x01    11111110b : 0xFE    11111111b : 0xFF : -1
00000100b : 0x04    11111011b : 0xFB    11111100b : 0xFC : -4
00011010b : 0x1A    11100101b : 0xE5    11100110b : 0xE6 : -26
?                   ?                   10110000b : 0xB0 : -?
• 0x01 to 0x7F positive byte, 0x80 to 0xFF negative byte
• 0x00000001 to 0x7FFFFFFF positive dword
• 0x80000000 to 0xFFFFFFFF negative dword
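
The rule above (flip all bits, then add 1) can be tried out in a few lines of C. This is only a sketch for single bytes; the helper names `ones_complement8`, `twos_complement8` and `as_signed8` are made up for this illustration and are not from the lecture.

```c
#include <stdint.h>

/* One's complement of a byte: flip every bit (0->1, 1->0). */
uint8_t ones_complement8(uint8_t value) {
    return (uint8_t)~value;
}

/* Two's complement of a byte: one's complement plus 1, wrapping at 8 bits. */
uint8_t twos_complement8(uint8_t value) {
    return (uint8_t)(ones_complement8(value) + 1u);
}

/* Read a raw byte as a signed number: 0x80 to 0xFF are the negative bytes. */
int as_signed8(uint8_t raw) {
    return raw >= 0x80 ? (int)raw - 256 : (int)raw;
}
```

For the third row of the table, `twos_complement8(0x1A)` gives 0xE6, and `as_signed8(0xE6)` gives -26.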

Architecture(s)

the machines world …


CISC vs. RISC
• Intel is CISC - Complex Instruction Set Computer
– Many very special purpose instructions that you will never see, and a given compiler may never use
• just need to know how to use the manual
– Variable-length instructions, between 1 and 15 bytes long.
• 15 bytes is the architectural limit; the processor rejects anything longer
• Other major architectures are typically RISC - Reduced Instruction Set Computer
– Typically more registers, and fewer, fixed-size instructions
– Examples: PowerPC, ARM, SPARC, MIPS

Endian
• Endianness comes from Jonathan Swift's Gulliver's Travels. It doesn't matter which way you eat your eggs :)
• Little Endian - 0x12345678 stored in RAM “little end” first. The least significant byte of a word or larger is stored in the lowest address. E.g. the bytes in memory, from the lowest address up, are 78 56 34 12
– Intel is Little Endian
• Big Endian - 0x12345678 stored as is, i.e. 12 34 56 78 from the lowest address up.
– Network traffic is Big Endian
– Most of the others you've heard of (PowerPC, ARM, SPARC, MIPS) are either Big Endian by default or can be configured as either (Bi-Endian)
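
To see the two byte orders side by side, here is a small C sketch that writes a dword into a byte array both ways. The function names `store_le32`, `store_be32` and `host_is_little_endian` are invented for this example; the last one simply peeks at the first byte of an integer on whatever machine runs it.

```c
#include <stdint.h>

/* Little endian: least significant byte goes to the lowest address (out[0]). */
void store_le32(uint32_t value, uint8_t out[4]) {
    out[0] = (uint8_t)(value & 0xFF);
    out[1] = (uint8_t)((value >> 8) & 0xFF);
    out[2] = (uint8_t)((value >> 16) & 0xFF);
    out[3] = (uint8_t)((value >> 24) & 0xFF);
}

/* Big endian (network order): most significant byte goes to the lowest address. */
void store_be32(uint32_t value, uint8_t out[4]) {
    out[0] = (uint8_t)((value >> 24) & 0xFF);
    out[1] = (uint8_t)((value >> 16) & 0xFF);
    out[2] = (uint8_t)((value >> 8) & 0xFF);
    out[3] = (uint8_t)(value & 0xFF);
}

/* Returns 1 if the machine running this code stores the low byte first. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}
```

With 0x12345678 as input, `store_le32` fills the array with 78 56 34 12 and `store_be32` with 12 34 56 78. On an Intel machine `host_is_little_endian()` is expected to return 1.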

Endianness Pictures
Register (both cases): FE ED FA CE

Address   Big Endian (Others)   Little Endian (Intel)
(High Memory Addresses)
0x5       00                    00
0x4       00                    00
0x3       CE                    FE
0x2       FA                    ED
0x1       ED                    FA
0x0       FE                    CE
(Low Memory Addresses)

Registers
• Registers are small memory storage areas built into the
processor (still volatile memory)
• 8 “general purpose” registers + the instruction pointer which
points at the next instruction to execute
– But two of the 8 are not that general
• On x86-32, registers are 32 bits long
• On x86-64, they're 64 bits

Register Conventions
• These are Intel's suggestions to compiler developers (and
assembly handcoders). Registers don't have to be used
these ways, but if you see them being used like this, you'll
know why.
• EAX - Stores function return values
• EBX - Base pointer to the data section
• ECX - Counter for string and loop operations
• EDX - I/O pointer

Intel Arch v1 Section 3.4.1 - General-Purpose Registers

Register Conventions – Cont.
• ESI - Source pointer for string operations
• EDI - Destination pointer for string operations
• ESP - Stack pointer
• EBP - Stack frame base pointer
• EIP - Pointer to next instruction to execute (“instruction
pointer”)

Register Conventions – Cont.
• Caller-save registers - EAX, EDX, ECX
– If the caller has anything in the registers that it cares
about, the caller is in charge of saving the value before a
call to a subroutine, and restoring the value after the call
returns
– Put another way - the callee can (and is highly likely to)
modify values in caller-save registers

Register Conventions – Cont.
• Callee-save registers - EBP, EBX, ESI, EDI
– If the callee needs to use more registers than are saved by
the caller, the callee is responsible for making sure the
values are stored/restored
– Put another way - the callee must be a good citizen and
not modify registers which the caller didn't save, unless
the callee itself saves and restores the existing values

Registers - 8/16/32 bit Addressing

[Figure: 8/16/32 bit register addressing diagrams]
http://www.sandpile.org/ia32/reg.htm
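
As a rough stand-in for the register diagrams, the C sketch below shows how the 16 bit and 8 bit names overlay a 32 bit register: AX is the low 16 bits of EAX, AL is bits 0-7 and AH is bits 8-15. The function names `reg_ax`, `reg_al` and `reg_ah` are made up for this illustration.

```c
#include <stdint.h>

/* AX: the low 16 bits of EAX. */
uint16_t reg_ax(uint32_t eax) {
    return (uint16_t)(eax & 0xFFFFu);
}

/* AL: the low 8 bits of EAX (bits 0-7). */
uint8_t reg_al(uint32_t eax) {
    return (uint8_t)(eax & 0xFFu);
}

/* AH: the second byte of EAX (bits 8-15). */
uint8_t reg_ah(uint32_t eax) {
    return (uint8_t)((eax >> 8) & 0xFFu);
}
```

With EAX = 0x12345678, AX is 0x5678, AH is 0x56 and AL is 0x78. The same overlay applies to EBX, ECX and EDX.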
EFLAGS
• EFLAGS register holds many single bit flags.
• Remember the following for now:
– Zero Flag (ZF) - Set if the result of some instruction is zero;
cleared otherwise
– Sign Flag (SF) - Set equal to the most-significant bit of the
result, which is the sign bit of a signed integer. (0
indicates a positive value and 1 indicates a negative
value.)

Intel Vol 1 Sec 3.4.3 - page 3-20
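
The two flag definitions above translate almost word for word into C. This is a sketch for 32 bit results only, and the names `zero_flag` and `sign_flag` are assumptions made for the example.

```c
#include <stdint.h>

/* Zero Flag: 1 if the 32 bit result is zero, 0 otherwise. */
int zero_flag(uint32_t result) {
    return result == 0;
}

/* Sign Flag: a copy of the most significant bit (bit 31) of the result. */
int sign_flag(uint32_t result) {
    return (int)((result >> 31) & 1u);
}
```

For example, the result of 3 - 4 is 0xFFFFFFFF, so `sign_flag` returns 1 and `zero_flag` returns 0.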

1
Your first x86 instruction:
NOP
• NOP - No Operation! No registers, no values, no nothin'!
• Just there to pad/align bytes, or to delay time
• Bad guys use it to make simple exploits more reliable
– We’ll get to this later ☺

Extra! Extra!
Late-breaking NOP news!
• Amaze those who know x86 by citing this interesting bit of
trivia:
• “The one-byte NOP instruction is an alias mnemonic for the
XCHG (E)AX, (E)AX instruction.”
• XCHG instruction is not officially in this class. But if I hadn't
just told you what it does, I bet you would have guessed right
anyway.

The Stack
• The stack is a conceptual area of main memory (RAM) which
is designated by the OS when a program is started.
– Different OS start it at different addresses by convention
• A stack is a Last-In-First-Out (LIFO/FILO) data structure where
data is "pushed" on to the top of the stack and "popped" off
the top.
• By convention the stack grows toward lower memory
addresses.
• Adding something to the stack means the top of the stack is
now at a lower memory address.

The Stack – Cont.
• As already mentioned, ESP points to the top of the stack, the
lowest address which is being used
– While data will exist at addresses beyond the top of the stack, it is
considered undefined
• The stack keeps track of which functions were called before
the current one, it holds local variables and is frequently used
to pass arguments to the next function to be called.
• A firm understanding of what is happening on the stack is
*essential* to understanding a program's operation.

2
PUSH
Push Word, Dword, Qword onto the Stack
• For our purposes, it will always be a DWORD (4 bytes).
– Can either be an immediate (a numeric constant), or the value in a
register
• The push instruction automatically decrements the stack
pointer ESP by 4.

push eax

Registers Before      Registers After
eax 0x00000003        eax 0x00000003
esp 0x0012FF8C        esp 0x0012FF88

Address       Stack Before          Stack After
0x0012FF90    0x00000001            0x00000001
0x0012FF8C    0x00000002 <- esp     0x00000002
0x0012FF88    undef                 0x00000003 <- esp
0x0012FF84    undef                 undef
0x0012FF80    undef                 undef

3
POP
Pop a Value from the Stack
• Take a DWORD off the stack, put it in a register, and increment
ESP by 4
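
PUSH and POP can be modelled with a toy stack in C: a small block of "RAM" plus an ESP value. This is only a sketch, not how a CPU is implemented; `ToyStack`, `toy_push`, `toy_pop` and the chosen address window are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BASE 0x0012FF80u  /* lowest address the toy model covers */
#define STACK_SIZE 0x20u        /* number of bytes the toy model covers */

typedef struct {
    uint8_t  ram[STACK_SIZE];   /* simulated stack memory */
    uint32_t esp;               /* simulated stack pointer (an address) */
} ToyStack;

/* PUSH: decrement ESP by 4, then store the dword at the new top of stack. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= STACK_BASE + 4);
    s->esp -= 4;
    memcpy(&s->ram[s->esp - STACK_BASE], &value, 4);
}

/* POP: read the dword at the top of stack, then increment ESP by 4. */
uint32_t toy_pop(ToyStack *s) {
    uint32_t value;
    assert(s->esp + 4 <= STACK_BASE + STACK_SIZE);
    memcpy(&value, &s->ram[s->esp - STACK_BASE], 4);
    s->esp += 4;
    return value;
}
```

Starting with ESP = 0x0012FF8C, pushing 0x00000003 leaves ESP at 0x0012FF88, and popping it again returns 0x00000003 with ESP back at 0x0012FF8C. Note that the popped bytes are still sitting in `ram`; they are just considered undefined.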

pop eax

Registers Before      Registers After
eax 0xFFFFFFFF        eax 0x00000003
esp 0x0012FF88        esp 0x0012FF8C

Address       Stack Before          Stack After
0x0012FF90    0x00000001            0x00000001
0x0012FF8C    0x00000002            0x00000002 <- esp
0x0012FF88    0x00000003 <- esp     undef (0x00000003)
0x0012FF84    undef                 undef
0x0012FF80    undef                 undef

Calling Conventions
• How code calls a subroutine is compiler-dependent and
configurable. But there are a few conventions.
• We will only deal with the “cdecl” and “stdcall” conventions.
• More info at
– http://en.wikipedia.org/wiki/X86_calling_conventions
– http://www.programmersheaven.com/2/Calling-conventions

Calling Conventions - cdecl
• “C declaration” - most common calling convention
• Function parameters pushed onto stack right to left
• Saves the old stack frame pointer and sets up a new stack
frame
• EAX or EDX:EAX returns the result for primitive data types
• Caller is responsible for cleaning up the stack

Calling Conventions - stdcall
• Typically used by Microsoft C++ code (ex: Win32 API)
• Function parameters pushed onto stack right to left
• Saves the old stack frame pointer and sets up a new stack
frame
• EAX or EDX:EAX returns the result for primitive data types
• Callee responsible for cleaning up any stack parameters it
takes

4 CALL
Call Procedure
• CALL's job is to transfer control to a different function, in a
way that control can later be resumed where it left off
• First it pushes the address of the next instruction onto the
stack
– For use by RET for when the procedure is done
• Then it changes EIP to the address given in the instruction
• Destination address can be specified in multiple ways
– Absolute address
– Relative address (relative to the end of the instruction)

5
RET
Return from Procedure
Two forms
• Pop the top of the stack into EIP (remember pop
increments stack pointer)
– In this form, the instruction is just written as “ret”
– Typically used by cdecl functions
• Pop the top of the stack into EIP and add a constant number
of bytes to ESP
– In this form, the instruction is written as “ret 0x8”, or “ret 0x20”, etc
– Typically used by stdcall functions
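
The difference between the two RET forms is easiest to see by tracking ESP. The C sketch below models CALL and RET on a toy CPU; `ToyCpu`, `push32`, `pop32`, `toy_call` and `toy_ret` are hypothetical names, and the tiny word-addressed memory is an assumption made to keep the example short.

```c
#include <stdint.h>

/* Toy CPU state: a few stack words, ESP and EIP. */
typedef struct {
    uint32_t mem[64];   /* stack memory, one entry per dword (index = address / 4) */
    uint32_t esp;       /* address of the top of the stack */
    uint32_t eip;       /* address of the next instruction to execute */
} ToyCpu;

void push32(ToyCpu *c, uint32_t value) {
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

uint32_t pop32(ToyCpu *c) {
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* CALL: push the address of the next instruction, then jump to the target. */
void toy_call(ToyCpu *c, uint32_t next_instruction, uint32_t target) {
    push32(c, next_instruction);
    c->eip = target;
}

/* RET: pop the return address into EIP, then add arg_bytes to ESP.
   arg_bytes == 0 models plain "ret"; arg_bytes == 8 models "ret 0x8". */
void toy_ret(ToyCpu *c, uint32_t arg_bytes) {
    c->eip = pop32(c);
    c->esp += arg_bytes;
}
```

With two dword arguments pushed, `toy_ret(&c, 0)` leaves the 8 bytes of arguments on the stack for the caller to remove (cdecl style, e.g. with add esp, 8), while `toy_ret(&c, 8)` removes them in the callee (stdcall style). Either way ESP ends up where it was before the arguments were pushed.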

Kinda book p. 133


6
MOV
Move
• Can move:
– register to register
– memory to register, register to memory
– immediate to register, immediate to memory
• Never memory to memory!
• Memory addresses are given in r/m32 form (coming later)

General Stack Frame Operation
We are going to pretend that main() is the very first function being executed
in a program. This is what its stack looks like to start with (assuming it has
any local variables).

stack bottom
main() frame:
    Local Variables
undef
undef
stack top

Stack Frame Operation – Cont.
When main() decides to call a subroutine, main() becomes “the caller”. We will
assume main() has some registers it would like to remain the same, so it will save
them. We will also assume that the callee function takes some input arguments.

stack bottom
main() frame:
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
undef
undef
stack top

Stack Frame Operation – Cont.
When main() actually issues the CALL instruction, the return address gets saved
onto the stack, and because the next instruction after the call will be the beginning
of the called function, we consider the frame to have changed to the callee.

stack bottom
main() frame:
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
    Caller's saved return address
undef
undef
stack top

Stack Frame Operation – Cont.
When foo() starts, the frame pointer (EBP) still points to main()'s frame. So the
first thing it does is to save the old frame pointer on the stack and set the new
value to point to its own frame.
stack bottom
main() frame:
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
    Caller's saved return address
foo()'s frame:
    Saved Frame Pointer
undef
…
stack top

Stack Frame Operation – Cont.
Next, we'll assume the callee foo() would like to use all the registers, and must therefore save the callee-save registers. Then it will allocate space for its local variables.
stack bottom
main() frame:
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
    Caller's saved return address
foo()'s frame:
    Saved Frame Pointer
    Callee-Save Registers
    Local Variables
undef
…
stack top

Stack Frame Operation – Cont.
At this point, foo() decides it wants to call bar(). It is still the callee-of-
main(), but it will now be the caller-of-bar. So it saves any caller-save
registers that it needs to. It then puts the function arguments on the stack as
well.
stack bottom
main() frame
foo()'s frame:
    Saved Frame Pointer
    Callee-Save Registers
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
undef
…
stack top

General Stack Frame Layout
Every part of the stack frame is technically optional (that is, you can hand
code asm without following the conventions.)
But compilers generate code which uses portions if they are needed. Which
pieces are used can sometimes be manipulated with compiler options. (E.g.
omit frame pointers, changing calling convention to pass arguments in
registers, etc.)

stack bottom
main() frame
foo()'s frame:
    Saved Frame Pointer
    Callee-Save Registers
    Local Variables
    Caller-Save Registers
    Arguments to Pass to Callee
undef
…
stack top

Stack Frames are a Linked List!

The EBP in the current frame points at the saved EBP of the previous frame.

stack bottom

main() frame
foo()'s frame
bar()'s frame

stack top
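
Because each saved EBP points at the previous frame's saved EBP, a debugger can walk the chain to list the callers. The C sketch below does this over a hand-made snapshot of stack memory. `StackWord`, `read_dword` and `walk_frames` are hypothetical names, and the sketch assumes every function set up a frame with push ebp / mov ebp, esp, so the return address sits 4 bytes above the saved EBP.

```c
#include <stddef.h>
#include <stdint.h>

/* One dword of simulated stack memory: where it lives and what it holds. */
typedef struct {
    uint32_t address;
    uint32_t value;
} StackWord;

/* Look up the dword stored at `address`; returns 0 if the snapshot lacks it. */
static int read_dword(const StackWord *stack, size_t count,
                      uint32_t address, uint32_t *out) {
    for (size_t i = 0; i < count; i++) {
        if (stack[i].address == address) {
            *out = stack[i].value;
            return 1;
        }
    }
    return 0;
}

/* Follow the chain of saved EBP values starting at `ebp`.
   [ebp] holds the previous frame's EBP, [ebp+4] holds the return address.
   Stores one return address per frame and returns the number of frames seen. */
size_t walk_frames(const StackWord *stack, size_t count, uint32_t ebp,
                   uint32_t *return_addresses, size_t max_frames) {
    size_t frames = 0;
    uint32_t saved_ebp, return_address;
    while (frames < max_frames &&
           read_dword(stack, count, ebp, &saved_ebp) &&
           read_dword(stack, count, ebp + 4, &return_address)) {
        return_addresses[frames++] = return_address;
        ebp = saved_ebp;
    }
    return frames;
}
```

Fed with the stack values from the Example1.c walkthrough later in this lecture (EBP = 0x0012FF60 inside sub()), the walk yields the return addresses 0x00401018 and then 0x004012E8, and stops when the next saved EBP points outside the snapshot.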

Example1.c
//Example1 - using the stack
//to call subroutines
//New instructions:
//push, pop, call, ret, mov
int sub(){
    return 0xbeef;
}
int main(){
    sub();
    return 0xf00d;
}

sub:
00401000 push ebp
00401001 mov ebp,esp
00401003 mov eax,0BEEFh
00401008 pop ebp
00401009 ret
main:
00401010 push ebp
00401011 mov ebp,esp
00401013 call sub (401000h)
00401018 mov eax,0F00Dh
0040101D pop ebp
0040101E ret

The stack frames in this example will be very simple.
Only saved frame pointer (EBP) and saved return addresses (EIP).
Example1.c 1:
EIP = 00401010, but no instruction yet executed
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value

eax 0x003435C0 ⌘
ebp 0x0012FFB8 ⌘
esp 0x0012FF6C ⌘

0x0012FF6C 0x004012E8 ⌘ (belongs to the frame *before* main() is called)
0x0012FF68 undef
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 2
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401010 push ebp (in main)

eax 0x003435C0 ⌘
ebp 0x0012FFB8 ⌘
esp 0x0012FF68 ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8 ♍
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 3
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401011 mov ebp,esp (in main)

eax 0x003435C0 ⌘
ebp 0x0012FF68 ♍
esp 0x0012FF68

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 4
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401013 call sub (401000h) (in main)

eax 0x003435C0 ⌘
ebp 0x0012FF68
esp 0x0012FF64 ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018 ♍
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 5
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401000 push ebp (in sub)

eax 0x003435C0 ⌘
ebp 0x0012FF68
esp 0x0012FF60 ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018
0x0012FF60 0x0012FF68 ♍
0x0012FF5C undef
0x0012FF58 undef
Example1.c 6
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401001 mov ebp,esp (in sub)

eax 0x003435C0 ⌘
ebp 0x0012FF60 ♍
esp 0x0012FF60

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018
0x0012FF60 0x0012FF68
0x0012FF5C undef
0x0012FF58 undef
Example1.c 6
STACK FRAME TIME OUT

“Function-before-main”'s frame:
0x0012FF6C 0x004012E8 ⌘
main's frame (saved frame pointer and saved return address):
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018
sub's frame (only saved frame pointer, because it doesn't call anything else, and doesn't have local variables):
0x0012FF60 0x0012FF68
Unused:
0x0012FF5C undef
0x0012FF58 undef
Example1.c 7
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401003 mov eax,0BEEFh (in sub)

eax 0x0000BEEF ♍
ebp 0x0012FF60
esp 0x0012FF60

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018
0x0012FF60 0x0012FF68
0x0012FF5C undef
0x0012FF58 undef
Example1.c 8
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401008 pop ebp (in sub)

eax 0x0000BEEF
ebp 0x0012FF68 ♍
esp 0x0012FF64 ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 0x00401018
0x0012FF60 undef ♍
0x0012FF5C undef
0x0012FF58 undef
Example1.c 9
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401009 ret (in sub)

eax 0x0000BEEF
ebp 0x0012FF68
esp 0x0012FF68 ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 undef ♍
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 9 – Cont.
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 00401018 mov eax,0F00Dh (in main)

eax 0x0000F00D ♍
ebp 0x0012FF68
esp 0x0012FF68

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 0x0012FFB8
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 10
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 0040101D pop ebp (in main)

eax 0x0000F00D
ebp 0x0012FFB8 ♍
esp 0x0012FF6C ♍

0x0012FF6C 0x004012E8 ⌘
0x0012FF68 undef ♍
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef
Example1.c 11
Key: ⌧ executed instruction, ♍ modified value, ⌘ start value
⌧ 0040101E ret (in main)

eax 0x0000F00D
ebp 0x0012FFB8
esp 0x0012FF70 ♍

0x0012FF6C undef ♍
0x0012FF68 undef
0x0012FF64 undef
0x0012FF60 undef
0x0012FF5C undef
0x0012FF58 undef

Execution would continue at the value ret removed from the stack: 0x004012E8
Example1 Notes
• sub() is dead code - its return value is not used for anything, and main always returns 0xF00D.
• If optimizations are turned on in the compiler, it would
remove sub()
• Also, because there are no input parameters to sub(), there is
no difference whether we compile as cdecl vs stdcall calling
conventions

"r/m32" Addressing Forms
• Anywhere you see an r/m32 it means it could be taking a
value either from a register or a memory address
• I'm just calling these “r/m32 forms” because anywhere you
see “r/m32” in the manual, the instruction can be a variation
of the forms in the next slide

More info: Intel v2a, Section 2.1.5 page 2-4, in particular Tables 2-2 and 2-3
"r/m32" Addressing – Cont.
• In Intel syntax, most of the time square brackets [] means to
treat the value within as a memory address, and fetch the
value at that address (like dereferencing a pointer)
– mov eax, ebx
– mov eax, [ebx]
– mov eax, [ebx+ecx*X] (X=1, 2, 4, 8)
– mov eax, [ebx+ecx*X+Y] (Y = one byte, sign-extended, -128 to 127, or 4 bytes)
• Most complicated form is: [base + index*scale + disp]

More info: Intel v2a, Section 2.1.5 page 2-4, in particular Tables 2-2 and 2-3
7
LEA
Load Effective Address
• Frequently used with pointer arithmetic, sometimes for just
arithmetic in general
• Uses the r/m32 form but is the exception to the rule that the
square brackets [ ] syntax means dereference (“value at”)

• Example: suppose ebx = 0x2, edx = 0x1000


– lea eax, [edx+ebx*2]
– eax = 0x1004, not the value at 0x1004
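
The difference between LEA and a MOV that reads memory can be sketched in C. Both compute the same address from base + index*scale + disp; LEA stops there, while MOV goes on to fetch the value stored at that address. The names `toy_lea` and `toy_mov_load`, and the word-array model of memory, are assumptions made for this illustration.

```c
#include <stdint.h>

/* lea reg, [base + index*scale + disp]: compute the address and nothing more.
   scale is expected to be 1, 2, 4 or 8; the sum wraps at 32 bits. */
uint32_t toy_lea(uint32_t base, uint32_t index, uint32_t scale, uint32_t disp) {
    return base + index * scale + disp;
}

/* mov reg, [base + index*scale + disp]: compute the same address, then
   dereference it. memory_words models RAM as dwords (index = address / 4). */
uint32_t toy_mov_load(const uint32_t *memory_words, uint32_t base,
                      uint32_t index, uint32_t scale, uint32_t disp) {
    uint32_t address = toy_lea(base, index, scale, disp);
    return memory_words[address / 4];
}
```

With ebx = 0x2 and edx = 0x1000 as in the example above, `toy_lea(0x1000, 0x2, 2, 0)` returns 0x1004, while `toy_mov_load` with the same operands returns whatever dword the model holds at address 0x1004.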

8 9
ADD and SUB
• Adds or Subtracts, just as expected
• Destination operand can be r/m32 or register
• Source operand can be r/m32 or register or immediate
• No source and destination as r/m32s, because that could
allow for memory to memory transfer, which isn't allowed
on x86
• Evaluates the operation as if it were on signed AND unsigned
data, and sets flags as appropriate. Instructions modify OF,
SF, ZF, AF, PF, and CF flags
• add esp, 8
• sub eax, [ebx*2]
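
To make "evaluates the operation as if it were on signed AND unsigned data" concrete, the C sketch below models a 32 bit ADD and four of the flags it sets. CF reports unsigned overflow and OF reports signed overflow of the very same bit pattern. `AddResult` and `toy_add32` are hypothetical names; AF and PF are left out to keep the sketch short.

```c
#include <stdint.h>

typedef struct {
    uint32_t result;
    int cf;   /* Carry Flag: unsigned overflow (carry out of bit 31) */
    int of;   /* Overflow Flag: signed overflow */
    int zf;   /* Zero Flag: result is zero */
    int sf;   /* Sign Flag: copy of bit 31 of the result */
} AddResult;

AddResult toy_add32(uint32_t dst, uint32_t src) {
    AddResult r;
    r.result = dst + src;                 /* wraps modulo 2^32 */
    r.cf = r.result < dst;                /* wrapped around, so there was a carry */
    /* Signed overflow: both operands have the same sign bit,
       but the result's sign bit differs from theirs. */
    r.of = (int)((~(dst ^ src) & (dst ^ r.result)) >> 31);
    r.zf = r.result == 0;
    r.sf = (int)(r.result >> 31);
    return r;
}
```

Adding 1 to 0x7FFFFFFF sets OF and SF but not CF (the largest positive dword became negative). Adding 1 to 0xFFFFFFFF sets CF and ZF but not OF (as signed numbers that is just -1 + 1 = 0).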

Add p. 202, Sub p. 210


Example2.c - 1
Key: executed instruction ⌧, modified value ♍, arbitrary example start value ⌘

.text:00000000 _sub:  push ebp
.text:00000001        mov ebp, esp
.text:00000003        mov eax, [ebp+8]
.text:00000006        mov ecx, [ebp+0Ch]
.text:00000009        lea eax, [ecx+eax*2]
.text:0000000C        pop ebp
.text:0000000D        retn
.text:00000010 _main: push ebp ⌧
.text:00000011        mov ebp, esp
.text:00000013        push ecx
.text:00000014        mov eax, [ebp+0Ch]
.text:00000017        mov ecx, [eax+4]
.text:0000001A        push ecx
.text:0000001B        call dword ptr ds:__imp__atoi
.text:00000021        add esp, 4
.text:00000024        mov [ebp-4], eax
.text:00000027        mov edx, [ebp-4]
.text:0000002A        push edx
.text:0000002B        mov eax, [ebp+8]
.text:0000002E        push eax
.text:0000002F        call _sub
.text:00000034        add esp, 8
.text:00000037        mov esp, ebp
.text:00000039        pop ebp
.text:0000003A        retn

eax 0xcafe ⌘
ecx 0xbabe ⌘
edx 0xfeed ⌘
ebp 0x0012FF50 ⌘
esp 0x0012FF24 ♍

0x0012FF30 0x12FFB0 (char ** argv) ⌘
0x0012FF2C 0x2 (int argc) ⌘
0x0012FF28 Addr after “call _main” ⌘
0x0012FF24 0x0012FF50 (saved ebp) ♍
0x0012FF20 undef
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
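
Reading the Example2 listing backwards, one plausible C source is sketched below. It is an inference from the disassembly, not the original file: the parameter names `x` and `y` are guesses, and `example2_main` is a stand-in name for main() so that the sketch can be called from other code.

```c
#include <stdlib.h>

/* _sub: eax = [ebp+8] (first argument), ecx = [ebp+0Ch] (second argument),
   then lea eax, [ecx+eax*2] leaves second + 2*first in eax. */
int sub(int x, int y) {
    return 2 * x + y;
}

/* Stand-in for _main (renamed for this sketch). */
int example2_main(int argc, char **argv) {
    int a = atoi(argv[1]);   /* result of atoi stored at [ebp-4] */
    return sub(argc, a);     /* push a, push argc, call _sub, add esp, 8 */
}
```

With argc = 2 and argv[1] = "256" (0x100, the arbitrary atoi result used in the walkthrough), the value left in eax would be 2*2 + 0x100 = 0x104.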
Example2.c - 2
⌧ .text:00000011 mov ebp, esp (in _main)

eax 0xcafe
ecx 0xbabe
edx 0xfeed
ebp 0x0012FF24 ♍
esp 0x0012FF24

0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 undef
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2.c - 3
⌧ .text:00000013 push ecx (in _main)

Caller-save, or space for local var? This time it turns out to be space for local var since there is no corresponding pop, and the address is used later to refer to the value we know is stored in a.

eax 0xcafe
ecx 0xbabe
edx 0xfeed
ebp 0x0012FF24
esp 0x0012FF20 ♍

0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0xbabe (int a) ♍
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2.c - 4
⌧ .text:00000014 mov eax, [ebp+0Ch] (in _main)

Getting the base of the argv char * array (the address of argv[0])

eax 0x12FFB0 ♍
ecx 0xbabe
edx 0xfeed
ebp 0x0012FF24
esp 0x0012FF20

0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0xbabe (int a)
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2.c - 5
⌧ .text:00000017 mov ecx, [eax+4] (in _main)

Getting the char * at argv[1] (I chose 0x12FFD4 arbitrarily since it's out of the stack scope we're currently looking at)

eax 0x12FFB0
ecx 0x12FFD4 ♍ (arbitrary ⌘)
edx 0xfeed
ebp 0x0012FF24
esp 0x0012FF20

0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0xbabe (int a)
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2 - 6

Executed in this step (⌧):
.text:0000001A push ecx
.text:0000001B call dword ptr ds:__imp__atoi
.text:00000021 add esp, 4

Registers (♍ = modified):
eax 0x100 ♍ (arbitrary ⌘)
ecx 0x12FFD4
edx 0xfeed
ebp 0x0012FF24
esp 0x0012FF20

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0xbabe (int a)
0x0012FF1C undef ♍
0x0012FF18 undef ♍
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef

Saving some slides… This will push the address of the string at argv[1] (0x12FFD4). atoi() will read the string, turn it into an int, put that int in eax, and return. Then adding 4 to esp undoes the push of the input parameter and makes 0x12FF1C undefined again (this caller-side cleanup is indicative of cdecl).
Example2 - 7

Executed in this step (⌧):
.text:00000024 mov [ebp-4], eax
.text:00000027 mov edx, [ebp-4]
.text:0000002A push edx

Registers (♍ = modified):
eax 0x100
ecx 0x12FFD4
edx 0x100 ♍
ebp 0x0012FF24
esp 0x0012FF1C ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a) ♍
0x0012FF1C 0x100 (int y) ♍
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef

First setting “a” equal to the return value. Then pushing “a” as the second parameter in sub(). We can see an obvious optimization would have been to replace the last two instructions with “push eax”.
Example2 - 8

Executed in this step (⌧):
.text:0000002B mov eax, [ebp+8]
.text:0000002E push eax

Registers (♍ = modified):
eax 0x2 ♍
ecx 0x12FFD4
edx 0x100
ebp 0x0012FF24
esp 0x0012FF18 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x) ♍
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef

Pushing argc as the first parameter (int x) to sub().
Example2 - 9

Executed in this step (⌧):
.text:0000002F call _sub

Registers (♍ = modified):
eax 0x2
ecx 0x12FFD4
edx 0x100
ebp 0x0012FF24
esp 0x0012FF14 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 0x00000034 ♍
0x0012FF10 undef
0x0012FF0C undef
Example2 - 10

Executed in this step (⌧):
.text:00000000 _sub: push ebp
.text:00000001 mov ebp, esp

Registers (♍ = modified):
eax 0x2
ecx 0x12FFD4
edx 0x100
ebp 0x0012FF10 ♍
esp 0x0012FF10 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 0x00000034
0x0012FF10 0x0012FF24 (saved ebp) ♍
0x0012FF0C undef
Example2 - 11

Executed in this step (⌧):
.text:00000003 mov eax, [ebp+8]
.text:00000006 mov ecx, [ebp+0Ch]

Registers (♍ = modified):
eax 0x2 ♍ (no value change)
ecx 0x100 ♍
edx 0x100
ebp 0x0012FF10
esp 0x0012FF10

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 0x00000034
0x0012FF10 0x0012FF24 (saved ebp)
0x0012FF0C undef

Move “x” into eax, and “y” into ecx.
Example2 - 12

Executed in this step (⌧):
.text:00000009 lea eax, [ecx+eax*2]

Registers (♍ = modified):
eax 0x104 ♍
ecx 0x100
edx 0x100
ebp 0x0012FF10
esp 0x0012FF10

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 0x00000034
0x0012FF10 0x0012FF24 (saved ebp)
0x0012FF0C undef

Set the return value (eax) to 2*x + y. Note: this is neither pointer arithmetic nor an “address” being loaded. It is just an efficient way to do a calculation.
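The arithmetic performed by `lea eax, [ecx+eax*2]` can be mirrored in C. The sketch below is a reconstruction that is merely consistent with the disassembly, not the original Example2.c source; the function name `sub_like` is hypothetical, and the parameter names `x` and `y` are taken from the stack annotations.

```c
/* Hypothetical reconstruction of what _sub computes, based only on the
   disassembly: eax = ecx + eax*2, with eax holding x and ecx holding y. */
int sub_like(int x, int y)
{
    return 2 * x + y;   /* a compiler may emit this as a single LEA */
}
```

With the values from the walkthrough (x = 0x2, y = 0x100) this yields 0x104, matching the value shown in eax.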
Example2 - 13

Executed in this step (⌧):
.text:0000000C pop ebp

Registers (♍ = modified):
eax 0x104
ecx 0x100
edx 0x100
ebp 0x0012FF24 ♍
esp 0x0012FF14 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 0x00000034
0x0012FF10 undef ♍
0x0012FF0C undef
Example2 - 14

Executed in this step (⌧):
.text:0000000D retn

Registers (♍ = modified):
eax 0x104
ecx 0x100
edx 0x100
ebp 0x0012FF24
esp 0x0012FF18 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C 0x100 (int y)
0x0012FF18 0x2 (int x)
0x0012FF14 undef ♍
0x0012FF10 undef
0x0012FF0C undef
Example2 - 15

Executed in this step (⌧):
.text:00000034 add esp, 8

Registers (♍ = modified):
eax 0x104
ecx 0x100
edx 0x100
ebp 0x0012FF24
esp 0x0012FF20 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 0x100 (int a)
0x0012FF1C undef ♍
0x0012FF18 undef ♍
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2 - 16

Executed in this step (⌧):
.text:00000037 mov esp, ebp

Registers (♍ = modified):
eax 0x104
ecx 0x100
edx 0x100
ebp 0x0012FF24
esp 0x0012FF24 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 0x0012FF50 (saved ebp)
0x0012FF20 undef ♍
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Example2 - 17

Executed in this step (⌧):
.text:00000039 pop ebp

Registers (♍ = modified):
eax 0x104
ecx 0x100
edx 0x100
ebp 0x0012FF50 ♍
esp 0x0012FF28 ♍

Stack:
0x0012FF30 0x12FFB0 (char ** argv)
0x0012FF2C 0x2 (int argc)
0x0012FF28 Addr after “call _main”
0x0012FF24 undef ♍
0x0012FF20 undef
0x0012FF1C undef
0x0012FF18 undef
0x0012FF14 undef
0x0012FF10 undef
0x0012FF0C undef
Control Flow
Two forms of control flow
• Conditional - go somewhere if a condition is met. Think “if”s,
switches, loops
• Unconditional - go somewhere no matter what. Procedure
calls, goto, exceptions, interrupts.
• We’ve already seen procedure calls manifest themselves as
push/call/ret; let’s see how goto manifests itself in
assembly.
10 JMP
Jump
• Change EIP to the given address
• Main forms of the address
– Short relative (1-byte displacement from the end of the instruction)
• “jmp 00401023” doesn’t have the number 00401023 anywhere in it; it’s
really “jmp 0x0E bytes forward”
• Some disassemblers will indicate this with a mnemonic by writing it as
“jmp short”
– Near relative (4-byte displacement, also measured from the end of the instruction)
– Absolute (hardcoded address in instruction)
– Absolute Indirect (address calculated with r/m32)
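To make the “0x0E bytes forward” point concrete, here is a minimal sketch of how a disassembler could turn a short jump's 1-byte displacement into the absolute target it prints. The function name `short_jump_target` and the instruction address 0x00401013 used in the note below are assumptions for illustration; the only facts relied on are that a short jump is two bytes long (opcode plus displacement) and that the displacement is signed.

```c
#include <stdint.h>

/* Target of a 2-byte short jump (opcode + rel8). The displacement is
   signed and is added to the address of the NEXT instruction. */
uint32_t short_jump_target(uint32_t insn_addr, int8_t rel8)
{
    const uint32_t insn_len = 2;   /* opcode byte + displacement byte */
    /* Sign-extend rel8 to 32 bits; unsigned addition wraps modulo 2^32,
       so negative displacements jump backwards. */
    return insn_addr + insn_len + (uint32_t)(int32_t)rel8;
}
```

For example, a short jump located at the assumed address 0x00401013 with displacement 0x0E would be displayed as “jmp 00401023”. A negative displacement produces a backward jump, which is how loops are usually closed.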
Example3.c
(Remain calm)

int main(){
  int a=1, b=2;
  if(a == b){
    return 1;
  }
  if(a > b){
    return 2;
  }
  if(a < b){
    return 3;
  }
  return 0xdefea7;
}

main:
00401010 push ebp
00401011 mov ebp,esp
00401013 sub esp,8
00401016 mov dword ptr [ebp-4],1
0040101D mov dword ptr [ebp-8],2
00401024 mov eax,dword ptr [ebp-4]
00401027 cmp eax,dword ptr [ebp-8]
0040102A jne 00401033
0040102C mov eax,1
00401031 jmp 00401056
00401033 mov ecx,dword ptr [ebp-4]
00401036 cmp ecx,dword ptr [ebp-8]
00401039 jle 00401042
0040103B mov eax,2
00401040 jmp 00401056
00401042 mov edx,dword ptr [ebp-4]
00401045 cmp edx,dword ptr [ebp-8]
00401048 jge 00401051
0040104A mov eax,3
0040104F jmp 00401056
00401051 mov eax,0DEFEA7h
00401056 mov esp,ebp
00401058 pop ebp
00401059 ret

Jcc: the conditional jumps here are jne, jle, and jge.
11 Jcc
Jump If Condition Is Met
• There are more than 4 pages of conditional jump types!
• Luckily a bunch of them are synonyms for each other.
• JNE == JNZ (Jump if not equal, Jump if not zero; both check if
the Zero Flag (ZF) == 0)
Notable Jcc Instructions
• JZ/JE: if ZF == 1
• JNZ/JNE: if ZF == 0
• JLE/JNG : if ZF == 1 or SF != OF
• JGE/JNL : if SF == OF
• JBE: if CF == 1 OR ZF == 1
• JB: if CF == 1

• Note: Don’t get hung up on memorizing which flags are set for
what. More often than not, you will be running code in a
debugger, not just reading it. In the debugger you can just
look at EFLAGS and/or watch whether it takes a jump.
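As a quick reference, the conditions in the list above can be written as predicates over the flags. This is only an illustrative sketch: the `struct flags` type and the lowercase function names are hypothetical, and the bodies simply restate the flag tests listed on this slide.

```c
#include <stdbool.h>

/* Hypothetical container for the status flags the listed jumps consult. */
struct flags { bool zf, sf, of, cf; };

bool jz (struct flags f) { return f.zf; }                   /* JZ / JE   */
bool jnz(struct flags f) { return !f.zf; }                  /* JNZ / JNE */
bool jle(struct flags f) { return f.zf || (f.sf != f.of); } /* JLE / JNG */
bool jge(struct flags f) { return f.sf == f.of; }           /* JGE / JNL */
bool jbe(struct flags f) { return f.cf || f.zf; }           /* JBE       */
bool jb (struct flags f) { return f.cf; }                   /* JB        */
```

The signed conditions (JLE, JGE) look at SF and OF, while the unsigned ones (JBE, JB) look at CF, which is why the same CMP can feed either kind of jump.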

Flag Setting
• Before you can do a conditional jump, you need something to
set the condition flags for you.
• Typically done with CMP, TEST, or whatever instructions are
already inline and happen to have flag-setting side-effects

12 CMP
Compare Two Operands
• “The comparison is performed by subtracting the second
operand from the first operand and then setting the status
flags in the same manner as the SUB instruction.”
• What’s the difference from just doing SUB? Difference is that
with SUB the result has to be stored somewhere. With CMP
the result is computed, the flags are set, but the result is
discarded. Thus this only sets flags and doesn’t mess up any of
your registers.
• Modifies CF, OF, SF, ZF, AF, and PF
• (implies that SUB modifies all those too)
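The “subtract, set flags, discard the result” behaviour can be sketched in C. This is an illustrative model only: `struct cmp_flags` and `cmp32` are hypothetical names, and AF and PF are left out to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the flags CMP modifies (AF and PF omitted). */
struct cmp_flags { bool zf, sf, cf, of; };

/* Model of "cmp a, b" on 32-bit operands: compute a - b, derive the
   flags, and throw the difference away. */
struct cmp_flags cmp32(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;                       /* used only for the flags */
    struct cmp_flags f;
    f.zf = (r == 0);                          /* operands were equal */
    f.sf = ((r >> 31) & 1) != 0;              /* sign bit of the result */
    f.cf = (a < b);                           /* unsigned borrow */
    f.of = ((((a ^ b) & (a ^ r)) >> 31) & 1) != 0;   /* signed overflow */
    return f;
}
```

Comparing 1 with 2, for instance, gives ZF = 0, SF = 1, OF = 0 and CF = 1, so a following JLE (ZF == 1 or SF != OF) would be taken and a JGE would not.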

13 TEST
Logical Compare
• “Computes the bit-wise logical AND of first operand (source 1
operand) and the second operand (source 2 operand) and
sets the SF, ZF, and PF status flags according to the result.”
• Like CMP - sets flags, and throws away the result
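A small sketch of the zero-flag side of TEST, assuming 32-bit operands. The helper name `test_sets_zf` is hypothetical, and only ZF is modelled (SF and PF are set by the real instruction as well).

```c
#include <stdbool.h>
#include <stdint.h>

/* ZF as TEST would set it: AND the operands, keep only the flag.
   Neither operand is modified and the AND result is discarded. */
bool test_sets_zf(uint32_t a, uint32_t b)
{
    return (a & b) == 0;
}
```

This is the usual way to check whether a mask bit is set: with the sample values a = 0x1301 and mask = 0x100, the AND is nonzero, so ZF is clear and a following JE/JZ would not be taken.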

Example4.c

#define MASK 0x100

int main(){
  int a=0x1301;
  if(a & MASK){
    return 1;
  }
  else{
    return 2;
  }
}

main:
00401010 push ebp
00401011 mov ebp,esp
00401013 push ecx
00401014 mov dword ptr [ebp-4],1301h
0040101B mov eax,dword ptr [ebp-4]
0040101E and eax,100h
00401023 je 0040102E (Jcc)
00401025 mov eax,1
0040102A jmp 00401033
0040102C jmp 00401033
0040102E mov eax,2
00401033 mov esp,ebp
00401035 pop ebp
00401036 ret

I actually expected a TEST, because the result isn't stored.
Eventually found out why there are 2 jmps! (no optimization, so simple compiler rules)
Refresher:
Boolean (“bitwise”) logic

Operands   AND “&”   OR “|”   XOR “^”
0 0        0         0        0
0 1        0         1        1
1 0        0         1        1
1 1        1         1        0

Operand    NOT “~”
0          1
1          0
14 AND
Logical AND
• Destination operand can be r/m32 or register
• Source operand can be r/m32 or register or immediate (source and
destination cannot both be r/m32 memory operands)

and al, bl
       00110011b (al - 0x33)
AND    01010101b (bl - 0x55)
result 00010001b (al - 0x11)

and al, 0x42
       00110011b (al - 0x33)
AND    01000010b (imm - 0x42)
result 00000010b (al - 0x02)
15 OR
Logical Inclusive OR
• Destination operand can be r/m32 or register
• Source operand can be r/m32 or register or immediate (source and
destination cannot both be r/m32 memory operands)

or al, bl
       00110011b (al - 0x33)
OR     01010101b (bl - 0x55)
result 01110111b (al - 0x77)

or al, 0x42
       00110011b (al - 0x33)
OR     01000010b (imm - 0x42)
result 01110011b (al - 0x73)
16 XOR
Logical Exclusive OR
• Destination operand can be r/m32 or register
• Source operand can be r/m32 or register or immediate (source and
destination cannot both be r/m32 memory operands)

xor al, al
       00110011b (al - 0x33)
XOR    00110011b (al - 0x33)
result 00000000b (al - 0x00)

xor al, 0x42
       00110011b (al - 0x33)
XOR    01000010b (imm - 0x42)
result 01110001b (al - 0x71)

XOR is commonly used to zero a register, by XORing it with itself, because
the encoding is shorter than a MOV of 0 and it is at least as fast.
Book p. 231
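The worked AND, OR, and XOR examples can be checked with the corresponding C operators. The sketch below uses hypothetical helper names (`and8`, `or8`, `xor8`) and 8-bit values to match the al/bl examples on the slides.

```c
#include <stdint.h>

/* 8-bit versions of the three two-operand bitwise instructions.
   The casts keep the results in 8 bits after C's integer promotion. */
uint8_t and8(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t or8 (uint8_t a, uint8_t b) { return (uint8_t)(a | b); }
uint8_t xor8(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }
```

For al = 0x33 these reproduce the slide results: AND with 0x55 gives 0x11, OR with 0x55 gives 0x77, XOR with 0x42 gives 0x71, and XOR of a value with itself is always 0.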
17 NOT
One's Complement Negation
• Single source/destination operand can be r/m32

not al
NOT    00110011b (al - 0x33)
result 11001100b (al - 0xCC)

not [al+bl]
al      0x10000000
bl      0x00001234
al+bl   0x10001234
[al+bl] 0 (assumed memory at 0x10001234)
NOT    00000000b
result 11111111b

Xeno trying to be clever on a boring example, and failing… (al and bl are 8-bit registers, so they cannot hold these values or form an address; treat the second example as illustrative only.)

Book p. 231


Example5.c - simple for loop

#include <stdio.h>

int main(){
  int i;
  for(i = 0; i < 10; i++){
    printf("i = %d\n", i);
  }
}

main:
00401010 push ebp
00401011 mov ebp,esp
00401013 push ecx
00401014 mov dword ptr [ebp-4],0
0040101B jmp 00401026
0040101D mov eax,dword ptr [ebp-4]
00401020 add eax,1
00401023 mov dword ptr [ebp-4],eax
00401026 cmp dword ptr [ebp-4],0Ah
0040102A jge 00401040
0040102C mov ecx,dword ptr [ebp-4]
0040102F push ecx
00401030 push 405000h
00401035 call dword ptr ds:[00406230h]
0040103B add esp,8
0040103E jmp 0040101D
00401040 xor eax,eax
00401042 mov esp,ebp
00401044 pop ebp
00401045 ret

What does the “add esp,8” at 0040103B say about the calling convention of printf()?
Interesting note: main defaults to returning 0 (the “xor eax,eax” at 00401040).
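One way to read the loop's disassembly is to rewrite the C with explicit gotos in the same order as the generated code. The sketch below is an illustration, not compiler output; the function name `loop_like_disassembly` and the `iterations` counter are additions that are not in Example5.c (the counter exists only so the behaviour can be checked).

```c
#include <stdio.h>

/* Same control flow as the disassembly: initialise, jump to the test,
   and keep the increment block ahead of the test. */
int loop_like_disassembly(void)
{
    int i = 0;              /* mov dword ptr [ebp-4],0 */
    int iterations = 0;     /* assumption: not in the original code */
    goto test;              /* jmp 00401026 */
increment:
    i = i + 1;              /* mov / add / mov at 0040101D..00401023 */
test:
    if (i >= 10)            /* cmp dword ptr [ebp-4],0Ah ; jge 00401040 */
        goto done;
    printf("i = %d\n", i);  /* push ecx ; push 405000h ; call ; add esp,8 */
    iterations++;
    goto increment;         /* jmp 0040101D */
done:
    return iterations;
}
```

Note that the C condition `i < 10` shows up inverted as `jge`: the compiler jumps out of the loop when the condition fails rather than jumping into the body when it holds.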
Book p. 224
18 SHL
Shift Logical Left
• Can be explicitly used with the C “<<” operator
• First operand (source and destination) is an r/m32
• Second operand is either CL (lowest byte of ECX), or a 1 byte immediate.
The 2nd operand is the number of places to shift.
• It multiplies the register by 2 for each place the value is shifted. More
efficient than a multiply instruction.
• Bits shifted off the left hand side are “shifted into” (set) the carry flag
(CF)
• For purposes of determining if the CF is set at the end, think of it as n
independent 1 bit shifts.

shl cl, 2
       00110011b (cl - 0x33)
result 11001100b (cl - 0xCC) CF = 0

shl cl, 3
       00110011b (cl - 0x33)
result 10011000b (cl - 0x98) CF = 1
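The “n independent 1 bit shifts” rule for the carry flag can be sketched directly. This is a simplified model of an 8-bit SHL under stated assumptions: the name `shl8` is hypothetical, the count is assumed to be at least 1, and the count masking the real instruction applies is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* 8-bit SHL modelled as n independent 1-bit shifts. After the loop,
   *cf holds the last bit that was shifted out of the top.
   Assumes n >= 1; for n == 0 *cf is left untouched. */
uint8_t shl8(uint8_t value, unsigned n, bool *cf)
{
    for (unsigned i = 0; i < n; i++) {
        *cf = (value & 0x80) != 0;      /* top bit moves into CF */
        value = (uint8_t)(value << 1);  /* a zero enters on the right */
    }
    return value;
}
```

Shifting 0x33 left by 2 leaves CF = 0 because both bits shifted out are 0; shifting by 3 leaves CF = 1 because the third bit shifted out is a 1.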
Book p. 225
19 SHR
Shift Logical Right
• Can be explicitly used with the C “>>” operator
• First operand (source and destination) is an r/m32
• Second operand is either cl (lowest byte of ecx), or a 1 byte immediate.
The 2nd operand is the number of places to shift.
• It divides the (unsigned) register value by 2 for each place the value is
shifted. More efficient than a divide instruction.
• Bits shifted off the right hand side are “shifted into” (set) the carry flag
(CF)
• For purposes of determining if the CF is set at the end, think of it as n
independent 1 bit shifts.

shr cl, 2
       00110011b (cl - 0x33)
result 00001100b (cl - 0x0C) CF = 1

shr cl, 3
       00110011b (cl - 0x33)
result 00000110b (cl - 0x06) CF = 0
Example6.c

//Multiply and divide transformations
//New instructions:
//shl - Shift Left, shr - Shift Right

int main(){
  unsigned int a, b, c;
  a = 0x40;
  b = a * 8;
  c = b / 16;
  return c;
}

main:
push ebp
mov ebp,esp
sub esp,0Ch
mov dword ptr [ebp-4],40h
mov eax,dword ptr [ebp-4]
shl eax,3
mov dword ptr [ebp-8],eax
mov ecx,dword ptr [ebp-8]
shr ecx,4
mov dword ptr [ebp-0Ch],ecx
mov eax,dword ptr [ebp-0Ch]
mov esp,ebp
pop ebp
ret
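The transformation the compiler applied in Example6 can be written out by hand. The sketch below uses hypothetical helper names (`times8`, `div16`) and relies on the operands being unsigned, as they are in Example6.c.

```c
/* For unsigned values, multiplying or dividing by a power of two gives
   the same result as a logical shift by the matching bit count. */
unsigned int times8(unsigned int a) { return a << 3; }   /* a * 8,  2^3 */
unsigned int div16(unsigned int b)  { return b >> 4; }   /* b / 16, 2^4 */
```

With a = 0x40 this gives b = 0x200 and c = 0x20, the same values the shl/shr pair produces. The equivalence is only this simple for unsigned values; signed division by a power of two rounds toward zero, so a plain shift is not enough there.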
20 LEAVE
High Level Procedure Exit

1026EE94 mov eax,dword ptr [ebp+8]
1026EE97 pop esi
1026EE98 pop edi
1026EE99 leave
1026EE9A ret

• “Set ESP to EBP, then pop EBP”
• That’s all :)
• Then why haven’t we seen it elsewhere already?
• Depends on compiler and options
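The “Set ESP to EBP, then pop EBP” description can be modelled on a toy machine. Everything in this sketch is hypothetical scaffolding (the `struct cpu` layout, the 16-dword memory, and the helper names); it only illustrates the order of the two steps, which is the same as the “mov esp,ebp / pop ebp” pair seen in the earlier examples.

```c
#include <stdint.h>

/* Toy model: two registers and a tiny dword-addressed stack. */
struct cpu {
    uint32_t esp, ebp;
    uint32_t mem[16];       /* mem[i] models the dword at address 4*i */
};

/* POP: read the dword at ESP, then move ESP up by 4. */
static uint32_t pop32(struct cpu *c)
{
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* LEAVE: set ESP to EBP, then pop EBP. */
void leave(struct cpu *c)
{
    c->esp = c->ebp;        /* discard locals below the frame pointer */
    c->ebp = pop32(c);      /* restore the caller's saved EBP */
}
```

After `leave`, ESP points at the return address, so the `ret` that follows can pop it.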
Back to Hello World
.text:00401730 main
.text:00401730 push ebp
.text:00401731 mov ebp, esp
.text:00401733 push offset aHelloWorld ; "Hello world\n"
.text:00401738 call ds:__imp__printf
.text:0040173E add esp, 4
.text:00401741 mov eax, 1234h
.text:00401746 pop ebp
.text:00401747 retn

Are we all comfortable with this now?

Windows Visual C++ 2005, /GS (buffer overflow protection) option turned off
Disassembled with IDA Pro 4.9 Free Version

Instructions we now know(20)
• NOP
• PUSH/POP
• CALL/RET
• MOV/LEA
• ADD/SUB
• JMP/Jcc
• CMP/TEST
• AND/OR/XOR/NOT
• SHR/SHL
• LEAVE

Intel vs. AT&T Syntax
• Intel: Destination <- Source(s)
– Windows. Think algebra or C: y = 2x + 1;
– mov ebp, esp
– add esp, 0x14 ; (esp = esp + 0x14)
• AT&T: Source(s) -> Destination
– *nix/GNU. Think elementary school: 1 + 2 = 3
– mov %esp, %ebp
– add $0x14,%esp
– So registers get a % prefix and immediates get a $
• Important to know both, so you can read documents in either
format
– We will use Intel syntax

Intel vs AT&T Syntax – Cont.
• IMO the hardest-to-read difference is for r/m32 values
• For intel it’s expressed as
[base + index*scale + disp]
• For AT&T it’s expressed as
disp(base, index, scale)
• Examples:
– call DWORD PTR [ebx+esi*4-0xe8]
– call *-0xe8(%ebx,%esi,4)

– mov eax, DWORD PTR [ebp+0x8]
– mov 0x8(%ebp), %eax

– lea eax, [ebx-0xe8]
– lea -0xe8(%ebx), %eax
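Both notations describe the same address computation. As a minimal sketch, with the hypothetical function name `effective_address` and 32-bit wraparound assumed:

```c
#include <stdint.h>

/* Effective address for Intel [base + index*scale + disp], which AT&T
   writes as disp(base, index, scale). scale is 1, 2, 4 or 8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    /* Unsigned arithmetic wraps modulo 2^32, so a negative disp
       subtracts as expected. */
    return base + index * scale + (uint32_t)disp;
}
```

For the first example pair, with assumed register values ebx = 0x1000 and esi = 3, the call would read its target from address 0x1000 + 3*4 - 0xe8 = 0xF24.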
Intel vs AT&T Syntax – Cont.
• In AT&T syntax, for instructions which can operate on different sizes, the
mnemonic will have an indicator of the size.
– movb - operates on bytes
– mov/movw - operates on word (2 bytes)
– movl - operates on “long” (dword) (4 bytes)
• Intel syntax does indicate size with things like “mov dword ptr [eax], …”,
but it’s just not in the actual mnemonic of the instruction
SUMMARY
• Learned about the basic hardware registers and how they’re used
• Learned about how the stack is used
• Saw how C code translates to assembly
• Learned basic usage of compilers, disassemblers, and debuggers so that
assembly can easily be explored
• Learned about Intel vs AT&T asm syntax

References
• Open Security Training, Introductory Intel x86: (Architecture, Assembly,
Applications, & Alliteration) by Xeno Kovah,
http://www.opensecuritytraining.info/IntroX86.html
• Professional Assembly Language by Blum
