Guide To x86 Assembly 2019
abstraction from the computer's instruction set architecture. In a low-level programming
language, instructions, commands, and functions map closely to the processor's
instruction set. The word "Low" means there is little to no abstraction
turn generates machine code to be executed. Machine code is the only language computers
can understand without any processing. Below is an example of machine code in hexadecimal
understand our language; that's where Assembly Language comes into play.
Assembler + Linker
The assembler and linker together are called the Translator, which takes the assembly mnemonics we
Segment - CS, DS
Control - EIP
register can store 32 bits of data. Take the 32-bit EAX register: its lower half
is called AX and holds 16 bits of data. AX is further divided into two parts, AH
and AL, each 8 bits in size. The same goes for EBX, ECX and EDX.
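The EAX/AX/AH/AL layout described above can be sketched with bit masks. This is only a toy model in Python, not how a CPU is implemented; the class name Register32 and its method names are made up for illustration.

```python
class Register32:
    """Toy model of a 32-bit register with AX/AH/AL-style views (hypothetical helper)."""

    def __init__(self, value=0):
        self.value = value & 0xFFFFFFFF   # keep only 32 bits

    @property
    def low16(self):                      # corresponds to AX
        return self.value & 0xFFFF

    @property
    def high8(self):                      # corresponds to AH (bits 8..15)
        return (self.value >> 8) & 0xFF

    @property
    def low8(self):                       # corresponds to AL (bits 0..7)
        return self.value & 0xFF

    def set_low8(self, byte):
        # Writing AL changes only the lowest byte; the other 24 bits are kept.
        self.value = (self.value & 0xFFFFFF00) | (byte & 0xFF)


eax = Register32(0x12345678)
print(hex(eax.low16))   # 0x5678
print(hex(eax.high8))   # 0x56
print(hex(eax.low8))    # 0x78
eax.set_low8(0xFF)
print(hex(eax.value))   # 0x123456ff
```

The last line shows that the smaller names are views onto the same storage: changing AL changes EAX as well.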
EAX - Accumulator Register - used for storing operands and result data
small sizes of 8 bits; however, they are divided into upper and lower 16-bit halves. Registers
in a CPU are limited, so you can't use them to store larger chunks of data; that's where
memory comes into play. Data can be stored in memory in a stack data structure, and the ESP
register serves as an indirect memory operand pointing to the top of the stack at all times.
Consider a stack that currently contains only the integer value 2, so 2 is at the top of
the stack. The ESP register points to that value and, in the same way, EBP points to the
base of the stack.
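The example of a stack holding only the value 2 can be sketched as follows. This is a simplified Python model, assuming 4-byte slots and a made-up base address of 0x1000; the class Stack32 is hypothetical. On x86 the stack grows toward lower addresses, so a push decreases ESP.

```python
class Stack32:
    """Toy model of a 32-bit stack tracked by ESP and EBP (hypothetical helper)."""

    def __init__(self, base=0x1000):      # 0x1000 is an arbitrary example address
        self.memory = {}                  # address -> 32-bit value
        self.ebp = base                   # base of the stack
        self.esp = base                   # top of the stack; equals the base while empty

    def push(self, value):
        self.esp -= 4                     # the x86 stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += 4
        return value

    def top(self):
        return self.memory[self.esp]      # the value ESP currently points to


stack = Stack32()
stack.push(2)
print(stack.top())        # 2
print(hex(stack.esp))     # 0xffc
print(hex(stack.ebp))     # 0x1000
```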
This is the general memory hierarchy of a computer. Registers sit at the top: they are faster
than everything below them but also the smallest. Moving down the hierarchy, storage size increases
ways to store data types in memory are Little Endian and Big Endian.
Little Endian storage is generally used in Intel-based processors, where the main focus
is processing speed rather than the amount of power consumed. Arm, however, makes processors
for mobile devices, where battery life and power consumption play an important role, so Big endian
The image above shows how 0x01234567 would be stored in memory. In Big Endian the bytes
are stored in the order given, but in Little Endian the bytes are written in the opposite
order, from the least significant byte (0x67) to the most significant (0x01).
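The two byte orders for 0x01234567 can be checked with a short Python sketch. It only uses the built-in int.to_bytes and int.from_bytes; the variable names are chosen for illustration.

```python
value = 0x01234567

# Lay the same 32-bit value out in memory order (lowest address first).
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(" ".join(f"{b:02x}" for b in big))      # 01 23 45 67
print(" ".join(f"{b:02x}" for b in little))   # 67 45 23 01

# Reading the bytes back with the matching byte order recovers the value.
print(hex(int.from_bytes(little, byteorder="little")))   # 0x1234567
```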
Text
Data
BSS
Hello World
The image above shows the structure of a Hello World program in assembly.
The entry point of the program is a global label called _start, and program execution starts
from there. The Text section contains the instructions to print the message and exit the
program; the Data section contains the message string "Hello World!", which the print
instructions in the Text section use.
an orderly fashion.
_start:
Before the 1st instruction, "mov $5, ecx", is executed, EIP points to the address of that
instruction. After it is executed, EIP advances by the length of the instruction (not simply
by 1), so it now points to the second instruction. Program execution flows this way. As an
attacker, if we want to take control of the program, we need to manipulate the value of EIP.
Like if/else statements in higher-level programming languages, assembly also provides
mnemonics to control the flow of
jmp - like the goto statement in C, it jumps to the specified location unconditionally.
3. jmp 5
6. je function
7. function :
In the snippet above, the 1st and 2nd instructions are executed one after another, leaving
5 in both ecx and edx. Then the jmp 5 instruction is encountered, so flow is transferred
directly to instruction number 5, and instruction number 4 is never executed.
Now let's look at the cmp instruction: after executing the 3rd instruction, execution comes
to the 5th instruction.
This instruction compares ecx and edx by subtracting one from the other; if the result is
zero, the values stored in ecx and edx are the same.
The JE instruction checks the zero flag set by the instruction executed above. JE simply means
"jump if equal": if ecx and edx are equal, je redirects flow to the function: label.
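The walk-through above can be replayed with a small Python simulation. It is a sketch under several assumptions: EIP holds an instruction number rather than a byte address, only mov, jmp, cmp and je are modelled, and instruction 4 (which the text does not show) is a made-up "mov ecx, 0" that would be easy to spot if it ever ran.

```python
def run(program):
    """Execute (mnemonic, *operands) tuples; instruction numbers are 1-based."""
    regs = {"ecx": 0, "edx": 0}
    zero_flag = False
    # Map each label name to the number of the instruction slot it occupies.
    labels = {ins[1]: n for n, ins in enumerate(program, start=1) if ins[0] == "label"}
    eip = 1
    trace = []                             # instruction numbers in execution order
    while eip <= len(program):
        op, *args = program[eip - 1]
        trace.append(eip)
        next_eip = eip + 1                 # default: fall through to the next instruction
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "jmp":
            next_eip = args[0]             # unconditional jump
        elif op == "cmp":
            zero_flag = (regs[args[0]] - regs[args[1]]) == 0
        elif op == "je" and zero_flag:
            next_eip = labels[args[0]]     # jump only if the last cmp found equality
        eip = next_eip
    return regs, zero_flag, trace


program = [
    ("mov", "ecx", 5),        # 1
    ("mov", "edx", 5),        # 2
    ("jmp", 5),               # 3
    ("mov", "ecx", 0),        # 4 (hypothetical; skipped by the jmp)
    ("cmp", "ecx", "edx"),    # 5
    ("je", "function"),       # 6
    ("label", "function"),    # 7
]

regs, zero_flag, trace = run(program)
print(trace)       # [1, 2, 3, 5, 6, 7]
print(regs)        # {'ecx': 5, 'edx': 5}
print(zero_flag)   # True
```

The trace never contains 4, which matches the observation that the jmp skips that instruction.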
Level 2
Level 1 - Operating System
Level 0 - Bare Metal
offers some services to the applications running on it. These services are accessible through
system calls for opening files, mapping memory, reading directory contents, etc. All these
actions require interaction with the hardware (the hard drive, the memory management unit)
Every Linux system call is numbered, so each can be referenced by its number,
e.g. EXIT - 1
WRITE - 4
A user-space program requests a system call by invoking an interrupt. That interrupt is
passed to the interrupt handler table, which invokes the system call handler, which in turn
invokes the specific system call. There are mainly two modes of invoking a system call
Most syscalls take arguments, so before executing a syscall we need its parameters
ready in registers.
EAX contains the syscall number and the rest of the registers contain the other arguments. We
can get details about a specific syscall from its man page on Linux with "man 2 (syscall name)".
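As a rough illustration of "arguments ready in registers", the Python sketch below builds the register values for a write call and then performs the same system call through os.write, which wraps write. It assumes the 32-bit Linux int 0x80 convention, in which EBX, ECX and EDX carry the first three arguments; the helper write_registers is hypothetical, and the string itself stands in for the pointer that ECX would hold. A pipe is used instead of the terminal so the result can be read back.

```python
import os

SYSCALL_NUMBERS = {"exit": 1, "write": 4}   # the numbers listed in this guide


def write_registers(fd, data):
    """Return the register values a write syscall would need (hypothetical helper)."""
    return {
        "eax": SYSCALL_NUMBERS["write"],    # syscall number
        "ebx": fd,                          # first argument: file descriptor
        "ecx": data,                        # second argument: the data (a pointer in real code)
        "edx": len(data),                   # third argument: number of bytes to write
    }


read_end, write_end = os.pipe()
regs = write_registers(write_end, b"Hello world!")

# os.write(fd, data) issues the same write system call with these arguments.
written = os.write(regs["ebx"], regs["ecx"][:regs["edx"]])
os.close(write_end)
received = os.read(read_end, 64)
os.close(read_end)

print(regs["eax"], regs["edx"])   # 4 12
print(written, received)          # 12 b'Hello world!'
```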
file descriptor, ECX to point to the string we want to print, and finally EDX to contain
the length to print. After storing all that, we simply invoke the interrupt
Now let's write our first program, which prints Hello world!, in assembly
global _start
section .text
_start:
mov edx, 12
int 0x80
To execute write, we moved the syscall number of write, which is "4", into eax
In the data section,
message: db "Hello world!" means we are defining message as a sequence of bytes (db stands
for "define byte") holding the string "Hello world!"
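To see what that definition amounts to, here is a small Python sketch of the bytes involved, assuming the string is stored as plain ASCII with one byte per character. It also shows where the 12 in "mov edx, 12" comes from.

```python
# One byte per character, in the order the characters appear in the string.
message = "Hello world!".encode("ascii")

print(len(message))         # 12, the length loaded into edx
print(list(message[:5]))    # [72, 101, 108, 108, 111]  ->  'H', 'e', 'l', 'l', 'o'
```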
The awesome image used in this article is called Dino ASCII and was created by Alexandra Hanson.