Intel Architecture
Decimal-to-Hexadecimal:
420.625₁₀ = 420₁₀ + .625₁₀
Integer part (repeated division by 16):
Division     Quotient   Remainder
420 / 16     26         4            LSB
26 / 16      1          10 (or A)
1 / 16       0          1            MSB
Fractional part (repeated multiplication by 16):
Multiplication   Product   Carry-out
.625 x 16        10.00     10 (or A)
420.625₁₀ = 1A4.A₁₆
4135₁₀ = 1027₁₆
625.625₁₀ = 271.A₁₆
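The division and multiplication steps can be sketched in Python as follows. This is only an illustration of the procedure: the function names and the max_digits cut-off are assumptions, not part of the slides.

```python
DIGITS = "0123456789ABCDEF"

def int_to_hex(n):
    """Integer part: divide by 16 repeatedly; remainders come out LSB first."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, 16)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))  # the last remainder is the MSB

def frac_to_hex(frac, max_digits=8):
    """Fractional part: multiply by 16 repeatedly; each carry-out is a digit."""
    out = []
    while frac > 0 and len(out) < max_digits:
        frac *= 16
        carry = int(frac)          # carry-out into the integer position
        out.append(DIGITS[carry])
        frac -= carry
    return "".join(out)

def decimal_to_hex(value):
    whole = int(value)
    frac_digits = frac_to_hex(value - whole)
    return int_to_hex(whole) + ("." + frac_digits if frac_digits else "")

print(decimal_to_hex(420.625))  # 1A4.A
```

The fraction .625 is exactly representable in binary, so the loop stops after one digit; other fractions may need the max_digits limit.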
Number Systems
Binary-Coded Hexadecimal (BCH):
2AC = 0010 1010 1100
1000 0011 1101 . 1110 = 83D.E
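As a rough sketch, BCH conversion is a digit-by-digit lookup: each hex digit maps to one 4-bit group. The helper names below are illustrative assumptions.

```python
def hex_to_bch(hex_str):
    """Write each hex digit as its own 4-bit group."""
    groups = []
    for ch in hex_str:
        groups.append("." if ch == "." else format(int(ch, 16), "04b"))
    return " ".join(groups)

def bch_to_hex(bch_str):
    """Read each 4-bit group back as one hex digit."""
    return "".join(
        "." if group == "." else format(int(group, 2), "X")
        for group in bch_str.split()
    )

print(hex_to_bch("2AC"))                   # 0010 1010 1100
print(bch_to_hex("1000 0011 1101 . 1110")) # 83D.E
```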
Complements
Data are stored in complement form to represent
negative numbers
One's complement of 0100 1100:
  1111 1111
- 0100 1100
  1011 0011
Two's complement (one's complement plus 1):
  1011 0011
+ 0000 0001
  1011 0100
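The same two steps can be checked with a short Python sketch. The 8-bit default width and the function names are assumptions made for the example.

```python
def ones_complement(value, bits=8):
    """Subtract from all ones (equivalent to inverting every bit)."""
    mask = (1 << bits) - 1
    return mask - value

def twos_complement(value, bits=8):
    """One's complement plus 1, kept within the register width."""
    mask = (1 << bits) - 1
    return (ones_complement(value, bits) + 1) & mask

print(format(ones_complement(0b01001100), "08b"))  # 10110011
print(format(twos_complement(0b01001100), "08b"))  # 10110100
```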
Instruction variations:
246 on the 8085
20,000 on the 8086 and 8088
RISC: Reduced Instruction Set Computer
Executes one instruction per clock cycle
Pipelining
Overview: registers, the stack
Registers
Registers store information temporarily.
AX is a 16-bit register made up of two 8-bit registers:
AH (high byte) and AL (low byte).
Category      Bits   Register Names
General       16     AX, BX, CX, DX
              8      AH, AL, BH, BL, CH, CL, DH, DL
Pointer       16     SP, BP
Index         16     SI, DI
Segment       16     CS, DS, SS, ES
Instruction   16     IP
Flag          16     FR
Anatomy of a Register
Extended register (e.g. EAX): bits 0-31
Word register (e.g. AX): bits 0-15
High byte (e.g. AH): bits 8-15; low byte (e.g. AL): bits 0-7
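General Registers
32-bit Registers   16-bit Registers   8-bit Registers
EAX                AX                 AH, AL
EBX                BX                 BH, BL
ECX                CX                 CH, CL
EDX                DX                 DH, DL
EBP                BP
ESI                SI
EDI                DI
ESP                SP
The 16-bit registers occupy bits 0-15 of the 32-bit registers; the high and low 8-bit registers occupy bits 8-15 and bits 0-7.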
General Registers I
EAX Accumulator
accumulator for operands and results data
usually used to store the return value of a procedure
ECX Counter
counter for string and loop operations
General Registers II
ESI Source Index
source pointer for string operations
typically a pointer to data in the segment pointed to
by the DS register
Segment Registers
CS Code Segment
SS Stack Segment
contains the segment selector for the stack
segment, where the procedure stack is stored
Instruction Pointer
EIP Instruction Pointer
Contains the offset within the code segment of
the next instruction to be executed
Cannot be accessed directly by software
The Stack
[Figure: stack layout showing EBP, the current stack frame, the caller's stack frame, and the direction of stack growth.]
Intel Assembly
Goal: to gain a knowledge of Intel 32-bit assembly instructions
References:
M. Pietrek, Under the Hood: Just Enough Assembly Language to Get By
MSJ Article, February 1998 www.microsoft.com/msj
Part II, MSJ Article, June 1998 www.microsoft.com/msj
IA-32 Intel Architecture Software Developer's Manual,
Volume 1: Basic Architecture
www.intel.com/design/Pentium4/documentation.htm#manuals
Volume 2A: Instruction Set Reference A-M
www.intel.com/design/pentium4/documentation.htm#manuals
Volume 2B: Instruction Set Reference N-Z
www.intel.com/design/pentium4/documentation.htm#manuals
Assembly Programming
Machine Language
binary
hexadecimal
machine code or object code
Assembly Language
mnemonics
assembler
High-Level Language
Pascal, Basic, C
compiler
[Figure: build pipeline. Preprocessing & compiling produces assembly code; assembly produces object code; linking (with DLLs) produces executable code. Disassembly runs in the opposite direction, from executable code back to assembly code.]
32-bit Instructions
Instructions are represented in memory by a series
of opcode bytes.
A variance in instruction size means that
disassembly is position specific.
Most instructions take zero, one, or two arguments:
instruction destination, source
For example: add eax, ebx
is equivalent to the expression eax = eax + ebx
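A minimal Python sketch of the two-operand form, assuming 32-bit registers that wrap around on overflow (the add32 helper is illustrative; carry and other flags are not modeled):

```python
MASK32 = 0xFFFFFFFF

def add32(dest, src):
    """Model 'add dest, src': dest = dest + src, truncated to 32 bits."""
    return (dest + src) & MASK32

eax, ebx = 0x00000010, 0x00000005
eax = add32(eax, ebx)          # add eax, ebx
print(f"eax = {eax:08X}")      # eax = 00000015
```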
Rule #3:
If a value less than FFH is moved into a 16-bit register, the rest of the
bits are assumed to be all zeros.
MOV BX, 5
BX = 0005
BH = 00, BL = 05
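The zero fill and the high/low split can be sketched in Python. The helper names mov16 and split_halves are assumptions for the example.

```python
def mov16(value):
    """Model MOV into a 16-bit register: unused upper bits are zero."""
    return value & 0xFFFF

def split_halves(reg16):
    """Return (high byte, low byte) of a 16-bit register."""
    return (reg16 >> 8) & 0xFF, reg16 & 0xFF

bx = mov16(5)                  # MOV BX, 5
bh, bl = split_halves(bx)
print(f"BX = {bx:04X}, BH = {bh:02X}, BL = {bl:02X}")  # BX = 0005, BH = 00, BL = 05
```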
Program Segments
A segment is an area of memory of up to 64K bytes
It begins on an address evenly divisible by 16
The 8085 could address a maximum of 64K bytes of physical memory
- it has only 16 pins for the address lines (2^16 = 64K)
Program Segments
Code segment
The 8086 fetches the instructions (opcodes and operands) from the code segments.
Physical address
Offset address
A location within a 64KB segment range
A range of 0000H to FFFFH
Logical address
consists of a segment value and an offset address
Program Segments: example
Define the addresses for the 8086 when it fetches the instructions
(opcodes and operands) from the code segment.
Logical address:
Consists of a CS (code segment) and an IP (instruction pointer) value
format is CS:IP
Offset address
IP contains the offset address
Physical address
generated by shifting the CS value left one hex digit (4 bits) and then adding it to the
IP
the resulting 20-bit address is called the physical address
Program Segments: example
Suppose we have:
CS = 2500H
IP = 95F3H
Logical address:
Consists of a CS (code segment) and an IP (instruction pointer) value
format is CS:IP
2500:95F3H
Offset address
IP contains the offset address, which is
95F3H
Physical address
generated by shifting the CS value left one hex digit (4 bits) and then adding it to the
IP
25000H + 95F3H = 2E5F3H
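The shift-and-add rule can be written as a one-line Python function. This is a sketch only; the name physical_address is an assumption, and the final mask reflects the 20 address bits of the 8086.

```python
def physical_address(segment, offset):
    """Shift the segment left one hex digit (4 bits) and add the offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # keep 20 address bits

cs, ip = 0x2500, 0x95F3
print(f"{cs:04X}:{ip:04X} -> {physical_address(cs, ip):05X}")  # 2500:95F3 -> 2E5F3
```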
Program Segments
Data segment
Format DS:BX or DI or SI
example:
DS:0200 = 25
DS:0201 = 12
DS:0202 = 15
DS:0203 = 1F
DS:0204 = 2B
Program Segments
Data segment
Example:
Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH
Not using the data segment:
MOV AL,00H ;clear AL
ADD AL,25H ;add 25H to AL
ADD AL,12H
ADD AL,15H
ADD AL,1FH
ADD AL,2BH
Program Segments
Data segment
Example:
Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH
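Program:
MOV AL,0      ;clear AL
ADD AL,[0200] ;add the contents of DS:0200 to AL
ADD AL,[0201] ;add the contents of DS:0201 to AL
ADD AL,[0202] ;add the contents of DS:0202 to AL
ADD AL,[0203] ;add the contents of DS:0203 to AL
ADD AL,[0204] ;add the contents of DS:0204 to AL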
Program Segments
Data segment
Example:
Add 5 bytes of data: 25H, 12H, 15H, 1FH, 2BH
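MOV AL,0     ;clear AL
MOV BX,0200H ;BX holds the offset of the first byte
ADD AL,[BX]  ;add the byte at DS:BX to AL
INC BX       ;point to the next byte
ADD AL,[BX]
INC BX
ADD AL,[BX]
INC BX
ADD AL,[BX]
INC BX
ADD AL,[BX]  ;fifth byte added, AL = 96H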
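A rough Python model of the BX-based version: a dictionary stands in for the data segment (using the DS:0200 to DS:0204 contents listed earlier), BX is used as a pointer, and AL is kept to 8 bits. The sum_bytes helper is an assumption for illustration.

```python
# Data segment contents from the example, keyed by offset.
data_segment = {0x0200: 0x25, 0x0201: 0x12, 0x0202: 0x15,
                0x0203: 0x1F, 0x0204: 0x2B}

def sum_bytes(memory, start, count):
    al = 0                               # MOV AL,0
    bx = start                           # MOV BX,start
    for _ in range(count):
        al = (al + memory[bx]) & 0xFF    # ADD AL,[BX] (8-bit result)
        bx += 1                          # INC BX
    return al

print(f"AL = {sum_bytes(data_segment, 0x0200, 5):02X}H")  # AL = 96H
```

Using a register as the pointer means the same instructions work for any offset, whereas the direct-address version hard-codes each location.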
Endian convention
Little endian convention:
In the case of 16-bit data, the low byte goes to the low
memory address and the high byte goes to the high memory
address. (Intel, Digital VAX)
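The byte order can be seen with Python's struct module, where "<H" packs a 16-bit value in little endian order. The value 35F3H and the helper name are only example choices.

```python
import struct

def store_word_little_endian(value):
    """Return the two bytes of a 16-bit value in memory order (low byte first)."""
    return struct.pack("<H", value)

stored = store_word_little_endian(0x35F3)
print([f"{b:02X}" for b in stored])  # ['F3', '35']
```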
Program Segments
Stack segment
Stack
A section of RAM used by the CPU to store
information temporarily.
Registers: SS (Stack Segment) and SP (Stack Pointer)
Operations: PUSH and POP
PUSH: storing the contents of a CPU register on the stack
POP: loading the contents of the stack back into a CPU register
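A minimal Python sketch of 16-bit PUSH and POP on the 8086, where SP decreases by 2 on a push and increases by 2 on a pop. The class name, the starting SP of 1236H, and the pushed value are assumptions for the example; only offsets within the stack segment are modeled.

```python
class Stack8086:
    def __init__(self, sp=0x1236):
        self.sp = sp      # offset of the top of the stack within SS
        self.mem = {}     # offset -> byte

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF                 # stack grows downward
        self.mem[self.sp] = word & 0xFF                  # low byte at low address
        self.mem[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        low = self.mem[self.sp]
        high = self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return (high << 8) | low

stack = Stack8086()
stack.push(0x24B6)            # e.g. PUSH AX with AX = 24B6H
print(f"SP = {stack.sp:04X}") # SP = 1234
print(f"{stack.pop():04X}")   # 24B6
```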
Flag Register
Flag Register (status register)
16-bit register
Conditional flags: CF, PF, AF, ZF, SF, OF
Control flags: TF, IF, DF
Flow Control I
JMP location
Transfers program control to a different point in the
instruction stream without recording return
information.
jmp eax
jmp 0x00934EE4
Flow Control II
CMP value, value / Jcc location
The compare instruction compares two values, setting or
clearing a variety of flags (e.g., ZF, SF, OF). Various
conditional jump instructions use flags to branch
accordingly.
cmp eax, 4
je 40320020
test eax, eax
jz 40DA0020
edx, 0056FCE2
56DC0F20
;PA = DS (shifted left) + SI + 5
;PA = DS (shifted left) + DI + 20
;PA = DS (shifted left) + BX + DI + 8
;PA = SS (shifted left) + BP + SI + 29
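These physical-address formulas all follow one pattern: add the register contents and the displacement to form a 16-bit offset, then add that to the shifted segment value. A hedged Python sketch, with made-up register values (DS = 4500H, SI = 1486H) and an assumed helper name:

```python
def indexed_physical_address(segment, *offset_parts):
    """PA = segment (shifted left 4 bits) + sum of registers and displacement."""
    offset = sum(offset_parts) & 0xFFFF          # the offset wraps at 16 bits
    return ((segment << 4) + offset) & 0xFFFFF   # 20-bit physical address

ds, si = 0x4500, 0x1486                          # hypothetical register values
pa = indexed_physical_address(ds, si, 5)         # PA = DS (shifted left) + SI + 5
print(f"{pa:05X}")                               # 4648B
```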
Assembly Language Programming
Assembly Programming
An Assembly language instruction consists of four fields:
[label:] mnemonic [operands] [;comment]
Labels
See rules
Mnemonic and operands
MOV AX, 6764
Comment
; this is a sample program
Model Definition
The .MODEL directive selects the size of the memory model
.MODEL MEDIUM
Data must fit into 64KB
Code can exceed 64KB
.MODEL COMPACT
Data can exceed 64KB
Code cannot exceed 64KB
.MODEL LARGE
Data can exceed 64KB (but no single set of data should exceed 64KB)
Code can exceed 64KB
.MODEL HUGE
Data can exceed 64KB (single data items, e.g. arrays, can exceed 64KB)
Code can exceed 64KB
.MODEL TINY
Data must fit into 64KB
Code must fit into 64KB
Used with COM files
Segments
Segment definition:
The 80x86 CPU has four segment registers: CS, DS, SS, ES
Segments of a program:
.STACK
; marks the beginning of the stack segment
example:
.STACK 64
.DATA
; marks the beginning of the data segment
example:
DATA1 DB 52H ;DB directive allocates memory in byte-size chunks
Segments
.CODE
; marks the beginning of the code segment
- starts with the PROC (procedure) directive
- the PROC directive may have the option FAR or NEAR
- ends with the ENDP directive
Step   INPUT        PROGRAM         OUTPUT
1.     keyboard     editor          myfile.asm
2.     myfile.asm   MASM or TASM    myfile.obj, myfile.lst, myfile.crf
3.     myfile.obj   LINK or TLINK   myfile.exe, myfile.map
0010H
DB 25         ;decimal
DB 10001001B  ;binary
DB 12H        ;hex
DB '2591'     ;ASCII numbers
DB ?          ;set aside a byte
DB 'Hello'    ;ASCII characters
DB 'O Hi'     ;ASCII characters
DATA3 DB 30 DUP(?)
DATA4
DB
DW 342                ;decimal
DW 01010001001B       ;binary
DW 123FH              ;hex
DW 9,6,0CH,0111B,'Hi' ;Data numbers
DW 8 DUP (?)          ;set aside 8 words
EQU
25
DD 1023                  ;decimal
DD 01010001001001110110B ;binary
DD 7A3D43F1H             ;hex
DD 54H,65432H,65533      ;Data numbers
example:
DATA1 DQ 6723F9H ;hex
DATA2 DQ 'Hi'    ;ASCII characters
DATA3 DQ ?       ;nothing
example:
DATA1 DT 123456789123 ;BCD
DATA2 DT ?            ;nothing
DATA3 DT 76543d       ;assembler will convert decimal number to hex and store it
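The data directives differ mainly in how many bytes each item occupies: DB 1, DW 2, DD 4, DQ 8, DT 10. The Python sketch below tabulates those sizes and shows the little endian byte layout for binary values; the helper names are assumptions, and DT is left out of encode because, as the ;BCD comment above indicates, an unsuffixed DT operand is stored as packed BCD rather than plain binary.

```python
SIZES = {"DB": 1, "DW": 2, "DD": 4, "DQ": 8, "DT": 10}  # bytes per item

def reserved_bytes(directive, count=1):
    """Bytes set aside by e.g. 'DB 30 DUP(?)' or 'DW 8 DUP (?)'."""
    return SIZES[directive] * count

def encode(directive, value):
    """Memory image of one binary value, low byte first."""
    if directive == "DT":
        raise ValueError("DT packed BCD is not modeled in this sketch")
    return value.to_bytes(SIZES[directive], "little")

print(reserved_bytes("DB", 30))                           # 30
print(reserved_bytes("DW", 8))                            # 16
print(" ".join(f"{b:02X}" for b in encode("DW", 0x123F))) # 3F 12
```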
label SEGMENT [options]
; statements
label ENDS
Using MASM
Developed by Microsoft
Used to translate 8086 assembly language
into machine language
3 steps:
Loading a file
General settings
Views
Navigating through the code
Adding analysis content
Searches (binary, text)
Patching & scripting
Exiting and saving