Lecture 2 - Arts of x86 Programming
UCCE2043 Basic Microprocessor
Arts of 8088/86 programming
H Y Lee
leehy@utar.edu.my
Definition
A program: A sequence of instructions, commands or statements to be
executed by a microprocessor. Programs are normally written in a computer
language.
Machine language: The coded sequence of 0's and 1's, which is the only
thing that the CPU understands. Any program written in any computer
language must eventually be translated into machine language.
Assembly language: Uses alphanumeric mnemonics which, when combined with
certain character symbols, form instructions that a microprocessor
understands. Instructions are combined with other symbols to make assembly
language statements. The statements obey certain syntax rules that are
defined by the assembly language designers. Assembly language statements
need to be translated by other programs (called the assembler and linker)
into executable machine code.
The instruction set: The collection of instructions (normally grouped by
functionality) that a certain microprocessor understands. Each
microprocessor has its own instruction set.
High level language (HLL): A computer language that is microprocessor
independent, as long as there is an operating system dependent program
(called the compiler) that translates it into machine code.
Source code: Any program written in assembly language or an HLL is
referred to as source code.
Software perspective
The foundation of many abstract issues in software lies in assembly
language and computer architecture:
Data types, addressing modes, stack, input/output
Process of translating
English: Display the sum of A times B plus C.
C++: cout << (A * B + C);
The COMPILER translates the C++ statement into assembly language, and the
ASSEMBLER translates the assembly language into machine language:
Assembly Language:      Intel Machine Language:
mov eax,A               A1 00000000
mul B                   F7 25 00000004
add eax,C               03 05 00000008
call WriteInt           E8 00500000
Compiler vs Assembler
The compiler takes HLL source code (which is a file of ASCII characters)
as input. It produces either assembly language, as an intermediate step,
or machine code directly. Either way, the final result is machine code.
The assembler takes assembly language (which is also a file of ASCII
characters) as its source code, and uses this to produce machine code.
Creating A Program with Assembly Language
1. Edit or create a source code file using a text editor.
2. Assemble the source code into a machine code object file.
3. Link the object files into an executable program (.com or .exe files).
MASM (1)
Type your assembly code in NOTEPAD and save it as
c:\MASM\OwnCodes\abc.asm.
Type C:\MASM\MASM613\BIN\MASM c:\MASM\OwnCodes\abc.asm /l
[A few lines will appear; press Enter three times. This step assembles
your file, reports any errors, and generates abc.obj and abc.lst]
Type C:\MASM\MASM613\BIN\LINK c:\MASM\OwnCodes\abc.obj
The executable file is created.
Run the executable file, which is abc.exe
Voila! You have just successfully created and executed
your 1st Assembly Language Program!
Note: To learn more about DEBUG, kindly refer to “Debug_Tutorial” posted on WBLE.
8088/86 Instructions
[Label:] Instruction_mnemonic dest., source ; comment
Label:
• The label is optional.
• It must always begin with a letter and may contain only letters and
digits.
• It cannot duplicate a register name or instruction mnemonic.
Instruction mnemonic:
A three- or four-letter mnemonic indicating the instruction to be performed
Operands (dest., source):
register, register
or register, memory
or memory, register
NOT memory, memory
Comment:
the assembler ignores everything after the semicolon
Labels
Act as place markers
marks the address (offset) of code and data
Code label
target of jump and loop instructions
example:
S1: Opcodes [operands]
S2 Assembler Directives
Example of Assembly Language Program
(From Muhammad Ali Mazidi, The 80x86 IBM PC and Compatible Computers,
Vol.1 and Vol.2)
;THE FORM OF AN ASSEMBLY LANGUAGE PROGRAM
;NOTE: USING SIMPLIFIED SEGMENT DEFINITION
        .MODEL SMALL
        .STACK 64
        .DATA
DATA1   DB 52H
DATA2   DB 29H
SUM     DB ?
        .CODE
MAIN    PROC FAR        ;this is the program entry point
        MOV AX, @DATA   ;load the data segment address
        MOV DS, AX      ;assign value to DS
        MOV AL, DATA1   ;get the first operand
        MOV BL, DATA2   ;get the second operand
        ADD AL, BL      ;add the operands
        MOV SUM, AL     ;store the result in location SUM
        MOV AH, 4CH     ;set up to return to DOS
        INT 21H
MAIN    ENDP
        END MAIN        ;this is the program exit point
Example 2
Program usage no more than 64K data and 64K code
Stack=256bytes
.data=where variables are stored
.code – beginning of the code segment
Beginning of procedure called main
@=address of data
9=msdos function to display a string
Halt the program to return to O/S
End of procedure
End of program
Segment Definition
It is possible to write an Assembly language program that uses only one
segment, but normally a program consists of at least three segments:
.STACK
marks the beginning of the stack segment
stack segment defines storage for the stack
E.g. .STACK 64 ; reserves 64 bytes of memory for the stack
.DATA
marks the beginning of the data segment
data segment defines the data that the program will use
.CODE
marks the beginning of the code segment
code segment contains the Assembly language instructions
Assembler Directives (1)
Instructions to the assembler instead of instructions to be executed at
run-time.
The most commonly used directives are the DB and DW commands to reserve
memory.
E.g.
Data1 DB 88H
Reply DB 'Press any key to continue'
Each line has the form: variable name, assembler directive, numbers or
ASCII string
Assembler Directives-PROC (function)
The first line of the segment after the .CODE directive is the PROC
directive.
A procedure is a group of instructions designed to accomplish a specific
function.
A code segment may consist of only one procedure, but usually it is
organized into several small procedures in order to make the program more
structured.
Every procedure must have:
a name defined by the PROC directive,
followed by the assembly language instructions
and closed by the ENDP directive.
The PROC and ENDP statements must have the same label.
The PROC directive may have the option FAR (a call saves both CS and IP)
or NEAR (the default; a call saves only IP).
Summary of directives
Addressing modes (2)
Based Relative - an 8-bit or 16-bit displacement in the instruction is
added to the contents of a base register (BX or BP); the resulting value is
a pointer to the location where the data resides.
Indexed Relative - an 8-bit or 16-bit displacement in the instruction is
added to the contents of an index register (SI or DI); the resulting value
is a pointer to the location where the data resides.
Based Indexed - the contents of a base register (BX or BP) are added to the
contents of an index register (SI or DI); the resulting value is a pointer
to the location where the data resides.
Based Indexed with displacement - an 8-bit or 16-bit displacement in the
instruction is added to the contents of a base register (BX or BP) and an
index register (SI or DI); the resulting value is a pointer to the location
where the data resides.
Register addressing (1)
Memory is not accessed when this mode is executed.
Fast
Source and destination must have the same size
e.g. ADD AL,BL
Register indirect (1)
A choice of four registers (BX, BP, SI, DI) to use within the square
brackets to specify a memory location.
The offset of the operand is held by one of these registers.
They must be combined with DS (BP is combined with SS by default) in order
to generate the 20-bit physical address.
E.g.
MOV [DI],AH ;move the contents of AH into DS:DI
MOV DL,[SI] ;move the contents of DS:SI into DL
MOV AL,[BX] ;move the contents of the memory location
            ;pointed by DS:BX into AL
Quiz
DS=1234H, SI=2345H, and AX=17ABH.
MOV [SI],AX
is executed. What are the contents of the memory locations?
Contents of AX are moved into memory locations with logical address
DS:SI and DS:SI+1
Therefore the physical address starts at DS*10H + SI = 12340H + 2345H = 14685H
According to little endian…
Low address 14685H contains ABH, the low byte
High address 14686H contains 17H, the high byte
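The quiz arithmetic can be checked with a small Python model. This is only a sketch for verifying the numbers, not 8086 code; the helper names `physical_address` and `store_word` and the dictionary used as memory are assumptions made for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """20-bit physical address: segment * 10H + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def store_word(memory: dict, segment: int, offset: int, value: int) -> None:
    """Store a 16-bit value little-endian: low byte at the lower address."""
    memory[physical_address(segment, offset)] = value & 0xFF
    memory[physical_address(segment, (offset + 1) & 0xFFFF)] = (value >> 8) & 0xFF


memory = {}
store_word(memory, 0x1234, 0x2345, 0x17AB)   # models MOV [SI],AX

start = physical_address(0x1234, 0x2345)
print(f"{start:05X}H")                                    # 14685H
print(f"{memory[0x14685]:02X}H {memory[0x14686]:02X}H")   # ABH 17H
```

Shifting the segment left by four bits is the same as multiplying it by 10H, which is why DS=1234H contributes 12340H to the sum.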
Based Relative (2)
Register  Contents      DS offset  Byte
AX        0000          1B66       35
BX        1B69          1B67       69
CX        1643          1B68       1B
DX        29A2          1B69       24
SP        FF03          1B6A       01
BP        1B03          1B6B       A2
SI        0001
DI        0004
MOV AX, [BX-3H] ;offset = 1B69H - 3H = 1B66H, so AX = 6935H
Indexed Relative (1)
Similar to based relative addressing
Except that registers DI and SI hold the offset address
E.g.
MOV DX,[SI]+3 ;Physical address= DS*10H + SI + 3
MOV CL,[DI]+6 ;Physical address= DS*10H + DI + 6
Based Indexed Relative (1)
Combines based and indexed addressing.
Contents of both registers are treated as unsigned numbers (0 to 65535)
One base register and one index register are used.
E.g.
MOV CH,[BX][SI] + 10 ;Physical address= DS*10H + BX + SI + 10
MOV AH,[BP][SI] + 5 ;Physical address= SS*10H + BP + SI + 5
Based Indexed Relative (2)
Register  Contents      DS offset  Byte
AX        0000          1B66       35
BX        1B69          1B67       69
CX        1643          1B68       1B
DX        29A2          1B69       24
SP        FF03          1B6A       01
BP        1B03          1B6B       A2
SI        0001
DI        0004
MOV AX, [BX+SI-2H] ;offset = 1B69H + 0001H - 2H = 1B68H, so AX = 241BH
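The two worked examples that use this register and memory table (MOV AX,[BX-3H] and MOV AX,[BX+SI-2H]) can be reproduced with a short Python model. It is a sketch for checking the arithmetic only; the function names are hypothetical, and the register and memory values are the ones shown on the slides.

```python
# Register and memory contents copied from the slides (offsets within DS).
regs = {"BX": 0x1B69, "BP": 0x1B03, "SI": 0x0001, "DI": 0x0004}
memory = {0x1B66: 0x35, 0x1B67: 0x69, 0x1B68: 0x1B,
          0x1B69: 0x24, 0x1B6A: 0x01, 0x1B6B: 0xA2}


def effective_address(base=None, index=None, disp=0):
    """Offset = base register + index register + displacement, modulo 64K."""
    total = disp
    if base is not None:
        total += regs[base]
    if index is not None:
        total += regs[index]
    return total & 0xFFFF


def load_word(offset):
    """Read a little-endian word: low byte first, high byte second."""
    return memory[offset] | (memory[(offset + 1) & 0xFFFF] << 8)


ea1 = effective_address(base="BX", disp=-3)              # [BX-3H]
ea2 = effective_address(base="BX", index="SI", disp=-2)  # [BX+SI-2H]
print(f"{ea1:04X} {load_word(ea1):04X}")   # 1B66 6935
print(f"{ea2:04X} {load_word(ea2):04X}")   # 1B68 241B
```

The model only computes the offset inside the segment; the physical address would still be formed by adding the segment register times 10H.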
Note: There are many different ways of naming the addressing modes!
Port Addressing
Use of I/O ports for data communication between the CPU and the outside
world.
One way to get data is to read it from an input port
IN AL,DX
; 8 bits are input to AL from the I/O port whose address is in DX
To write data to an output port
OUT DX,AL
; 8 bits are output from AL to the I/O port whose address is in DX
Addressing Modes: Additional INFO
MOV AH,[BP][SI] + 5
; Physical address= SS*10H + BP + SI + 5
Equivalent to
MOV AH, [BP+SI+5] ; or
MOV AH, [SI+BP+5]
MOV AX, [SI][DI] + offsets ; Valid?
ADC/SBB
ADC AX, BX ; before: AX=1234h, BX=2345h, C=1
 ; after: AX=1234h + 2345h + 1=357Ah,
 ; C=0
AAA/AAS
MOV AL,'9'
AL=39H
AL=10
DAA/DAS
Add 06 or 60 if lower or upper 4-bits greater than 9
MUL/IMUL(signed)/DIV/IDIV(signed)
DIV BH ; should BX=05H and DIV BX give you AX=332C and DX=3 as remainder
CBW/CWD
CBW (Convert signed byte to signed word) will copy D7 (the sign bit of AL)
to all bits in AH.
CWD (convert signed word to signed double word) copies D15 of AX to all
bits of the DX register.
MOV AX,6E2FH ; 28,207 = 0110 1110 0010 1111
MOV CX,13D4H ; +5,076 = 0001 0011 1101 0100
ADD AX,CX ;=33,283 = 1000 0010 0000 0011=-32,253
Smart programming (1)
A program to add 5 bytes of data 25h,12h,15h,1Fh, and 2Bh.
MOV AL,00h
ADD AL,25h
ADD AL,12h
ADD AL,15h
ADD AL,1Fh
ADD AL,2Bh
Data and code are mixed in the instructions here.
The problem with it is if the data changes, the code must be searched for
every place the data is included and the data retyped.
It is a good idea then to set aside an area of memory strictly for data.
The data is first placed in the memory locations
DS:0200 = 25h
DS:0201 = 12h
DS:0202 = 15h
DS:0203 = 1Fh
DS:0204 = 2Bh
MOV AL,0
ADD AL,[0200] ; add the contents of DS:0200 to AL
ADD AL,[0201]
ADD AL,[0202]
ADD AL,[0203]
ADD AL,[0204]
If the data is stored at a different offset address, say 0100h ???
RETYPE THE WHOLE PROG.
Smart programming
Use BX as a pointer
MOV AL,0
MOV BX,0200h
ADD AL,[BX]
INC BX
ADD AL,[BX]
INC BX
ADD AL,[BX]
INC BX
ADD AL,[BX]
INC BX
ADD AL,[BX]
SMARTER WAY BUT STILL LONG…
If the offset address of data is to be changed, only one instruction will
need to be modified
BEST
Conditional jump instruction will be used to implement the counter
checking logic.
data segment
Numlist db 12h,….
count equ 5
result dw 01h dup (?)
data ends
code segment
org 100h
start: mov AX,data
       mov DS,AX
       xor AX,AX
       xor BX,BX
       mov CX,count        ;load the loop counter
       mov SI,offset numlist
again: mov BL,[SI]
       add AX,BX
       inc SI
       dec CX
       jnz again
       mov DI,offset result
       mov [DI],AX
code ENDS
end start
Jump instruction
Divided into:
Unconditional jump
Conditional jump
Three cases of jump action:
SHORT jump --- target location within –128 to +127 bytes of the
instruction that follows the jump.
Intra-segment or NEAR jump --- only IP is changed;
displacement for direct jump is up to ±32K.
Inter-segment or FAR jump --- both CS and IP are changed.
Useful in decision and repetition of a specific portion
of the program.
Decision Instructions for Signed and Unsigned Integers
E.g.1
CMP dest, src ; dest-src to set flags
              ; result is NOT stored
The compare instruction can be used to set flags without modifying the
destination operand.
DEBUG
1234:0100 cmp ax, [800]
1234:0104 jle 109
1234:0106 jmp 200
1234:0109
E.g. 2
AGAIN: MOV AL, [BX]+2 ; Copy 1 byte from memory location DS:(BX+2) into AL
       CMP AL, 61H    ; Compare AL with the value 61h
       JB NEXT        ; Is AL below the
E.g.3
E.g.4
Repeat-Until program and instruction sequence
E.g.5
al = *(bx + 2);
if ( al >= 0x61 && al <= 0x7A ) {
    al = al & 0xDF;
}
*si = al;
0005 8A 47 02 AGAIN: MOV AL, [BX]+2
0008 3C 61           CMP AL, 61H
000A 72 06           JB NEXT
000C 3C 7A           CMP AL, 7AH
000E 77 02           JA NEXT
0010 24 DF           AND AL, 0DFH
0012 88 04    NEXT:  MOV [SI], AL
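The second byte of each conditional jump in the listing (72 06 and 77 02) is the short-jump displacement, measured from the instruction that follows the jump. The Python sketch below recomputes it; `short_jump_disp` is a hypothetical helper name and the two-byte instruction length is an assumption that holds for 8086 short conditional jumps.

```python
def short_jump_disp(instr_offset: int, target_offset: int, length: int = 2) -> int:
    """Return the 8-bit displacement byte of a short jump.

    The displacement is relative to the offset of the NEXT instruction
    (instr_offset + length) and must fit in -128..+127.
    """
    disp = target_offset - (instr_offset + length)
    if not -128 <= disp <= 127:
        raise ValueError("target is out of range for a SHORT jump")
    return disp & 0xFF   # two's complement byte


print(f"{short_jump_disp(0x000A, 0x0012):02X}")  # 06  (JB NEXT)
print(f"{short_jump_disp(0x000E, 0x0012):02X}")  # 02  (JA NEXT)
print(f"{short_jump_disp(0x0015, 0x000E):02X}")  # F7  (a backward jump of 9 bytes)
```

A backward jump gives a negative displacement, which appears in the machine code as a byte of 80H or above.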
LOOPE / LOOPZ
LOOPNE / LOOPNZ
Single and nested loop
Example
Assume that the daily temperatures for the last 30 days have been stored
starting at memory location 1200H. Find the first day that had a 20-degree
temperature.
       MOV CX, 30            ; Set up counter
       MOV DI, 11FFH         ; Set up the pointer, one byte before the data
AGAIN: INC DI                ; Advance first: INC changes ZF, so it must
                             ; come before the CMP that LOOPNE tests
       CMP BYTE PTR [DI], 20 ; Check temperature
       LOOPNE AGAIN          ; Repeat while not equal and CX is not 0
; If ZF is 0, no day was found
Example
Write a program that calculates the average of five temperatures and writes
the result in AX
DATA  DB +13,-10,+19,+14,-18 ;0d,f6,13,0e,ee
      MOV CX,5            ;LOAD COUNTER
      SUB BX, BX          ;CLEAR BX
      MOV SI, OFFSET DATA ;SET UP POINTER
BACK: MOV AL,[SI]         ;MOVE BYTE INTO AL
      CBW                 ;SIGN EXTEND INTO AX
      ADD BX, AX          ;ADD TO BX
      INC SI              ;INCREMENT POINTER
      DEC CX              ;DECREMENT COUNTER
      JNZ BACK            ;LOOP IF NOT FINISHED
      MOV AL,5            ;MOVE COUNT TO AL
      CBW                 ;SIGN EXTEND INTO AX
      MOV CX,AX           ;SAVE DENOMINATOR IN CX
      MOV AX,BX           ;MOVE SUM TO AX
      CWD                 ;SIGN EXTEND THE SUM
      IDIV CX             ;FIND THE AVERAGE
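The averaging program can be traced with a Python model of CBW, CWD and IDIV. This is a sketch under simplifying assumptions: the helper names are made up for illustration, flags are ignored, and the divide-error cases of IDIV (zero divisor, quotient overflow) are not modelled.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def cbw(al: int) -> int:
    """AX after CBW: AH is filled with bit D7 of AL."""
    al &= 0xFF
    return (0xFF00 | al) if al & 0x80 else al


def cwd(ax: int):
    """(DX, AX) after CWD: DX is filled with bit D15 of AX."""
    ax &= 0xFFFF
    return (0xFFFF if ax & 0x8000 else 0x0000), ax


def idiv16(dx: int, ax: int, divisor: int):
    """Signed DX:AX / 16-bit divisor; returns (AX=quotient, DX=remainder).

    The quotient truncates toward zero, as IDIV does.
    """
    dividend = to_signed((dx << 16) | ax, 32)
    d = to_signed(divisor, 16)
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    return quotient & 0xFFFF, remainder & 0xFFFF


data = [0x0D, 0xF6, 0x13, 0x0E, 0xEE]     # +13, -10, +19, +14, -18
bx = 0
for byte in data:                          # the BACK loop
    bx = (bx + cbw(byte)) & 0xFFFF
dx, ax = cwd(bx)
quotient, remainder = idiv16(dx, ax, cbw(5))
print(to_signed(bx, 16), to_signed(quotient, 16), to_signed(remainder, 16))  # 18 3 3
```

The sum of the five temperatures is 18, so IDIV leaves the average 3 in AX and the remainder 3 in DX.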
CALL
Near procedure:
       CALL SUBR1
       ...
SUBR1  PROC NEAR
       ... ; your code
       RET
SUBR1  ENDP
Far procedure:
       CALL SUBR1
       ...
SUBR1  PROC FAR
       ... ; your code
       RETF
SUBR1  ENDP
These labels do NOT have colons after them.
Plain code labels (marked X on the slide):
       MOV AL, 200 ;
       CALL SUBR1
       CALL SUBR2
SUBR1: blah1 ...
       RET
SUBR2: blah2
       RET
DIFFERENT STYLES
Full Segment Definition
;stack segment
Name1 SEGMENT
      db 64 dup (?)
Name1 ENDS
;data segment
Name2 SEGMENT
      value1 db 54
      ;data
Name2 ENDS
;code segment
Name3 SEGMENT
MAIN  PROC FAR
      ASSUME CS: ,DS: ,SS:
      mov AX,Name2
      mov DS,AX
MAIN  ENDP
Name3 ENDS
      END MAIN
Simplified Format
      .model small
      .stack 64
;data segment
      .data
      value1 db 54
      .code
MAIN  PROC FAR
      mov AX,@data
      mov DS,AX
MAIN  ENDP
      END MAIN
DIFFERENT STYLES II
Full Segment Definition
CODE  SEGMENT
      ASSUME CS:CODE, DS:CODE
main  proc far
      .
      .
      call test1
      .
      .
main  endp
Test1 proc near
      .
      .
Test1 endp
Msg   db "…"
CODE  ENDS
      END MAIN
Simplified Format
      .model small
      .stack 64
;data segment
      .data
      value1 db 54
      .code
MAIN  PROC FAR
      mov AX,@data
      mov DS,AX
MAIN  ENDP
      END MAIN
MANY MORE INSTRUCTIONS…. LEARN IT YOURSELF
Hand Coding (1)
OPCODE D W | MOD REG R/M
2 TO 6 BYTES
• Byte 1 (8 bits) has three fields:
Opcode field (6 bits)
Register Direction Bit (D bit)
1: the REG field is the destination
0: the REG field is the source
Data Size Bit (W bit)
0: 8 bits 1: 16 bits
• Byte 2 has three fields:
Mode field (MOD)
Register field (REG)
Register/memory field (R/M field)
Encoding of reg Field when w field is present in instruction
2-bit MOD field and 3-bit R/M field together specify the second
operand
Refer to 8088/86 datasheet
HAND CODING
       MOV AX, 2000H ; LOAD AX REGISTER
       MOV DS, AX    ; LOAD DATA SEGMENT ADDRESS
       MOV SI, 100H  ; LOAD SOURCE BLOCK POINTER
       MOV DI, 120H  ; LOAD DESTINATION BLOCK POINTER
       MOV CX, 10H   ; LOAD REPEAT COUNTER
NXTPT: MOV AH,[SI]   ; MOVE SOURCE BLOCK ELEMENT TO AH
       MOV [DI],AH   ; MOVE SOURCE BLOCK ELEMENT FROM AH TO DEST. BLOCK
       INC SI        ; INCREMENT SOURCE BLOCK POINTER
       INC DI        ; INCREMENT DESTINATION BLOCK POINTER
       DEC CX        ; DECREMENT REPEAT COUNTER
       JNZ NXTPT     ; JUMP TO NXTPT IF CX NOT EQUAL TO ZERO
       NOP           ; NO OPERATION
Identify the type of instruction and hand code the above assembly program!
Refer page 74-79, 113-116 The 8088 and 8086 Microprocessors by Walter A.Triebel and Avtar Singh
HAND CODING (2)
MOV BL,AL
OPCODE D W MOD REG R/M
100010 0 0 11 000 011 = 88 C3h
Opcode = 100010
D = 0 (AL source operand - from)
W bit = 0 (8-bits)
MOD = 11 (register mode)
REG = 000 (from AL)
R/M = 011 (to BL)
OR 8A D8h (D = 1, REG = 011 to BL, R/M = 000 from AL)
MOV DS,AX
OPCODE MOD 0 REG R/M
1000 1110 11 0 11 000 = 8E D8h
Opcode = Move reg. to segment 1000 1110
MOD=11-reg mode no displacement
REG=11- to DS
R/M=000 – from AX
ADD AX,[SI]
OPCODE D W MOD REG R/M
000000 1 1 00 000 100 = 03 04h
Opcode=000000
D = 1 (to register)
W bit = 1 (16-bits)
MOD = 00 (displacement absent)
REG = 000 (to AX)
R/M = 100 ([SI])
ADD [BX][DI] + 1234h, AX
OPCODE D W MOD REG R/M
000000 0 1 10 000 001 = 01 81 34 12h
Opcode=000000
D = 0 (from register)
W bit = 1 (16-bits)
MOD = 10 (16-bits displacement)
REG = 000
R/M = 001 ([BX][DI]+disp)
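The hand-coded bytes above can be cross-checked by packing the fields in Python. This is a sketch, not an assembler: the function name `encode` is hypothetical, the register and R/M tables are the standard 8086 encodings, and only the general OPCODE D W / MOD REG R/M form is handled (segment-register moves such as MOV DS,AX and the direct-address case MOD=00, R/M=110 are left out).

```python
# Standard 8086 REG-field codes (list index = 3-bit code); W selects the table.
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
# R/M codes for memory operands (MOD = 00, 01 or 10).
RM_MEM = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def encode(opcode6, d, w, mod, reg, rm, disp=0):
    """Pack OPCODE/D/W and MOD/REG/R-M into bytes, then append the displacement."""
    regs = REG16 if w else REG8
    reg_code = regs.index(reg)
    if mod == 0b11:
        rm_code = regs.index(rm)            # register-to-register form
    else:
        rm_code = RM_MEM.index(rm)
        if mod == 0b00 and rm_code == 0b110:
            raise ValueError("MOD=00, R/M=110 means a direct address, not [BP]")
    out = [(opcode6 << 2) | (d << 1) | w,
           (mod << 6) | (reg_code << 3) | rm_code]
    if mod == 0b01:
        out.append(disp & 0xFF)                       # 8-bit displacement
    elif mod == 0b10:
        out += [disp & 0xFF, (disp >> 8) & 0xFF]      # 16-bit, low byte first
    return bytes(out)


MOV, ADD = 0b100010, 0b000000
print(encode(MOV, 0, 0, 0b11, "AL", "BL").hex(" "))              # 88 c3
print(encode(MOV, 1, 0, 0b11, "BL", "AL").hex(" "))              # 8a d8
print(encode(ADD, 1, 1, 0b00, "AX", "SI").hex(" "))              # 03 04
print(encode(ADD, 0, 1, 0b10, "AX", "BX+DI", 0x1234).hex(" "))   # 01 81 34 12
```

The two MOV lines show why MOV BL,AL has two valid encodings: flipping D swaps which operand sits in the REG field and which in R/M.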
Answer
       MOV AX, 2000H ; IMMEDIATE DATA TO REGISTER     B80020
       MOV DS, AX    ; MOVE REGISTER TO SEGMENT REG   8ED8
       MOV SI, 100H  ; MOV IMMED. TO REG              BE0001
       MOV DI, 120H  ; MOV IMMED. TO REG              BF2001
       MOV CX, 10H   ; MOV IMMED. TO REG              B91000
NXTPT: MOV AH,[SI]   ; MOV MEMORY DATA TO REG         8A24
       MOV [DI],AH   ; MOV REGISTER DATA TO MEMORY    8825
       INC SI        ; INCREMENT REG.                 46
       INC DI        ; INCREMENT REG.                 47
       DEC CX        ; DECREMENT REG.                 49
       JNZ NXTPT     ; JUMP ON NOT EQUAL TO ZERO      75F7
       NOP           ; NO OPERATION                   90
QUIZ
Hand code the following instructions
MOV CX,7
MOV AL,BL
MOV [6465H],AX
MOV DL,[SI]
MOV AX,[BX+4]
MOV [DL-8],AL
MOV CL,[BX+DI+2080H]
AND AL,[345H]
TEST DX,2003H