
LESSON NO. 07
1.1. SIZE MISMATCH ERRORS
If we change the directive in the last example from DW to DB, the program will still
assemble and run in the debugger without errors; however, the results will not be what
we expect. When the first operand is read, 0A05 is loaded into the register, a value
formed from two operands placed in consecutive byte memory locations. The second number
is read as 000F, which is the zero byte of num4 appended to the 15 of num3. The
third number is junk that depends on the current state of the machine. According to
our data declaration the third number should be at 0114 but it is accessed at 011D
calculated with word offsets. This is a logical error in the program. Keeping the
declarations and their accesses synchronized is the responsibility of the programmer,
not the assembler. The assembler lets the programmer do anything that can possibly run
on the processor. It only keeps us from writing illegal instructions which the
processor cannot execute. This is the difference between a syntax error and a logic
error. The assembler and debugger have both done what we asked them to do, but the
programmer asked them to do the wrong chore.
The programmer is responsible for accessing the data as a word if it was declared as a
word and as a byte if it was declared as a byte. The word case is shown in many
previous examples. If, however, the intent is to treat the data as bytes, the following
code shows the appropriate way.

Example 2.5
001 ; a program to add three numbers using byte variables
002 [org 0x0100]
003 mov al, [num1] ; load first number in al
004 mov bl, [num1+1] ; load second number in bl
005 add al, bl ; accumulate sum in al
006 mov bl, [num1+2] ; load third number in bl
007 add al, bl ; accumulate sum in al
008 mov [num1+3], al ; store sum at num1+3
009
010 mov ax, 0x4c00 ; terminate program
011 int 0x21
012
013 num1: db 5, 10, 15, 0
003 The number is read in AL register which is a byte register since the
memory location read is also of byte size.
004 The second number is now placed at num1+1 instead of num1+2
because of byte offsets.
013 To declare data db is used instead of dw so that each data declared
occupies one byte only.

Inside the debugger we observe that the AL register takes the appropriate values and
the sum is calculated and stored in num1+3. This time there is no alignment or
synchronization error. The key thing to understand here is that the processor does not
match definitions to accesses. That is the programmer’s responsibility. In general,
assembly language gives a lot of power to the programmer, but power comes with
responsibility. Assembly language programming is not a difficult task but a responsible one.
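
For contrast, the mismatch described at the start of this section can be sketched as
follows. This is only an illustration that assumes the same three numbers as Example
2.5; the values in the comments follow from the little-endian byte order of the
processor, and the third read is left unspecified because it falls outside the declared
data.

```nasm
; sketch: data declared as bytes but accessed as words (the logical error)
[org 0x0100]
        mov  ax, [num1]       ; reads bytes 5 and 10 together: ax = 0x0A05
        mov  bx, [num1+2]     ; reads bytes 15 and 0 together: bx = 0x000F
        add  ax, bx           ; ax = 0x0A14, not the intended sum
        mov  bx, [num1+4]     ; reads past the declared data: unpredictable
        add  ax, bx
        mov  [num1+6], ax     ; also writes outside the declared data

        mov  ax, 0x4c00       ; terminate program
        int  0x21

num1:   db   5, 10, 15, 0
```

The program assembles cleanly because every instruction is legal on its own; only the
agreement between declaration and access is broken.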
In the above examples, the size of the data movement operation was known from the
size of the register involved. For example, in “mov ax, [num1]” memory can be
accessed as a byte or as a word, since it has no hard and fast size, but the AX register
tells that this has to be a word operation. Similarly, in “mov al, [num1]” the AL
register tells that this has to be a byte operation. However, in “mov ax, bl” the AX
register tells that the operation has to be a word operation while BL tells that it has
to be a byte operation. The assembler will declare that this is an illegal instruction. A
5 kg bag cannot fit inside a 1 kg bag, and according to Intel a 1 kg bag cannot fit in a
5 kg bag either. They must match in size. The instruction “mov [num1], [num2]” is
illegal, as previously discussed, not because of data movement size but because memory
to memory moves are not allowed at all.
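
A minimal sketch of how the two illegal forms are commonly worked around is given
below. The values 7 and 25, the use of DX as a temporary, and the choice to clear AH are
assumptions made for illustration; clearing AH is only appropriate when the byte is
treated as an unsigned number.

```nasm
; sketch: legal replacements for "mov ax, bl" and "mov [num1], [num2]"
[org 0x0100]
        mov  bl, 7            ; example byte value (assumed)
        ; mov ax, bl          ; illegal: word destination, byte source
        mov  al, bl           ; byte to byte move is legal
        mov  ah, 0            ; clear the upper half so that ax = 0x0007

        ; mov [num1], [num2]  ; illegal: memory to memory move
        mov  dx, [num2]       ; go through a register instead
        mov  [num1], dx       ; num1 now holds 25

        mov  ax, 0x4c00       ; terminate program
        int  0x21

num1:   dw   0
num2:   dw   25
```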
The instruction “mov [num1], 5” is legal for the processor, but nothing in it indicates
the data movement size. The variable num1 can be treated as a byte or as a word, and
similarly 5 can be treated as a byte or as a word. Such instructions are declared
ambiguous by the assembler. The assembler has no way to guess the intent of the
programmer, as it previously did using the size of the register involved, because there
is no register involved this time. Memory is a linear array and a label is just an
address in it. There is no size associated with a label. Therefore, to resolve the
ambiguity we clearly tell our intent to the assembler in one of the following ways.
mov byte [num1], 5
mov word [num1], 5
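
The effect of the two forms on memory can be seen with a small sketch. The labels var1
and var2 and the 0xFF filler bytes are hypothetical, chosen only so that the touched
bytes stand out in the debugger's data window.

```nasm
; sketch: the same immediate stored with byte size and with word size
[org 0x0100]
        mov  byte [var1], 5   ; changes one byte:  var1 becomes 05 FF
        mov  word [var2], 5   ; changes two bytes: var2 becomes 05 00

        mov  ax, 0x4c00       ; terminate program
        int  0x21

var1:   db   0xFF, 0xFF
var2:   db   0xFF, 0xFF
```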

1.2. REGISTER INDIRECT ADDRESSING


We have done very elementary data access till now. Assume that we had 100 numbers
and not just three. The present way of adding them would cost us 200 instructions.
There must be some method to do a task repeatedly on data placed in consecutive
memory cells. The key to this is a register that can hold the address of data, so that
we can change the address and access some other cell of memory using the same
instruction. In direct addressing mode the memory cell accessed was fixed inside
the instruction. There is another method in which the address is placed in a
register so that it can be changed. A later example takes 10 instead of
100 numbers, but the algorithm is extensible to any size.
There are four registers in the iAPX88 architecture that can hold the address of data:
BX, BP, SI, and DI. There are minute differences in their working which will be
discussed later. For the current example, we will use the BX register and take
just three numbers, and extend the concept to more numbers in later examples.

Example 2.6
001 ; a program to add three numbers using indirect addressing
002 [org 0x100]
003 mov bx, num1 ; point bx to first number
004 mov ax, [bx] ; load first number in ax
005 add bx, 2 ; advance bx to second number
006 add ax, [bx] ; add second number to ax
007 add bx, 2 ; advance bx to third number
008 add ax, [bx] ; add third number to ax
009 add bx, 2 ; advance bx to result
010 mov [bx], ax ; store sum at num1+6
011
012 mov ax, 0x4c00 ; terminate program
013 int 0x21
014
015 num1: dw 5, 10, 15, 0
003 Observe that no square brackets around num1 are used this time.
The address is loaded in bx and not the contents. Value of num1 is
0005 and the address is 0117. So BX will now contain 0117.
004 Brackets are now used around BX. In the iAPX88 architecture brackets
can be used around BX, BP, SI, and DI only. In iAPX386 more
registers are allowed. The instruction will be read as “move into ax the
contents of the memory location whose address is in bx.” Since
bx contains the address of num1, the contents of num1 are
transferred to the ax register. Without square brackets the meaning of
the instruction would have been totally different.
005 This instruction is changing the address. Since we have words not
bytes, we add two to bx so that it points to the next word in memory.
BX now contains 0119 the address of the second word in memory.
This was the mechanism to change addresses that we needed.

Inside the debugger we observe that the first instruction is “mov bx, 011C.” A
constant is moved into BX. This is because we did not use square brackets around
“num1.” The address of “num1” has moved to 011C because the code size has changed
due to the changed instructions. In the second instruction BX points to 011C and the
value read in AX is 0005, which can be verified from the data window. After the addition
BX points to 011E containing 000A, our next word, and so on. This way the BX register
points to our words one after another and we can add them using the same instruction
“add ax, [bx]” without fixing the address of our data in the instructions. We can also
subtract from BX to point to previous cells. The address to be accessed is now under
complete program control.
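
As a sketch of the remark about subtracting from BX, the same three words can be added
starting from the last one and walking backwards. This variation is an illustration
only and is not one of the numbered examples; it assumes the data of Example 2.6.

```nasm
; sketch: adding the three numbers from last to first
[org 0x0100]
        mov  bx, num1+4       ; point bx to the third number
        mov  ax, [bx]         ; load third number in ax
        sub  bx, 2            ; step bx back to the second number
        add  ax, [bx]         ; add second number to ax
        sub  bx, 2            ; step bx back to the first number
        add  ax, [bx]         ; add first number to ax
        mov  [num1+6], ax     ; store sum (30 = 0x001E) at num1+6

        mov  ax, 0x4c00       ; terminate program
        int  0x21

num1:   dw   5, 10, 15, 0
```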
One thing that we needed in our problem of adding a hundred numbers was the capability
to change the address. The second thing we need is a way to repeat the same instruction
and a way to know that the repetition has been done 100 times, a terminal condition for
the repetition. For this task we introduce two new instructions that you should read
and understand as simple English language concepts. For simplicity only 10 numbers
are added in this example. The algorithm is extensible to any size.

Example 2.7
001 ; a program to add ten numbers
002 [org 0x0100]
003 mov bx, num1 ; point bx to first number
004 mov cx, 10 ; load count of numbers in cx
005 mov ax, 0 ; initialize sum to zero
006
007 l1: add ax, [bx] ; add number to ax
008 add bx, 2 ; advance bx to next number
009 sub cx, 1 ; numbers to be added reduced
010 jnz l1 ; if numbers remain add next
011
012 mov [total], ax ; write back sum in memory
013
014 mov ax, 0x4c00 ; terminate program
015 int 0x21
016
017 num1: dw 10, 20, 30, 40, 50, 10, 20, 30, 40, 50
018 total: dw 0
007 Labels can be used on code as well. Just like data labels they
remember the address at which they are used. The assembler does
not differentiate between code labels and data labels. The
programmer is responsible for using a data label as data and a code
label as code. The label l1 in this case is the address of the
instruction that follows it.
009 SUB is the counterpart to ADD with the same rules as that of the
ADD instruction.
010 JNZ stands for “jump if not zero.” NZ is the condition in this
instruction. So the instruction is read as “jump to the location l1 if
the zero flag is not set.” Recall the zero flag definition: “the zero
flag is set if the last mathematical or logical operation has produced a
zero in its destination.” For example, “mov ax, 0” will not affect the zero
flag as it is not a mathematical or logical instruction. However,
subtraction and addition will update it, even when the
destination is not a register. Now consider the subtraction
immediately preceding the jump. If the CX register becomes zero as a result
of this subtraction, the zero flag will be set and the jump will not be
taken, so execution continues with the next instruction. Otherwise the
zero flag is clear and the jump is taken.
As for the destination l1, the processor needs to be told each and everything
and the destination is an important part of every jump. Just like
when we ask someone to go, we mention go to this market or that
house. The processor is much more logical than us and needs the
destination in every instruction that asks it to go somewhere. The
processor will load l1 in the IP register and resume execution from
there. The processor will blindly go to the label we mention even if it
contains data and not code.

The CX register is used as a counter in this example, BX contains the changing
address, while AX accumulates the result. We have formed a loop in assembly language
that executes as long as its condition remains true. Inside the debugger we can observe
that the subtract instruction clears the zero flag the first nine times and sets it on
the tenth time, while the jump instruction moves execution to address l1 the first nine
times and to the following line the tenth time. The jump instruction breaks the
sequential flow of the program.
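
The same loop can be combined with the byte declarations of the previous section. The
sketch below is an assumed variation, not one of the numbered examples: the data is
declared with db, so the accumulator is AL and the address advances by one instead of
two. The numbers are kept small so that the sum fits in a byte.

```nasm
; sketch: adding ten byte-sized numbers in a loop
[org 0x0100]
        mov  bx, num1         ; point bx to first number
        mov  cx, 10           ; load count of numbers in cx
        mov  al, 0            ; initialize sum to zero

l1:     add  al, [bx]         ; add byte-sized number to al
        add  bx, 1            ; advance bx by one byte
        sub  cx, 1            ; numbers to be added reduced
        jnz  l1               ; if numbers remain add next

        mov  [total], al      ; write back sum (55 = 0x37) in memory

        mov  ax, 0x4c00       ; terminate program
        int  0x21

num1:   db   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
total:  db   0
```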
The JNZ instruction is from the program control group and is a conditional jump,
meaning that if the condition NZ is true (ZF=0) it will jump to the address mentioned,
and otherwise it will progress to the next instruction. It is a selection between two
paths. If the condition is true go right, and otherwise go left. Or we can say if the
weather is hot, go this way, and if it is cold, go that way. The conditional jump is the
most important instruction, as it gives the processor decision making capability, so it
must be given careful thought. Some processors call it branch, probably a more logical
name for it; however, the functionality is the same. Intel chose to name it “jump.”
An important thing in the above example is that a register is used to reference
memory, so this form of access is called register indirect memory access. We used the
BX register for it. The B in BX and BP stands for base, therefore we call register
indirect memory access using BX or BP “based addressing.” Similarly, when SI or DI is
used we name the method “indexed addressing.” They have the same functionality, with
minor differences because of which the two are called base and index. The differences
will be explained later; however, for the above example SI or DI could be used as well,
but we would name it indexed addressing instead of based addressing.
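
As a sketch of that last remark, Example 2.7 can be rewritten with SI in place of BX.
Only the register holding the address changes; the data and the expected sum are the
same as in the example.

```nasm
; sketch: the ten-number sum using indexed addressing with SI
[org 0x0100]
        mov  si, num1         ; point si to first number
        mov  cx, 10           ; load count of numbers in cx
        mov  ax, 0            ; initialize sum to zero

l1:     add  ax, [si]         ; add number to ax
        add  si, 2            ; advance si to next number
        sub  cx, 1            ; numbers to be added reduced
        jnz  l1               ; if numbers remain add next

        mov  [total], ax      ; write back sum (300 = 0x012C) in memory

        mov  ax, 0x4c00       ; terminate program
        int  0x21

num1:   dw   10, 20, 30, 40, 50, 10, 20, 30, 40, 50
total:  dw   0
```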
