Linking & Relocation
The DOS linking program links the different object modules of a source program and
function library routines to generate an integrated executable code of the source program.
The main input to the linker is the .OBJ modules of the source programs. Other supporting
information may be obtained from the files generated by MASM.
The linker program is invoked using one of the following commands:
C> LINK
or
C> LINK MS.OBJ
The .OBJ extension is a must for a file to be accepted by LINK as a valid object file.
The first form may generate a display asking for the object file, list file and libraries as
inputs and an expected name of the .EXE file to be generated. The output of the link
program is an executable file with the entered filename and the .EXE extension. This
executable filename can then be entered at the DOS prompt to execute the file.
In the advanced versions of MASM, assembling and linking are combined under a single
menu-invokable compile function. The recent versions of MASM have much more
sophisticated and user-friendly facilities and options. A linker links the machine codes
with the other required assembled codes. Linking is necessary because the final binary
file is usually built from a number of separately assembled codes.
The linked file in binary form, ready to run on a computer, is commonly known as an
executable file or simply a .exe file. After linking, the sequence in which the codes are
placed has to be reallocated before the codes are actually placed in memory.
The loader program performs the task of reallocating the codes after finding the
physical RAM addresses available at a given instant.
The available memory addresses may not start from 0x0000, and the binary codes may have
to be loaded at different addresses on each run. The loader finds the appropriate start
address.
In a computer, the loader loads the program that is ready to run into a section of RAM.
A program called a locator reallocates the linked file and creates a file for permanent
location of the codes in a standard format.
The way segments from different modules are combined is specified in the SEGMENT
directive, which has the form
Segment-name SEGMENT Combine-type
where the combine-type indicates how the segment is to be located within the load
module. Segments that have different names cannot be combined, and segments with the
same name but no combine-type will cause a linker error. The possible combine-types
are:
PUBLIC If the segments in different modules have the same name and the combine-type
PUBLIC, then they are concatenated into a single segment in the load module. The
ordering in the concatenation is specified by the linker command.
COMMON If the segments in different object modules have the same name and the
combine-type COMMON, then they are overlaid so that they have the same starting
address. The length of the common segment is that of the longest segment being overlaid.
STACK If segments in different object modules have the same name and the combine-type
STACK, then they become one segment whose length is the sum of the lengths of
the individually specified segments. In effect, they are combined to form one large stack.
MEMORY This combine-type causes the segment to be placed last in the load
module. If more than one segment with the MEMORY combine-type is being linked,
only the first one will be treated as having the MEMORY combine-type; the others will
be overlaid as if they had the COMMON combine-type.
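The length rules for the combine-types can be summarized in a few lines of Python. This is only a sketch of the arithmetic: the function name `combined_length` is hypothetical, any alignment padding between concatenated pieces is ignored, and MEMORY is left out because it governs placement rather than length.

```python
def combined_length(combine_type, lengths):
    """Length of the segment built from same-named segments in several modules."""
    if combine_type in ("PUBLIC", "STACK"):
        # Concatenated: the pieces follow one another in the load module.
        return sum(lengths)
    if combine_type == "COMMON":
        # Overlaid: every piece starts at the same address,
        # so the longest piece decides the length.
        return max(lengths)
    raise ValueError("unsupported combine-type: " + combine_type)


print(combined_length("PUBLIC", [0x40, 0x25]))  # 101
print(combined_length("COMMON", [0x40, 0x25]))  # 64
print(combined_length("STACK", [20, 30]))       # 50
```

The last call corresponds to two stack segments of 20 and 30 words combining into one stack of 50 words.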
Source module 1
DATA    SEGMENT COMMON
        .
        .
DATA    ENDS
CODE    SEGMENT PUBLIC
        .
        .
CODE    ENDS

Source module 2
DATA    SEGMENT COMMON
        .
        .
DATA    ENDS
CODE    SEGMENT PUBLIC
        .
        .
CODE    ENDS

[Figure: in the load module, the code in source module 1 and the code in source module 2
are concatenated into a single code segment.]
Figure 4.2 Segment combinations resulting from the PUBLIC and COMMON
combination types
Source module 1
STACK_SEG SEGMENT STACK
          DW 20 DUP (?)
          .
          .

Source module 2
STACK_SEG SEGMENT STACK
          DW 30 DUP (?)
STACK_SEG ENDS
          .
          .
          END

[Figure: the combined STACK_SEG is 50 words long (20 + 30 words), with the label
TOP_OF_STACK marking the top of the stack.]
Each object module must contain two lists: one of the external identifiers it refers to,
and one of its own identifiers that can be referred to by other modules.
The two lists are implemented by the EXTRN and PUBLIC directives, which have
the forms:
EXTRN Identifier:Type, ..., Identifier:Type
and
PUBLIC Identifier, ..., Identifier
where the identifiers are the variables and labels being declared as external or as being
available to other modules.
Because the assembler must know the type of all external identifiers before it can
generate the proper machine code, a type specifier must be associated with each
identifier in an EXTRN statement.
For a variable the type may be BYTE, WORD, or DWORD, and for a label it may
be NEAR or FAR.
One of the primary tasks of the linker is to verify that every identifier appearing in an
EXTRN statement is matched by one in a PUBLIC statement. If this is not the case, then
there will be an undefined reference and a linker error will occur.
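The matching of EXTRN against PUBLIC declarations can be illustrated with a short Python sketch. It is not how any particular linker is implemented; the dictionary layout and the function name `undefined_references` are assumptions made for this example.

```python
def undefined_references(modules):
    """Map each module name to the EXTRN identifiers no module declares PUBLIC."""
    # Collect every identifier that some module makes available.
    public = set()
    for declarations in modules.values():
        public.update(declarations["PUBLIC"])

    # An EXTRN identifier with no PUBLIC counterpart is an undefined reference.
    missing = {}
    for name, declarations in modules.items():
        unresolved = sorted(set(declarations["EXTRN"]) - public)
        if unresolved:
            missing[name] = unresolved
    return missing


modules = {
    "MOD1": {"PUBLIC": ["VAR1"], "EXTRN": ["PROC2"]},
    "MOD2": {"PUBLIC": ["PROC2"], "EXTRN": ["VAR1", "VAR3"]},
}
print(undefined_references(modules))  # {'MOD2': ['VAR3']}
```

An empty result means every external reference is satisfied; anything else corresponds to the linker error described above.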
The offsets for the local identifiers will be inserted by the assembler, but the
offsets for the external identifiers and all segment addresses must be inserted by the
linking process. The offsets associated with all external references can be assigned once
all of the object modules have been found and their external symbol tables have been
examined.
The assignment of the segment addresses is called relocation and is done after the
linking process has determined exactly where each segment is to be put in memory.
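Relocation of segment addresses can be pictured with the following Python sketch. It assumes, for illustration only, that the linker has recorded the byte offsets of every 16-bit segment value in the image (the `fixups` list is a hypothetical stand-in for that information) and that the segment at which the program is loaded is added to each of them.

```python
def relocate(image, fixups, load_segment):
    """Add load_segment to every 16-bit segment value listed in fixups."""
    patched = bytearray(image)
    for offset in fixups:
        value = int.from_bytes(patched[offset:offset + 2], "little")
        value = (value + load_segment) & 0xFFFF   # segment values are 16 bits
        patched[offset:offset + 2] = value.to_bytes(2, "little")
    return bytes(patched)


# MOV AX,0000H ; MOV DS,AX  (the immediate word is a segment value)
image = bytes([0xB8, 0x00, 0x00, 0x8E, 0xD8])
print(relocate(image, fixups=[1], load_segment=0x1234).hex())  # b834128ed8
```

After patching, the MOV AX instruction loads 1234H, the segment at which the program was assumed to be placed.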
4.2 STACKS
The stack is a block of memory that may be used for temporarily storing the contents of
the registers inside the CPU. It is a top-down data structure whose elements are accessed
using the stack pointer (SP) which gets decremented by two as we store a data word into
the stack and gets incremented by two as we retrieve a data word from the stack back to
the CPU register.
The process of storing the data in the stack is called pushing into the stack and
the reverse process of transferring the data back from the stack to the CPU register is
known as popping off the stack.
The stack is essentially a Last-In-First-Out (LIFO) data segment. This means that the data
which is pushed into the stack last will be on top of the stack and will be popped off the
stack first. The stack pointer is a 16-bit register that contains the offset address of the
memory location in the stack segment.
The stack segment, like any other segment, may have a memory block of at most
64 Kbytes, and may overlap with any other segment. The Stack Segment register (SS)
contains the base address of the stack segment in memory.
The Stack Segment register (SS) and the stack pointer register (SP) together address the
stack top, as in the following example:
SS = 5000H
SP = 2050H
Physical address of the stack top = SS x 10H + SP = 50000H + 2050H = 52050H
If the stack top points to memory location 52050H, it means that location 52050H is
already occupied by previously pushed data. The next 16-bit push operation will
decrement the stack pointer by two, so that it points to the new stack top 5204EH, and
the decremented contents of SP will be 204EH. This location will now be occupied by
the recently pushed data.
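The address arithmetic in this example can be checked with a small Python model. The function names are hypothetical, and the model tracks only addresses, not the data being stored.

```python
def physical_address(segment, offset):
    """20-bit physical address formed from a 16-bit segment and a 16-bit offset."""
    return ((segment << 4) + offset) & 0xFFFFF


def push_word(ss, sp):
    """Return (new SP, physical address written) for a 16-bit push."""
    new_sp = (sp - 2) & 0xFFFF          # SP is decremented by two first
    return new_sp, physical_address(ss, new_sp)


def pop_word(ss, sp):
    """Return (new SP, physical address read) for a 16-bit pop."""
    return (sp + 2) & 0xFFFF, physical_address(ss, sp)


print(f"{physical_address(0x5000, 0x2050):05X}H")  # 52050H
sp, address = push_word(0x5000, 0x2050)
print(f"{sp:04X}H {address:05X}H")                 # 204EH 5204EH
```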
[Figure: SS = 5000H places the base of the stack segment at physical address 50000H;
SP = 2050H is the offset of the stack top, which lies at physical address 52050H.]
Thus, for a selected value of SS, the maximum value of SP is FFFFH and the segment can
have a maximum of 64K locations.
ASSUME CS:CODE,DS:DATA,SS:STACK
DATA SEGMENT
ORG 2000H
SQUARES DB 0FH DUP (?)
DATA ENDS
STACK SEGMENT
STACKDATA DB 100H DUP (?) ;Reserve 256 bytes for stack
STACK ENDS
CODE SEGMENT
START: MOV AX,DATA ;Initialize data segment
MOV DS,AX
MOV AX,STACK ;Initialize stack segment
MOV SS,AX
MOV SP,OFFSET STACKDATA+100H ; Initialize stack pointer to the top of the reserved block
MOV CL,0AH
MOV AL,00H
.
.
INC SI ; Go to next number
DEC CL ; Decrement counter
JNZ NEXTNUM
MOV AH,4CH
INT 21H
.
.
; Successively add CH to AL
DAA
DEC CH
JNZ AGAIN
MOV AH,AL
POP BX ; The 8086 POP instruction takes a 16-bit operand
MOV AL,BH
RET
SQUARE ENDP
CODE ENDS
END START
Program 4.2
Write a program to change a sequence of sixteen 2-byte numbers from ascending to
descending order. The numbers are stored in the data segment. Store the new series at
addresses starting from 6000H. Use the LIFO property of the stack.
ASSUME CS:CODE, DS:DATA
DATA SEGMENT
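LIST DW 10H DUP (?) ; Sixteen 2-byte numbers in ascending order
ORG 6000H
RESULT DW 10H DUP (?) ; Sixteen words for the reversed series
COUNT EQU 10H
STACKDATA DB 0FFH DUP (?)
DATA ENDS
CODE SEGMENT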
START: MOV AX,DATA ;Initialize data segment
MOV DS,AX
MOV SS,AX
MOV SP,OFFSET LIST
MOV CL,COUNT
MOV BX, OFFSET RESULT+2*COUNT ; BX = end of the 16-word result area
NEXT:POP AX
MOV DX,SP
MOV SP,BX
PUSH AX
MOV BX,SP
MOV SP,DX
DEC CL
JNZ NEXT
MOV AH,4CH
INT 21H
CODE ENDS
END START
4.3 PROCEDURES
A procedure is a set of code that can be branched to and returned from in such a way that
the code is as if it were inserted at the point from which it is branched to. The branch to
procedure is referred to as the call, and the corresponding branch back is known as the
return. The return is always made to the instruction immediately following the call
regardless of where the call is located.
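A procedure is delimited by the PROC and ENDP directives, which have the form
Procedure-name PROC Attribute
.
.
Procedure-name ENDP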
The attribute that can be used will be either NEAR or FAR. If the attribute is NEAR, the
RET instruction will only pop a word into the IP register, but if it is FAR, it will also pop
a word into the CS register.
A procedure may be in:
1. The same code segment as the statement that calls it.
2. A code segment that is different from the one containing the statement that calls
it, but in the same source module as the calling statement.
3. A different source module and segment from the calling statement.
In the first case, the attribute could be NEAR provided that all calls are in the same
code segment as the procedure. For the latter two cases the attribute must be FAR. If
the procedure is given a FAR attribute, then all calls to it must be intersegment calls
even if the call is from the same code segment. For the third case, the procedure name
must be declared in EXTRN and PUBLIC statements.
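The reason a FAR procedure must always be reached by an intersegment call can be seen from what the call and the return place on and take off the stack. The following Python sketch uses a plain list as the stack; the function names `call` and `ret` are hypothetical, and only the return address is modelled.

```python
def call(stack, cs, return_ip, far):
    """Push the return address: IP only for a NEAR call, CS then IP for a FAR call."""
    if far:
        stack.append(cs)
    stack.append(return_ip)


def ret(stack, cs, far):
    """Pop the return address and return the (CS, IP) pair execution resumes at."""
    ip = stack.pop()
    if far:
        cs = stack.pop()    # a FAR return also restores CS
    return cs, ip


stack = []
call(stack, cs=0x1000, return_ip=0x0105, far=True)
print(ret(stack, cs=0x2000, far=True))   # (4096, 261), i.e. 1000H:0105H
print(stack)                             # []
```

If a NEAR call were paired with a FAR return, the return would pop one word more than the call pushed, which is why the attribute of the procedure and the type of every call to it have to agree.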
exceptional condition which initiates a type 0 interrupt, and such an interrupt is also
called an exception.
In general, the process of interrupting the normal program execution to carry out a
specific task is referred to as an interrupt.
The interrupt is initiated by a signal generated by an external device or by a signal
generated internally by the processor. When a microprocessor receives an interrupt signal
it stops executing the current program, saves the status (or content) of various registers
(IP, CS and the flag register in the case of the 8086) on the stack, and then executes a
subroutine/procedure to perform the specific task requested by the interrupt. The
subroutine/procedure that is executed in response to an interrupt is called the
Interrupt Service Routine (ISR). At the end of the ISR, the status stored on the stack
is restored to the respective registers, and the processor resumes normal program
execution from the point (instruction) where it was interrupted.
The external interrupts are used to implement interrupt-driven data transfer
schemes. The interrupts generated by special instructions are called software interrupts
and they are used to implement system services/calls (or monitor services/calls). The
system/monitor services are procedures developed by the system designer for various
operations and stored in memory. The user can call these services through software
interrupts. The interrupts generated by exceptional conditions are used to handle error
conditions in the system.
All 8086 interrupts are vectored interrupts. The vector address for an 8086 interrupt is
obtained from a vector table implemented in the first 1 KB of memory (00000H to 003FFH).
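The position of an entry in the vector table follows directly from the type number, since each entry holds a 2-byte IP followed by a 2-byte CS. The Python sketch below models the table as a 1 KB byte array; the function names and the sample handler address are assumptions made for the example.

```python
def vector_address(type_number):
    """Address of the 4-byte pointer for the given interrupt type (0 to 255)."""
    if not 0 <= type_number <= 255:
        raise ValueError("interrupt type must be in the range 0 to 255")
    return 4 * type_number


def read_vector(memory, type_number):
    """Return (CS, IP) stored for an interrupt type: IP word first, then CS word."""
    base = vector_address(type_number)
    ip = memory[base] | (memory[base + 1] << 8)
    cs = memory[base + 2] | (memory[base + 3] << 8)
    return cs, ip


memory = bytearray(1024)                                   # 256 entries x 4 bytes
memory[0x0C:0x10] = bytes([0x00, 0x01, 0x00, 0x20])        # type 3 -> 2000H:0100H
print(f"{vector_address(255):05X}H")                       # 003FCH
print([f"{value:04X}H" for value in read_vector(memory, 3)])  # ['2000H', '0100H']
```

The last entry starts at 003FCH and occupies the bytes up to 003FFH, which is the end of the first kilobyte.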
The processor has the facility for accepting or rejecting hardware interrupts.
Programming the processor to reject an interrupt is referred to as masking or disabling,
and programming the processor to accept an interrupt is referred to as unmasking or
enabling. In the 8086 the interrupt flag (IF) can be set to one to unmask or enable all
hardware interrupts, and IF is cleared to zero to mask or disable all hardware interrupts
except NMI.
The interrupts whose request can be either accepted or rejected by the processor
are called maskable interrupts. The interrupts whose request has to be definitely accepted
(or cannot be rejected) by the processor are called non-maskable interrupts. Whenever a
request is made by non-maskable interrupt, the processor has to definitely accept that
request and service that interrupt by suspending its current program and executing an
ISR. In 8086 processor all the hardware interrupts initiated through INTR pin are
maskable by clearing interrupt flag (IF). The interrupt initiated through NMI pin and all
software interrupts are non-maskable.
4.4.4 Sources of Interrupts in 8086
An interrupt in 8086 can come from one of the following three sources.
1. One source is from an external signal applied to NMI or INTR input pin of the
processor. The interrupts initiated by applying appropriate signals to these input
pins are called hardware interrupts.
2. A second source of an interrupt is execution of the interrupt instruction "INT n",
where n is the type number. The interrupts initiated by "INT n" instructions are
called software interrupts.
3. The third source of an interrupt is from some condition produced in the 8086 by
the execution of an instruction. An example of this type of interrupt is divide by
zero interrupt. Program execution will be automatically interrupted if you attempt
to divide an operand by zero. Such conditional interrupts are also known as
exceptions.
Address    Contents
00000      Pointer for type 0
00004      Pointer for type 1
00008      Pointer for type 2
0000C      Pointer for type 3
00010      Pointer for type 4
  .          .
4*N        Pointer for type N
  .          .
003FC      Pointer for type 255
The remaining pointers (type N up to type 255) are reserved for two-byte INT
instructions and maskable external interrupts.
Name: Interrupt with type
Mnemonic: INT TYPE
Description:
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (PSW)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (CS)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (IP)
    (IP) ← (TYPE * 4)
    (CS) ← ((TYPE * 4) + 2)

Name: One-byte interrupt
Mnemonic: INT
Description:
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (PSW)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (CS)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (IP)
    (IP) ← (0CH)
    (CS) ← (0EH)

Name: Interrupt on overflow
Mnemonic: INTO
Description: If (OF) = 1, then
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (PSW)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (CS)
    (SP) ← (SP) - 2
    ((SP)+1:(SP)) ← (IP)
    (IP) ← (10H)
    (CS) ← (12H)

Name: Return from interrupt
Mnemonic: IRET
Description:
    (IP) ← ((SP)+1:(SP))
    (SP) ← (SP) + 2
    (CS) ← ((SP)+1:(SP))
    (SP) ← (SP) + 2
    (PSW) ← ((SP)+1:(SP))
    (SP) ← (SP) + 2
IRET is used to return from an interrupt service routine. It is similar to the RET
instruction except that it pops the original contents of the PSW from the stack as well as
the return address.
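The push and pop sequences of INT and IRET can be traced with a small Python model. It is a sketch under stated assumptions: the CPU is a dictionary of register values, memory is a dictionary of bytes, all names are hypothetical, and the clearing of the IF and TF flags that the 8086 also performs on an interrupt is left out.

```python
def read_word(mem, addr):
    """Read a little-endian word; unwritten bytes read as zero."""
    return mem.get(addr, 0) | (mem.get(addr + 1, 0) << 8)


def write_word(mem, addr, value):
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF


def push(cpu, mem, value):
    cpu["SP"] = (cpu["SP"] - 2) & 0xFFFF
    write_word(mem, ((cpu["SS"] << 4) + cpu["SP"]) & 0xFFFFF, value)


def pop(cpu, mem):
    value = read_word(mem, ((cpu["SS"] << 4) + cpu["SP"]) & 0xFFFFF)
    cpu["SP"] = (cpu["SP"] + 2) & 0xFFFF
    return value


def int_n(cpu, mem, n):
    """INT n: push PSW, CS, IP, then load IP and CS from the vector table."""
    for register in ("PSW", "CS", "IP"):
        push(cpu, mem, cpu[register])
    cpu["IP"] = read_word(mem, 4 * n)
    cpu["CS"] = read_word(mem, 4 * n + 2)


def iret(cpu, mem):
    """IRET: pop IP, CS and PSW, in that order."""
    for register in ("IP", "CS", "PSW"):
        cpu[register] = pop(cpu, mem)


mem = {}
write_word(mem, 0x0C, 0x0100)   # type 3 vector: IP
write_word(mem, 0x0E, 0x2000)   # type 3 vector: CS
cpu = {"CS": 0x1000, "IP": 0x0005, "PSW": 0xF202, "SS": 0x5000, "SP": 0x2050}
int_n(cpu, mem, 3)
print(f"{cpu['CS']:04X}:{cpu['IP']:04X} SP={cpu['SP']:04X}")  # 2000:0100 SP=204A
iret(cpu, mem)
print(f"{cpu['CS']:04X}:{cpu['IP']:04X} SP={cpu['SP']:04X}")  # 1000:0005 SP=2050
```

Three words are pushed on entry and three are popped on return, so SP ends where it started.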
The INT instruction has one of the forms
INT
or
INT Type
The INT instruction is also often used as a debugging aid in cases where single stepping
provides more detail than is wanted. By inserting INT instructions at key points in a
program, called breakpoints, a programmer can use an interrupt routine to provide
messages and other information at these points. Hence the 1-byte INT instruction (type 3
interrupt) is also referred to as the breakpoint interrupt.
The INTO instruction has type 4 and causes an interrupt if and only if the OF flag
is set to 1. It is often placed just after an arithmetic instruction so that special processing
will be done if the instruction causes an overflow. Unlike a divide-by-zero fault, an
overflow condition does not cause an interrupt automatically; the interrupt must be
explicitly requested by the INTO instruction. The remaining interrupt types correspond to
interrupt instructions embedded in the program or to external interrupts.
MOV DS,AX
MOV CX,00H
MOV AH,3CH
INT 21H
JNC FURTHER
INT 21H
ISR0A PROC NEAR
MOV BX,AX ; Take file handle in BX,
MOV CX,500H ; Byte count in CX
MOV DX,1000H ; Offset of block in DX
MOV AX,1000H ; Segment value of block
MOV DS,AX ; in DS
MOV AH,40H ; DOS function 40H: write to file
INT 21H
IRET ; return
ISR0A ENDP
CODE ENDS
END START
Program 4.4
Write a program that displays "IRT2 is OK" if a hardware signal appears on the IRQ2
pin and "IRT3 is OK" if it appears on the IRQ3 pin of the PC I/O channel.
ASSUME CS:CODE, DS:DATA
DATA SEGMENT
MSG1 DB "IRT2 is OK",0AH,0DH,"$"
MSG2 DB "IRT3 is OK",0AH,0DH,"$"
DATA ENDS
CODE SEGMENT
START : MOV AX,CODE
MOV DS,AX
INT 21H
MOV DX, OFFSET ISR2 ; Set IVT for Type 0BH
MOV AX,250BH
INT 21H
ISR2 PROC FAR
MOV AX,DATA
MOV DS,AX
MOV DX,OFFSET MSG2 ; Message for IRQ3 (type 0BH)
MOV AH,09H ; DOS function 09H: display string
INT 21H
IRET
ISR2 ENDP
CODE ENDS
END START
4.5 MACROS
4.5.1 Disadvantages of Procedure
1.
2. It sometimes requires more code to program the linkage than is needed to perform
the task. If this is the case, a procedure may not save memory and execution time
is considerably increased.
3. Hence a means is needed for providing the programming ease of a procedure
while avoiding the linkage. This need is fulfilled by Macros.
Macro is a segment of code that needs to be written only once but whose basic structure
can be caused to be repeated several times within a source module by placing a single
statement at the point of each reference.
A macro is unlike a procedure in that the machine instructions are repeated each time
the macro is referenced. Therefore, no memory is saved, but programming time is
conserved (no linkage is required) and some degree of modularity is achieved. The code
that is to be repeated is called the prototype code. The prototype code along with the
statements for referencing and terminating is called the macro definition.
Once a macro is defined, it can be inserted at various points in the program by using
macro calls. When a macro call is encountered by the assembler, the assembler replaces
the call with the macro code. Insertion of the macro code by the assembler for a macro
call is referred to as a macro expansion.
In order to allow the prototype code to be used in a variety of situations, the macro
definition and the prototype code can use dummy parameters, which are replaced by
the actual parameters when the macro is expanded. During a macro expansion, the first
actual parameter replaces the first dummy parameter in the prototype code, the second
actual parameter replaces the second dummy parameter, and so on.
Macro name has to begin with a letter and can contain letters, numbers and underscore
characters. Dummy parameters in the parameter list should be separated by commas.
Each dummy parameter appearing in the prototype code should be preceded by a %
character. Consider an example that shows the definition of macro for multiplying 2
word operands and storing the result which does not exceed 16 bit.
associated with more than one location. One solution to this problem would be to have
NEXT replaced by a dummy parameter for the label, but this would require the programmer
to keep track of the dummy parameters used. A better solution is the use of local
labels.
Local labels are special labels whose suffixes get incremented each time
the macro is called. These suffixes are two-digit numbers that are incremented by one,
starting from zero. Labels are declared local by writing LOCAL followed by the list of
local labels at the end of the first statement of the macro definition.
Consider a macro that makes use of a local label.
%*DEFINE(ABSOL(OPER)) LOCAL NEXT
( CMP %OPER,0
JGE %NEXT
NEG %OPER
%NEXT:NOP
)
Calling this macro twice, as %ABSOL(VAR) and %ABSOL(BX), results in the
following code.
CMP VAR,0
JGE NEXT00
NEG VAR
NEXT00: NOP
.
CMP BX,0
JGE NEXT01
NEG BX
NEXT01: NOP
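The combined effect of dummy parameters and local labels can be imitated with plain text substitution. The Python sketch below is not the macro processor itself: `make_expander` is a hypothetical helper, it handles a single dummy parameter, and it simply appends a two-digit call counter to each local label.

```python
import itertools


def make_expander(body, parameter, local_labels):
    """Return a function that expands the macro body for one actual parameter."""
    counter = itertools.count()          # 0 for the first call, 1 for the second, ...

    def expand(actual):
        suffix = f"{next(counter):02d}"
        text = body.replace("%" + parameter, actual)
        for label in local_labels:
            # Each expansion gets its own copy of the label, e.g. NEXT00, NEXT01.
            text = text.replace("%" + label, label + suffix)
        return text

    return expand


ABSOL_BODY = "CMP %OPER,0\nJGE %NEXT\nNEG %OPER\n%NEXT: NOP"

absol = make_expander(ABSOL_BODY, "OPER", ["NEXT"])
print(absol("VAR"))
print(absol("BX"))
```

The first call produces the label NEXT00 and the second NEXT01, so the two expansions never define the same label twice.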
%*DEFINE(DIFSOR(OPR1,OPR2,RESULT))
( PUSH DX
PUSH AX
%DIF(%OPR1,%OPR2)
IMUL AX
MOV %RESULT,AX
POP AX
POP DX
)
%*DEFINE(DIF(X,Y))
( MOV AX,%X
SUB AX,%Y
)
.
.
%DIFSOR(VAR1,VAR2,ERROR)
.
.
This results in the following code:
PUSH DX
PUSH AX
MOV AX,VAR1
SUB AX,VAR2
IMUL AX
MOV ERROR,AX
POP AX
POP DX
Program 4.5
A program using a macro to save the contents of the general-purpose registers (GPRs) on
the stack.
ASSUME CS:CODE, DS:DATA
DATA SEGMENT
SAVEREG MACRO
PUSH AX
PUSH BX
PUSH CX
PUSH DX
ENDM
DATA ENDS
CODE SEGMENT
START : MOV AX,DATA
MOV DS,AX
MOV AX,1234H
MOV BX,2345H
MOV CX,3456H
MOV DX,4567H
SAVEREG
MOV AH,4CH
INT 21H
CODE ENDS
END START