Module 2
MODULE II
ASSEMBLERS
SYLLABUS:
Basic Functions of Assembler. Assembler output format – Header, Text and End
Records. Assembler data structures, Two pass assembler algorithm, Hand assembly
of SIC/XE program, Machine dependent assembler features.
Consider the following assembly language program for SIC. The program contains
a main routine that calls the subroutine RDREC, which reads records from an input device
(device code F1), and the subroutine WRREC, which copies them to an output device (device code 05).
The main routine calls subroutines:
• RDREC – To read a record into a buffer.
• WRREC – To write the record from the buffer to the output device.
CS303 System Software Module 2
At the end of the file, the program writes EOF on the output device. The end of each record
is marked with a null character (hexadecimal 00).
The line numbers are for reference only. Indexed addressing is indicated by adding
the modifier "X" following the operand. Lines beginning with "." contain comments only.
• Header record
Col. 1 H
Col. 2-7 Program name
Col. 8-13 Starting address of object program (hex)
Col. 14-19 Length of object program in bytes (hex)
• Text record
Col. 1 T
Col. 2-7 Starting address for object code in this record (hex)
Col. 8-9 Length of object code in this record in bytes (hex)
Col. 10-69 Object code, in hex (2 columns per byte of object code)
• End record
Col. 1 E
Col. 2-7 Address of first executable instruction in object program (hex)
(The "^" character is used only to separate the fields; it is not part of the records.)
The object code occupies two columns per byte. Each machine instruction is 3 bytes long,
so it occupies 6 columns. The first Text record holds 10 machine instructions
of 3 bytes each, a total of 30 bytes (60 columns). The length field therefore contains 1E,
which is 30 written in hexadecimal (1E is circled in the example given).
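The record layouts and the length arithmetic above can be sketched in Python as follows. This is only an illustration: the function names, the program name COPY, the sample length, and the placeholder instruction 141033 repeated ten times are assumptions made for this example. The 30-byte limit per Text record assumes the standard 60 columns of object code.

```python
def header_record(name: str, start: int, length: int) -> str:
    """Col. 1 'H', Col. 2-7 name, Col. 8-13 start address, Col. 14-19 length."""
    return "H" + name[:6].ljust(6) + f"{start:06X}" + f"{length:06X}"


def text_record(start: int, object_codes: list[str]) -> str:
    """Col. 1 'T', Col. 2-7 start address, Col. 8-9 length in bytes, then the code."""
    body = "".join(object_codes)
    length = len(body) // 2          # two hex columns per byte
    if length > 30:                  # assumed limit: 60 columns of object code
        raise ValueError("object code does not fit in one Text record")
    return "T" + f"{start:06X}" + f"{length:02X}" + body


def end_record(first_instruction: int) -> str:
    """Col. 1 'E', Col. 2-7 address of the first executable instruction."""
    return "E" + f"{first_instruction:06X}"


# Ten 3-byte instructions give 30 bytes, written as 1E in the length field.
record = text_record(0x1000, ["141033"] * 10)
print(record[:9])                    # T0010001E
print(header_record("COPY", 0x1000, 0x1E))
print(end_record(0x1000))
```

The length field is derived from the object code itself rather than passed in, so it cannot disagree with the number of columns actually written.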
Forward reference: A forward reference is a reference to a label that is defined later in the program.
In the above example, the STL instruction at location 1000 stores the contents of the linkage
register in RETADR. But while this instruction is being processed, the address of
RETADR is not yet known, because the symbol is defined later, at location 1033.
To generate the object code for the instruction at 1000 we need the opcode for STL
and the address of the symbol RETADR. But the address of RETADR is not
available until location 1033 is reached. This use of RETADR before it is defined is called
forward referencing.
So generating the object code in a single scan of the program becomes
difficult. For this reason, assemblers are usually designed with two passes. A two-pass assembler
resolves forward references with the help of a symbol table (SYMTAB) and then converts the
program into object code.
Labels are entered into the symbol table, along with their assigned addresses, during
Pass 1. During Pass 2, symbols used as operands are looked up in the symbol table to
obtain the addresses to be inserted in the assembled instructions. SYMTAB is usually
organized as a hash table for efficiency of insertion and retrieval. A sample SYMTAB is
shown below.
(Both Pass 1 and Pass 2 need to read the source program. However, Pass 1 usually writes an
intermediate file that contains each source statement together with its
assigned address, error indicators, etc., and this file is used as the input to Pass 2. This
working copy of the source program can also retain the results of operations
performed during Pass 1 (such as scanning the operand field for symbols and
addressing flags), so that they need not be repeated during Pass 2.)
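A minimal Python sketch of Pass 1, under simplifying assumptions, is given here. It keeps a location counter, enters labels into SYMTAB (a Python dict, which is a hash table), and builds the intermediate records in memory instead of writing a file. The tuple layout, the function names, and the tiny sample program are assumptions made for this illustration. It assumes plain SIC, where every machine instruction is 3 bytes.

```python
def statement_size(opcode: str, operand: str) -> int:
    """Number of bytes a statement occupies (SIC: every instruction is 3 bytes)."""
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        body = operand[2:-1]                      # text between the quotes
        return len(body) if operand[0] == "C" else len(body) // 2
    return 3                                      # WORD or a machine instruction


def pass1(source):
    """source: list of (label, opcode, operand) tuples."""
    symtab, intermediate = {}, []
    locctr = start = 0
    for label, opcode, operand in source:
        errors = []
        if opcode == "START":
            locctr = start = int(operand, 16)     # starting address is given in hex
        elif opcode != "END" and label:
            if label in symtab:
                errors.append("duplicate symbol")
            else:
                symtab[label] = locctr            # label gets the current address
        intermediate.append((locctr, label, opcode, operand, errors))
        if opcode not in ("START", "END"):
            locctr += statement_size(opcode, operand)
    return symtab, intermediate, locctr - start   # last value is the program length


program = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),                   # forward reference to RETADR
    ("", "LDA", "THREE"),
    ("", "RSUB", ""),
    ("THREE", "WORD", "3"),
    ("RETADR", "RESW", "1"),
    ("", "END", "FIRST"),
]
symtab, intermediate, length = pass1(program)
print({name: f"{addr:04X}" for name, addr in symtab.items()})
```

By the end of Pass 1 the forward reference to RETADR is no longer a problem: its address is in SYMTAB before Pass 2 assembles the STL instruction.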
Then the first Text record is initialized. Comment lines are ignored. For every other line, OPTAB is
searched to find the machine code for the mnemonic in the opcode field. If there is a symbol in the operand field,
SYMTAB is searched for its address, which is combined with the opcode to form the object
code. If the symbol is not found in SYMTAB, zero is stored as the operand address and
an error flag is set to mark the symbol as undefined. If there is no symbol in the operand field at all, zero is also stored as the
operand address. The object code for the instruction is then assembled.
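The steps above can be sketched in Python for a single instruction. This assumes the plain SIC instruction format (8-bit opcode, 1-bit index flag, 15-bit address). The small OPTAB, the function name, and the sample symbol values are assumptions for this illustration.

```python
# A few SIC mnemonics and their machine opcodes (a small subset, for illustration).
OPTAB = {"LDA": 0x00, "STL": 0x14, "JSUB": 0x48, "RSUB": 0x4C}


def assemble_instruction(opcode: str, operand: str, symtab: dict):
    """Return (object code as 6 hex digits, list of error messages)."""
    errors = []
    address = 0                          # default when there is no operand symbol
    indexed = False
    if operand:
        if operand.endswith(",X"):       # indexed addressing sets the x bit
            indexed = True
            operand = operand[:-2]
        if operand in symtab:
            address = symtab[operand]
        else:                            # undefined symbol: keep 0 and flag the error
            errors.append(f"undefined symbol {operand}")
    word = (OPTAB[opcode] << 16) | (0x8000 if indexed else 0) | (address & 0x7FFF)
    return f"{word:06X}", errors


symtab = {"RETADR": 0x1033, "BUFFER": 0x1039}
print(assemble_instruction("STL", "RETADR", symtab))    # ('141033', [])
print(assemble_instruction("LDA", "BUFFER,X", symtab))  # ('009039', [])
print(assemble_instruction("LDA", "MISSING", symtab))
print(assemble_instruction("RSUB", "", symtab))         # ('4C0000', [])
```

Returning the error list alongside the object code lets the caller still emit an instruction (with address 0) while reporting the undefined symbol in the listing.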
If the opcode is BYTE or WORD, the constant is converted to its
equivalent object code (for example, the character constant EOF is stored as its
hexadecimal equivalent 454F46). If the object code does not fit into the current Text record, that record
is written to the object program and a new Text record is started for the remaining object code.
Once the whole program has been assembled and the END directive is
encountered, the last Text record and then the End record are written.
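The conversion of BYTE and WORD constants can be illustrated with a short Python sketch. The function name is an assumption for this example, and only the C'...' and X'...' forms of BYTE and decimal WORD values are handled.

```python
def constant_object_code(opcode: str, operand: str) -> str:
    """Convert a BYTE or WORD operand to its object code as a hex string."""
    if opcode == "WORD":
        # One 3-byte word; masking gives the 24-bit two's complement form.
        return f"{int(operand) & 0xFFFFFF:06X}"
    if opcode == "BYTE":
        kind, body = operand[0], operand[2:-1]    # e.g. C and EOF from C'EOF'
        if kind == "C":                           # characters become ASCII codes
            return body.encode("ascii").hex().upper()
        if kind == "X":                           # already written in hex
            return body.upper()
    raise ValueError(f"cannot convert {opcode} {operand}")


print(constant_object_code("BYTE", "C'EOF'"))     # 454F46
print(constant_object_code("BYTE", "X'F1'"))      # F1
print(constant_object_code("WORD", "3"))          # 000003
```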
Program Relocation
Sometimes it is necessary to load and run several programs at the same time. The
system must be able to load these programs wherever there is room in memory.
Therefore the exact starting address is not known until load time.
In an absolute program, the starting address at which the program must be loaded is
specified in the program itself using the START directive. So the addresses of all
instructions and labels are known at assembly time. This is called absolute addressing.
Consider an example: an instruction such as LDA THREE in a program assembled to start at location 1000.
This statement loads register A with the value stored at location
102D (the address of THREE). Suppose we need to load and execute the program at
location 3000 instead of location 1000. Once the program is loaded at location 3000,
address 102D no longer contains the value that must be loaded into
register A. The addresses of all symbols change by the
displacement of the program (THREE would now be at 302D). Hence the address portion
of the instruction must be changed before we can load and execute the program at location 3000.
Since the assembler does not know the actual location at which the program will be loaded, it
cannot make the necessary changes to the addresses used in the program. However, the
assembler can identify for the loader those parts of the program that need
modification. An object program that contains the information necessary to perform this kind of
modification is called a relocatable program.
The above diagram shows the concept of relocation. Initially the program is loaded at
location 0000. The JSUB instruction is loaded at location 0006, and the address field of this
instruction contains 01036, which is the address of the instruction labeled RDREC. The
second figure shows the program loaded at location 5000: the address of RDREC is now 6036,
so the address field of the JSUB instruction must be changed to 06036. Likewise, the third figure shows
that if the program is relocated to location 7420, the JSUB instruction must be
changed to 4B108456, which corresponds to the new address of RDREC (8456).
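The arithmetic behind these figures can be checked with a few lines of Python. This sketch assumes that JSUB is a 4-byte extended-format (format 4) SIC/XE instruction whose low-order 20 bits are the address field; the function name is an assumption for this example, and 4B101036 is the instruction value implied by the address field 01036 above.

```python
def relocate_format4(instruction_hex: str, load_address: int) -> str:
    """Add the load address to the 20-bit address field of a 4-byte instruction."""
    word = int(instruction_hex, 16)
    address = word & 0xFFFFF                      # low 20 bits: address field
    new_address = (address + load_address) & 0xFFFFF
    # Keep the opcode and flag bits, replace only the address field.
    return f"{(word & ~0xFFFFF) | new_address:08X}"


print(relocate_format4("4B101036", 0x0000))       # 4B101036
print(relocate_format4("4B101036", 0x5000))       # 4B106036
print(relocate_format4("4B101036", 0x7420))       # 4B108456
```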
The only parts of the program that require modification at load time are those that
specify direct addresses. The rest of the instructions need not be modified. The instructions
that do not require modification are those whose operand is not a memory address (immediate
addressing) and those that use PC-relative or base-relative addressing.
The loader cannot distinguish addresses from constants in the
object program. So the assembler must record information that tells the loader which parts
of the object program need to be modified. Modification records are used for this
purpose.
A Modification record is a type of record added to the object program. One
Modification record is created for each address to be modified. The assembler produces a
Modification record to store the starting location and the length of the address field to be
modified.
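On the assembler side, producing such a record can be sketched in Python as below. The layout used here (the letter M, a 6-digit hex starting location, and a 2-digit hex length in half-bytes) is read off the record M00000705 in the object program. The function name is an assumption, and the sketch assumes a 4-byte extended-format instruction, whose 20-bit address field begins one byte after the start of the instruction.

```python
def modification_record(instruction_location: int, half_bytes: int = 5) -> str:
    """Build a Modification record for the address field of a 4-byte instruction."""
    field_start = instruction_location + 1        # address field starts in byte 2
    return "M" + f"{field_start:06X}" + f"{half_bytes:02X}"


# The JSUB instruction at location 0006 has its address field starting at 0007.
print(modification_record(0x0006))                # M00000705
```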
The lines at the end of the object program that start with M are the
Modification records for the instructions that must change if relocation occurs.
For example, M00000705 says that the address field starting at location 000007 must be
modified and that the field is 5 half-bytes long.
The remaining Modification records are read in the same way.
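How a loader might use such a record can be sketched in Python. This is an illustration under assumptions: the program image is held in a bytearray, the function name is hypothetical, and a field with an odd number of half-bytes is taken to begin in the middle of the first byte, so the high-order half of that byte is left untouched.

```python
def apply_modification(memory: bytearray, record: str, load_address: int) -> None:
    """Add load_address to the address field described by one Modification record."""
    start = int(record[1:7], 16)                  # starting location of the field
    half_bytes = int(record[7:9], 16)             # field length in half-bytes
    nbytes = (half_bytes + 1) // 2                # bytes that contain the field
    value = int.from_bytes(memory[start:start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1            # selects only the address field
    new_value = (value & ~mask) | ((value + load_address) & mask)
    memory[start:start + nbytes] = new_value.to_bytes(nbytes, "big")


# Program image with the JSUB instruction 4B101036 at location 0006.
image = bytearray(16)
image[6:10] = bytes.fromhex("4B101036")
apply_modification(image, "M00000705", 0x5000)
print(image[6:10].hex().upper())                  # 4B106036
```

Only the 5 half-bytes named by the record are changed; the opcode byte and the flag bits in the first half of byte 0007 stay as they were.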