Assembler
Assembly Language
The basic unit of an assembly language program is a line of code.
The specific language is defined by a set of rules that specify the symbols
that can be used and how they may be combined to form a line of code.
A. Symbolic Address
A symbolic address consists of one, two, or three alphanumeric characters,
but not more than three.
The first character must be a letter; the next two may be letters or
numerals. The symbol can be chosen arbitrarily by the programmer.
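The naming rule above can be checked mechanically. The following is a minimal sketch, not part of any particular assembler; the helper name `is_valid_symbolic_address` is hypothetical, and accepting lowercase letters is an assumption.

```python
import re

# One letter followed by at most two letters or numerals.
# Accepting lowercase letters is an assumption of this sketch.
_SYMBOL_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{0,2}")


def is_valid_symbolic_address(symbol: str) -> bool:
    """Return True if `symbol` obeys the symbolic-address rule."""
    return _SYMBOL_RE.fullmatch(symbol) is not None
```

For instance, `PL3` and `A` pass the check, while `3PL` (starts with a numeral) and `ABCD` (four characters) do not.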
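A memory-reference instruction (MRI) occupies two or three symbols separated
by spaces. The first must be a three-letter symbol defining an MRI operation.
The second is a symbolic address.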
The third symbol, which may or may not be present, is the letter I. If I is
missing, the line denotes a direct address instruction. The presence of the
symbol I denotes an indirect address instruction.
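As a rough illustration of the rule above, the sketch below splits the instruction field of a memory-reference instruction into its symbols. The function name `parse_mri_field` is hypothetical, and the sketch assumes that the symbols are separated by spaces and that the second symbol is the address symbol.

```python
def parse_mri_field(field: str):
    """Split an MRI instruction field into (operation, address, indirect).

    Assumes the symbols are separated by spaces and that the second
    symbol is the address symbol (assumptions of this sketch).
    """
    symbols = field.split()
    if len(symbols) not in (2, 3):
        raise ValueError(f"expected two or three symbols, got {len(symbols)}")
    if len(symbols) == 3 and symbols[2] != "I":
        raise ValueError(f"third symbol must be I, got {symbols[2]!r}")
    # The letter I, when present, marks an indirect address instruction.
    return symbols[0], symbols[1], len(symbols) == 3
```

With this helper, `parse_mri_field("ADD OPR")` yields a direct address instruction and `parse_mri_field("ADD PTR I")` an indirect one; the symbols `OPR` and `PTR` are made up for the example.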
→Pseudo Instruction
The ORG (origin) pseudo instruction informs the assembler that the
instruction or operand in the following line is to be placed in a memory
location specified by the number next to ORG.
It is possible to use ORG more than once in a program to specify more than
one segment of memory.
The END symbol is placed at the end of the program to inform the
assembler that the program is terminated.
The other two pseudo instructions, DEC and HEX, specify the radix of the
operand and tell the assembler how to convert the listed number to a binary
number.
C. Comments
The third field in a program is reserved for comments.
A line of code may or may not have a comment, but if it has, it must be
preceded by a slash for the assembler to recognize the beginning of a
comment field.
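The three fields of a line of code can be separated with a few string operations. This is a minimal sketch under two assumptions: a label is terminated by a comma, and the first slash on the line starts the comment. The function name `split_line` is hypothetical.

```python
def split_line(line: str):
    """Split a line of code into (label, instruction, comment).

    Assumes a label ends with a comma and that the first slash starts
    the comment field. `label` is None when the line has no label.
    """
    code, _, comment = line.partition("/")
    label = None
    if "," in code:
        label, _, code = code.partition(",")
        label = label.strip()
    return label, code.strip(), comment.strip()
```

A line such as `MIN, DEC 83 / minuend` would split into the label `MIN`, the instruction `DEC 83`, and the comment `minuend`.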
EXAMPLE - The first line has the pseudo instruction ORG to define the origin of
the program at hexadecimal memory location 100. The next six lines define machine
instructions, and the last four have pseudo instructions. Three symbolic
addresses have been used and each is listed in column 1 as a label and in
column 2 as an address of a memory-reference instruction. Three of the
pseudo instructions specify operands, and the last one signifies the END of the
program.
ASSEMBLER
Translation to Binary
The translation of the symbolic program into binary is done by a special
program called an assembler.
Starting from the first line, we encounter an ORG pseudo instruction. This
tells us to start the binary program from hexadecimal location 100.
The symbolic name of the operation is LDA. Checking Table 6-1 we find
that the first hexadecimal digit of the instruction should be 2.
The binary value of the address part must be obtained from the address
symbol SUB.
NOTE- We scan the label column and find this symbol in line 9. To determine
its hexadecimal value we note that line 2 contains an instruction for location 100
and every other line specifies a machine instruction or an operand for
sequential memory locations. Counting lines, we find that label SUB in line 9
corresponds to memory location 107. So the hexadecimal address of the
instruction LDA must be 107. When the two parts of the instruction are
assembled, we obtain the hexadecimal code 2107.
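The arithmetic in the note can be reproduced in a few lines. This sketch assumes the instruction word is 16 bits with the operation digit in the top hexadecimal digit and a 12-bit address below it; the constant and function names are hypothetical.

```python
ORIGIN = 0x100           # set by ORG 100 (hexadecimal)
FIRST_CODE_LINE = 2      # line 2 holds the word for location 100
LDA_OPCODE_DIGIT = 0x2   # first hexadecimal digit for LDA, per Table 6-1


def location_of_line(line_number: int) -> int:
    """Memory location of a line, counting one word per line after ORG."""
    return ORIGIN + (line_number - FIRST_CODE_LINE)


def assemble_direct(opcode_digit: int, address: int) -> int:
    """Combine the operation digit with a 12-bit address (assumed layout)."""
    return (opcode_digit << 12) | (address & 0xFFF)


sub_address = location_of_line(9)                     # label SUB is in line 9
instruction = assemble_direct(LDA_OPCODE_DIGIT, sub_address)
print(f"{sub_address:03X} {instruction:04X}")         # prints "107 2107"
```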
Two lines in the symbolic program specify decimal operands with the
pseudo instruction DEC. A third specifies a zero by means of a HEX pseudo
instruction (DEC could be used as well). Decimal 83 is converted to binary
and placed in location 106 in its hexadecimal equivalent. Decimal -23 is a
negative number and must be converted into binary in signed-2's
complement form.
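The conversion of a DEC operand into a 16-bit word can be sketched as follows. The function name `dec_to_word` is hypothetical; masking with 0xFFFF is one common way to obtain the signed-2's complement bit pattern in Python, not necessarily how a real assembler does it.

```python
def dec_to_word(value: int) -> int:
    """Return the 16-bit signed-2's complement pattern of `value`."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError(f"{value} does not fit in a 16-bit signed word")
    # Masking a negative Python integer yields its 2's complement pattern.
    return value & 0xFFFF


print(f"{dec_to_word(83):04X}")    # prints "0053"
print(f"{dec_to_word(-23):04X}")   # prints "FFE9"
```

Decimal 23 is hexadecimal 0017; complementing gives FFE8 and adding 1 gives FFE9, which agrees with the masked result.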
The location assignment will define the address value of labels and
facilitate the translation process during the second scan.
When the first scan is completed, we associate with each label its
location number and form a table that defines the hexadecimal value of
each symbolic address.
During the second scan of the symbolic program we refer to the address
symbol table to determine the address value of a memory-reference
instruction.
For example, the line of code LDA SUB is translated during the second scan
by getting the hexadecimal value of LDA from Table 6-1 and the
hexadecimal value of SUB from the address-symbol table listed above. We
then assemble the two parts into a four-digit hexadecimal instruction.
The Assembler
An assembler is a program that accepts a symbolic language program and
produces its binary machine language equivalent.
The input symbolic program is called the source program and the resulting
binary program is called the object program.
LINE OF CODE
Two characters can be stored in each word since a memory word has a
capacity of 16 bits.
Operation and address symbols are terminated with a space and the end
of the line is recognized by the CR code.
The label PL3 occupies two words and is terminated by the code for
comma (2C).
The instruction field in the line of code may have one or more symbols.
Each symbol is terminated by the code for space (20) except for the last
symbol, which is terminated by the code of carriage return (0D).
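How characters end up two to a word can be illustrated with a short sketch. It assumes 8-bit ASCII character codes with the first character of each pair in the high-order byte; padding an odd-length string with a zero byte is an assumption made only so the function is total. The name `pack_chars` is hypothetical.

```python
def pack_chars(text: str) -> list:
    """Pack 8-bit character codes two per 16-bit word.

    The first character of each pair goes in the high-order byte.
    Odd-length input is padded with a zero byte (an assumption).
    """
    codes = [ord(ch) & 0xFF for ch in text]
    if len(codes) % 2:
        codes.append(0x00)
    return [(codes[i] << 8) | codes[i + 1] for i in range(0, len(codes), 2)]


print([f"{word:04X}" for word in pack_chars("PL3,")])  # prints ['504C', '332C']
```

The label PL3 followed by its comma terminator therefore fills exactly two words, with the comma code 2C in the low-order byte of the second word.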
If the line of code has a comment, the assembler recognizes it by the code
for a slash (2F).
The assembler neglects all characters in the comment field and keeps
checking for a CR code. When this code is encountered, it replaces the
space code after the last symbol in the line of code.
FIRST PASS
A two-pass assembler scans the entire symbolic program twice.
During the first pass, it generates a table that correlates all user-defined
address symbols with their binary equivalent value.
To keep track of locations, the assembler uses a location counter (LC).
The content of LC stores the value of the memory location assigned to the
instruction or operand presently being processed.
The ORG pseudo instruction initializes the location counter to the value of
the first location. Since instructions are stored in sequential locations,
the content of LC is incremented by 1 after processing each line of code.
To avoid ambiguity in case ORG is missing, the assembler sets the location
counter to 0 initially.
1. LC is initially set to 0. A line of symbolic code is analyzed to determine if it
has a label (by the presence of a comma).
2. If the line of code has no label, the assembler checks the symbol in the
instruction field.
4. If the line has an END pseudo instruction, the assembler terminates the first
pass and goes to the second pass.
5. Note that a line with ORG or END should not have a label.
6. If the line of code contains a label, it is stored in the address symbol table
together with its binary equivalent number specified by the content of LC.
Nothing is stored in the table if no label is encountered.
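The first pass can be sketched as a single loop over the source lines. This is an illustrative sketch rather than a definitive implementation: the function name `first_pass` is hypothetical, the ORG operand is assumed to be hexadecimal, and `SAMPLE_PROGRAM` is a reconstruction consistent with the example described earlier (the labels MIN and DIF and the mnemonics other than LDA are assumptions).

```python
# Reconstructed sample program; only ORG 100, LDA SUB, the label SUB,
# DEC 83, DEC -23 and a HEX zero come from the text. The rest is assumed.
SAMPLE_PROGRAM = [
    "ORG 100", "LDA SUB", "CMA", "INC", "ADD MIN", "STA DIF", "HLT",
    "MIN, DEC 83", "SUB, DEC -23", "DIF, HEX 0", "END",
]


def first_pass(lines):
    """Build the address symbol table: label -> value of LC."""
    symbol_table = {}
    lc = 0                                   # LC is initially set to 0
    for line in lines:
        code = line.split("/", 1)[0]         # drop the comment field
        if "," in code:                      # a comma marks a label
            label = code.partition(",")[0].strip()
            symbol_table[label] = lc
            lc += 1
            continue
        symbols = code.split()
        if not symbols:
            continue                         # blank line (assumed harmless)
        if symbols[0] == "ORG":
            lc = int(symbols[1], 16)         # ORG operand assumed hexadecimal
        elif symbols[0] == "END":
            break                            # go on to the second pass
        else:
            lc += 1                          # one word per line of code
    return symbol_table


print({k: f"{v:03X}" for k, v in first_pass(SAMPLE_PROGRAM).items()})
# prints {'MIN': '106', 'SUB': '107', 'DIF': '108'}
```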
SECOND PASS
Machine instructions are translated during the second pass by means of
table-lookup procedures.
A table-lookup procedure is a search of table entries to determine whether
a specific item matches one of the items stored in the table.
The assembler uses four tables. Any symbol that is encountered in the
program must be available as an entry in one of these tables; otherwise, the
symbol cannot be interpreted.
1. Pseudo instruction table.
2. MRI table.
3. Non-MRI table.
4. Address symbol table.
The entries of the pseudo instruction table are the four symbols ORG, END,
DEC, and HEX. Each entry refers the assembler to a subroutine that
processes the pseudo instruction when encountered in the program.
The non-MRI table contains the symbols for the 18 register-reference and
input-output instructions and their 16-bit binary code equivalent.
The address symbol table is generated during the first pass of the
assembly process.
The assembler searches these tables to find the symbol that it is currently
processing in order to determine its binary value.
A.
2. Labels are neglected during the second pass, so the assembler goes
immediately to the instruction field and proceeds to check the first
symbol encountered.
3. It first checks the pseudo instruction table. A match with ORG sends the
assembler to a subroutine that sets LC to an initial value. A match with
END terminates the translation process.
4. An operand pseudo instruction causes a conversion of the operand into
binary. This operand is placed in the memory location specified by the
content of LC.
B.
2. If the symbol is not found in this table, the assembler refers to the non-MRI
table.
4. The assembler stores the 16-bit instruction code into the memory word
specified by LC. The location counter is incremented and a new line
analyzed.
C.
1. When a symbol is found in the MRI table, the assembler extracts its
equivalent 3-bit code and inserts it in bits 2 through 4 of a word.
4. The three parts of the binary instruction code are assembled and then
stored in the memory location specified by the content of LC.
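The second-pass steps above can be sketched as follows. Several things here are assumptions of the sketch and not stated in the text: the function name `second_pass`, the opcode values other than LDA = 2 (taken from the commonly documented instruction set of this machine), the small non-MRI subset, the hexadecimal ORG operand, and placing the indirect bit in the most significant bit of the word.

```python
# Assumed table contents; only LDA = 2 is confirmed by the text.
MRI_TABLE = {"AND": 0, "ADD": 1, "LDA": 2, "STA": 3, "BUN": 4, "BSA": 5, "ISZ": 6}
NON_MRI_TABLE = {"CLA": 0x7800, "CMA": 0x7200, "INC": 0x7020, "HLT": 0x7001}  # subset


def second_pass(lines, symbol_table):
    """Translate source lines into a dict: memory location -> 16-bit word."""
    memory = {}
    lc = 0
    for number, line in enumerate(lines, start=1):
        code = line.split("/", 1)[0]          # drop the comment field
        if "," in code:
            code = code.partition(",")[2]     # labels are neglected
        symbols = code.split()
        if not symbols:
            continue
        op = symbols[0]
        # 1. Pseudo instruction table.
        if op == "ORG":
            lc = int(symbols[1], 16)
            continue
        if op == "END":
            break
        if op == "DEC":
            word = int(symbols[1], 10) & 0xFFFF
        elif op == "HEX":
            word = int(symbols[1], 16) & 0xFFFF
        # 2. MRI table: operation code, address, and indirect bit.
        elif op in MRI_TABLE:
            if symbols[1] not in symbol_table:
                raise ValueError(f"line {number}: undefined address {symbols[1]!r}")
            indirect = len(symbols) == 3 and symbols[2] == "I"
            word = (int(indirect) << 15) | (MRI_TABLE[op] << 12) \
                | (symbol_table[symbols[1]] & 0xFFF)
        # 3. Non-MRI table: the entry is the complete 16-bit code.
        elif op in NON_MRI_TABLE:
            word = NON_MRI_TABLE[op]
        else:
            raise ValueError(f"line {number}: invalid symbol {op!r}")
        memory[lc] = word
        lc += 1
    return memory
```

Given the line `LDA SUB` at location 100 and an address symbol table that maps SUB to 107, this sketch stores the word 2107 at location 100, matching the hand translation worked out earlier.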
D.
1. One task of an assembler is to check for possible errors in the symbolic
program. This is called error diagnostics.
2. The assembler cannot translate an invalid symbol, one that appears in none
of its tables, because it does not know the symbol's binary equivalent value.
In such a case, the assembler prints an error message to inform the programmer
that the symbolic program has an error at a specific line of code.
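One simple form of error diagnostics is to report every line whose operation symbol is absent from the tables. The sketch below is a hedged illustration: the function name `find_invalid_symbols` and the message format are hypothetical, and `known_symbols` stands in for the union of the pseudo instruction, MRI, and non-MRI tables.

```python
def find_invalid_symbols(lines, known_symbols):
    """Return one error message per line whose operation symbol is unknown."""
    errors = []
    for number, line in enumerate(lines, start=1):
        code = line.split("/", 1)[0]          # drop the comment field
        if "," in code:
            code = code.partition(",")[2]     # skip the label
        symbols = code.split()
        if symbols and symbols[0] not in known_symbols:
            errors.append(f"line {number}: invalid symbol {symbols[0]!r}")
    return errors


print(find_invalid_symbols(["LDA SUB", "LDB SUB", "HLT"], {"LDA", "HLT"}))
# prints ["line 2: invalid symbol 'LDB'"]
```

Collecting messages instead of stopping at the first error lets the programmer see every offending line in a single run, which is a design choice of this sketch.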