
CS 303 System Software


CS 303 SYSTEM SOFTWARE

Module 2
Assembly Language Programming and Assemblers

SIC, SIC/XE Programming, Basic Functions of Assembler.
Assembler output format – Header, Text and End Records.
Assembler data structures, Two pass assembler algorithm,
Hand assembly of SIC/XE program

Overview

Basic Assembler functions
• Translating mnemonic operation codes to
their machine language equivalents

• Assigning machine addresses to symbolic
labels used by the programmers

Assembler Directives (Module 1)
• Assembler directives are pseudo instructions
• They provide instructions to the assembler
itself
• They are not translated into machine
operation codes
• START, END, BYTE, WORD, RESW, RESB are
some of the assembler directives

Assembler Directives (contd.)
• START : Specify the name and starting address of
the program
eg: COPY START 1000

• END : Indicate the end of the source program
and (optionally) specify the first executable
instruction in the program
eg: END FIRST

Assembler Directives (contd.)
• There are four different ways of defining storage
for data items in the SIC Assembler language.
1. BYTE : Generate character or hexadecimal
constant, occupying as many bytes as needed to
represent the constant
– eg: CHARZ BYTE C'Z'

2. WORD : Generate one-word integer constant
– eg: FIVE WORD 5

Assembler Directives (contd.)
3. RESB : Reserve the indicated number of bytes
for a data area
– eg: C1 RESB 1

4. RESW : Reserve the indicated number of
words for a data area
– eg: ALPHA RESW 1
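
The storage occupied by each of the four directives can be sketched in Python. The function name `directive_length` and the operand parsing are illustrative assumptions, not part of the SIC definition; the sketch relies on a SIC word being 3 bytes long.

```python
def directive_length(opcode: str, operand: str) -> int:
    """Return the number of bytes a SIC storage directive occupies."""
    if opcode == "WORD":
        return 3                          # one SIC word is 3 bytes
    if opcode == "RESW":
        return 3 * int(operand)           # operand counts words
    if opcode == "RESB":
        return int(operand)               # operand counts bytes
    if opcode == "BYTE":
        kind, body = operand[0], operand[2:-1]   # C'Z' -> "C", "Z"
        if kind == "C":
            return len(body)              # one byte per character
        if kind == "X":
            return (len(body) + 1) // 2   # two hex digits per byte
    raise ValueError(f"not a storage directive: {opcode} {operand}")


print(directive_length("BYTE", "C'Z'"))   # CHARZ BYTE C'Z'
print(directive_length("RESW", "1"))      # ALPHA RESW 1
```

An assembler needs exactly this length when it advances its location counter past a data definition.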

Example of a SIC Assembler language program
• The figure shows an assembler language program
for the basic version of SIC
• Line numbers are given only for reference and
are not part of the program
• Labels are defined by the
programmer
• Mnemonic instructions (opcodes) follow the labels, eg: STL,
JSUB, ….

• Indexed addressing is indicated by adding the
modifier "X" to the operand (line 160)
• Comment lines begin with "."

• The program contains a main routine that
reads records from an input device (identified
by device code F1) and copies them to an
output device (device code 05)
• Main routine calls subroutine RDREC to read a
record into buffer and another subroutine
WRREC to write the record from the buffer to
the output device

• Each subroutine must transfer the record one
character at a time because the only I/O
instructions available are RD and WD
• The buffer is necessary because the I/O rates
of the two devices, such as a disk and a slow
printing terminal, may be different.
• The end of each record is marked with a null
character (hexadecimal 00)

• If a record is longer than the length of the buffer
(4096 bytes), only the first 4096 bytes are
copied.
• (For simplicity, the program does not deal with
error recovery when a record containing
4096 bytes or more is read.)
• The end of the file to be copied is indicated by
a zero-length record

• When the end of file is detected, the program
writes EOF on the output device and
terminates by executing an RSUB instruction

• It is assumed that this program was called by
the operating system using a JSUB instruction,
so the RSUB returns control to
the operating system

Assembler Functions
• Convert symbolic operands to their equivalent
machine addresses (eg: RETADR to 1033)

• Convert mnemonic operation codes to their
machine language equivalents (eg: STL to 14)

• Convert the data constants specified in the
source program into their internal machine
representations (eg: EOF to 454F46)
• Write the object program and the assembly
listing
• Build the machine instructions in the proper
format
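
A minimal Python sketch of these conversions follows. The values STL = 14, RETADR = 1033 and EOF = 454F46 come from the examples above; the STCH opcode (54) and the BUFFER address (1039) are assumed values added only to illustrate the indexing bit, and the function names are hypothetical.

```python
OPTAB = {"STL": 0x14, "STCH": 0x54}              # mnemonic -> opcode
SYMTAB = {"RETADR": 0x1033, "BUFFER": 0x1039}    # label -> address


def assemble(mnemonic: str, operand: str) -> str:
    """Build a 24-bit SIC instruction: 8-bit opcode, x bit, 15-bit address."""
    indexed = operand.endswith(",X")
    symbol = operand[:-2] if indexed else operand
    x_bit = 0x8000 if indexed else 0
    word = (OPTAB[mnemonic] << 16) | x_bit | SYMTAB[symbol]
    return f"{word:06X}"


def char_constant(text: str) -> str:
    """Convert a character constant to the hex digits of its ASCII codes."""
    return text.encode("ascii").hex().upper()


print(assemble("STL", "RETADR"))      # 141033
print(assemble("STCH", "BUFFER,X"))   # 549039
print(char_constant("EOF"))           # 454F46
```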

• Convert symbolic operands to their equivalent
machine addresses (eg: RETADR to 1033)
– This cannot be achieved in the sequential
processing of the source program, one line at a
time
– This poses a problem : Forward Reference

Each object code is 3 bytes (24 bits) long

• Forward Reference – a reference to a label
(RETADR) that is defined later in the program
• If we attempt to translate the program line by
line, we will be unable to process this statement
because we do not know the address that will
be assigned to RETADR
• Because of this, most assemblers make two
passes over the source program

• PASS 1:
– Scan the source program for label definitions and
assign addresses (such as those in the Loc column)

• PASS 2:
– Performs the actual translation
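
The effect of a forward reference can be sketched in Python. The three-line program, its addresses and the function names below are hypothetical and chosen only to show why a single line-by-line scan fails where two scans succeed.

```python
# Hypothetical program: (loc, label, opcode, operand)
PROGRAM = [
    (0x1000, "FIRST", "STL", "RETADR"),   # RETADR is used before its definition
    (0x1003, None, "RSUB", None),
    (0x1006, "RETADR", "RESW", "1"),
]


def one_pass(program):
    """Translate line by line, learning labels only as they are reached."""
    symtab, resolved = {}, []
    for loc, label, opcode, operand in program:
        if label:
            symtab[label] = loc
        if operand is None or operand.isdigit() or operand in symtab:
            resolved.append((opcode, symtab.get(operand)))
        else:
            raise LookupError(f"forward reference to {operand}")
    return resolved


def two_pass(program):
    """Pass 1 collects every label; Pass 2 can then resolve any operand."""
    symtab = {label: loc for loc, label, _, _ in program if label}
    return [(opcode, symtab.get(operand)) for _, _, opcode, operand in program]


print(two_pass(PROGRAM)[0])   # STL now sees the address of RETADR
```

Calling `one_pass(PROGRAM)` raises `LookupError` on the first line, because RETADR has not been seen yet.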

24-bit instruction stored at address 1000:

Address  Content
1000     14 (opcode)
1001     10 (operand)
1002     33 (operand)
1003

• Finally, the assembler must write the
generated object code onto some output
device.
• The object program will later be loaded into
memory for execution.
• The simple object program format uses 3
types of records: Header, Text and End.

Assembler Output Format
Object program format

• Header record contains the program name,
starting address and length.
• Text records contain the translated (i.e.,
machine code) instructions and data of the
program, together with an indication of the
addresses where these are to be loaded.
• End record marks the end of the object
program and specifies the address in the
program where execution is to begin.
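
A minimal Python sketch of the three record types follows. The column widths (6-character name, 6 hex digits per address or length, 2 hex digits for the text record length) follow the usual textbook SIC layout and are an assumption here, since the slides show the layout only in figures; the function names are hypothetical.

```python
def header_record(name: str, start: int, length: int) -> str:
    """H, program name padded to 6 columns, start address, program length."""
    return f"H{name:<6}{start:06X}{length:06X}"


def text_record(start: int, object_codes) -> str:
    """T, start address of this record, length in bytes, then the object code."""
    body = "".join(object_codes)
    return f"T{start:06X}{len(body) // 2:02X}{body}"


def end_record(first_exec: int) -> str:
    """E, address of the first executable instruction."""
    return f"E{first_exec:06X}"


print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, ["141033"]))
print(end_record(0x1000))
```

The sketch does not split long text records; a real assembler starts a new text record when the current one is full or when reserved storage interrupts the object code.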
• To avoid confusion, we have used the term
column rather than byte to refer to positions
within object program records.
• This is not meant to imply the use of any
particular medium for the object program
• The "^" character is used to separate fields visually and is not
present in the actual object program.

• Note that there is no object code corresponding to
the addresses 1033-2038; this storage is
simply reserved by the loader for use by the
program during execution.

Passes of Assembler
• A pass is the processing of
every single statement in the source code to
perform a set of language processing
functions.
• A pass can also be defined as one complete
scan of the assembly language
program.

• Single-pass assembler: the assembler scans the
entire source program (assembly language program)
once and converts it into object code.
• Multi-pass assembler: the translation of the assembly
language program into object code requires more than
one pass.
• Breaking the entire assembly process into
passes makes the design simpler and enables better
control over the subtasks and intermediate
operations.

Functions of Two Passes of Assembler

• PASS 1 (Define symbols)
– Assign addresses to all statements in the program
– Save the values (addresses) assigned to all labels
for use in Pass 2
– Perform some processing of assembler directives.
(This includes processing that affects address
assignment, such as determining the length of
data areas defined by BYTE, RESW, etc.)

• PASS 2 (Assemble instructions and generate
object program)
– Assemble instructions (translating operation codes
and looking up addresses)
– Generate data values defined by BYTE, WORD, etc.
– Perform processing of assembler directives not
done during Pass 1
– Write the object program and assembly listing

Assembler Data Structures (internal tables
required by the assembler)

• A simple assembler uses two major internal
data structures:
– Operation Code Table (OPTAB)
– Symbol Table (SYMTAB)

• It also needs a variable, the Location Counter
(LOCCTR)

OPTAB
• OPTAB is used to look up mnemonic operation
codes and translate them to their machine language
equivalents
• It must contain at least the mnemonic operation
code and its machine language equivalent
• In more complex assemblers, this table also
contains information about instruction format
and length.

• During Pass 1 OPTAB is used to look up and
validate operation codes in the source
program
• In Pass 2, it is used to translate the operation
codes to machine language

• The SIC/XE machine has instructions
of different lengths, so
we must search OPTAB in the first pass to find
the instruction length for incrementing
LOCCTR.
• In the second pass, the information from OPTAB
tells us which instruction format to use in
assembling the instruction, and any
peculiarities of the object code instruction
(this is typical of most real assemblers)
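
A Python sketch of an OPTAB that also records the instruction format is shown here. The table holds only a small illustrative subset of SIC/XE mnemonics, and the function name `instruction_length` is hypothetical. A "+" prefix on the mnemonic selects the 4-byte extended format.

```python
# mnemonic -> (opcode, format); a small illustrative subset
OPTAB = {
    "LDA": (0x00, 3),
    "STL": (0x14, 3),
    "JSUB": (0x48, 3),
    "RSUB": (0x4C, 3),
    "CLEAR": (0xB4, 2),
}


def instruction_length(mnemonic: str) -> int:
    """Pass 1 use: validate the mnemonic and return its length in bytes."""
    extended = mnemonic.startswith("+")
    name = mnemonic[1:] if extended else mnemonic
    if name not in OPTAB:
        raise KeyError(f"invalid operation code: {mnemonic}")
    _, fmt = OPTAB[name]
    if extended and fmt != 3:
        raise ValueError(f"{name} has no extended format")
    return 4 if extended else fmt


for op in ("+JSUB", "LDA", "CLEAR"):
    print(op, instruction_length(op))
```

In Pass 2 the same entry supplies the opcode and tells the assembler which format to build.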
• OPTAB is usually organised as a hash table,
with the mnemonic operation code as the key
• The information in OPTAB is predefined when
the assembler itself is written, rather than
being loaded into the table at execution
time.
• This hash table organisation provides fast
retrieval with a minimum of searching

• OPTAB is a static table – entries are not normally
added to or deleted from it.

LOCCTR
• LOCCTR is a variable that is used to help in the
assignment of addresses

• LOCCTR is initialized to the beginning address
specified in the START statement

• After each source statement is processed, the
length of the assembled instruction or data
area to be generated is added to LOCCTR

• Thus whenever we reach a label in the source
program, the current value of LOCCTR gives
the address to be associated with that label
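
How LOCCTR advances can be sketched in Python for the basic SIC machine. The short source program and the function names are hypothetical; BYTE is left out for brevity, and every instruction is taken to be 3 bytes long.

```python
# Hypothetical source lines: (label, opcode, operand)
SOURCE = [
    ("COPY", "START", "1000"),
    ("FIRST", "STL", "RETADR"),
    (None, "LDA", "FIVE"),
    ("FIVE", "WORD", "5"),
    ("RETADR", "RESW", "1"),
    ("BUFFER", "RESB", "4096"),
    (None, "END", "FIRST"),
]


def statement_length(opcode, operand):
    """Bytes generated or reserved by one statement (BYTE omitted here)."""
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "END":
        return 0
    return 3          # WORD and every basic SIC instruction


def assign_addresses(lines):
    """Return (loc, label, opcode, operand) for every source line."""
    locctr = int(lines[0][2], 16)     # START operand is hexadecimal
    listing = [(locctr, *lines[0])]
    for label, opcode, operand in lines[1:]:
        listing.append((locctr, label, opcode, operand))
        locctr += statement_length(opcode, operand)
    return listing


for loc, label, opcode, operand in assign_addresses(SOURCE):
    print(f"{loc:04X}", label or "", opcode, operand)
```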

SYMTAB
• SYMTAB is used to store the values (addresses)
assigned to labels
• SYMTAB includes the name and value (address)
of each label in the source program, together
with flags to indicate error conditions (eg: a
symbol defined in two different places)
• The table may contain other information about
the data area or instruction labelled (eg: its
type or length)

• During Pass 1, the labels are entered into
SYMTAB as they are encountered in the source
program, along with their assigned addresses
(from LOCCTR).

• During Pass 2, symbols used as operands are
looked up in SYMTAB to obtain the addresses
to be inserted in the assembled instructions
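
The two uses of SYMTAB can be sketched in Python, where a `dict` already behaves as a hash table. The listing, its addresses and the function names are hypothetical; the duplicate RETADR line is included on purpose to trigger the error flag.

```python
# Hypothetical Pass 1 listing: (loc, label, opcode, operand)
LISTING = [
    (0x1000, "FIRST", "STL", "RETADR"),
    (0x1033, "RETADR", "RESW", "1"),
    (0x1036, "RETADR", "RESW", "1"),   # second definition of the same label
]


def build_symtab(listing):
    """Pass 1: enter each label with its address; flag duplicates."""
    symtab, errors = {}, []
    for loc, label, _opcode, _operand in listing:
        if not label:
            continue
        if label in symtab:
            errors.append(f"duplicate symbol: {label}")
        else:
            symtab[label] = loc
    return symtab, errors


def operand_address(symtab, operand):
    """Pass 2: look up a symbolic operand; report undefined symbols."""
    if operand not in symtab:
        raise KeyError(f"undefined symbol: {operand}")
    return symtab[operand]


symtab, errors = build_symtab(LISTING)
print(f"{operand_address(symtab, 'RETADR'):04X}", errors)
```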

• SYMTAB is usually organised as a hash table for
efficiency of insertion and retrieval.
• Entries are rarely deleted from this table

• Programmers often select many labels that have
similar characteristics (eg: labels that start or end with
the same characters, like LOOP1, LOOP2,
LOOPA, ... or are of the same length, like A, X, Y, Z).

• The hashing function selected
should perform well with
such non-random keys.
• Care should be taken in the
selection of the hashing function
because SYMTAB is used
throughout the assembly.
• A good option is a hash
function that
divides the entire key by a
prime table length and uses the remainder.
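
A Python sketch of such a hash function follows. The table length 101 is an assumed prime picked for illustration, and `hash_symbol` is a hypothetical name. The whole label is treated as one large integer, so labels that differ only in their last character still land in different slots.

```python
TABLE_SIZE = 101   # a prime table length (assumed for illustration)


def hash_symbol(name: str, table_size: int = TABLE_SIZE) -> int:
    """Divide the entire key by the table length and keep the remainder."""
    key = int.from_bytes(name.encode("ascii"), "big")   # whole label as a number
    return key % table_size


for label in ("LOOP1", "LOOP2", "LOOPA"):
    print(label, hash_symbol(label))
```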

Assembler Algorithm
• Conceptually, both passes of the assembler read the original
source program as input
• However, there is certain information (such as
location counter values and error flags for the
statements) that can or should be
communicated between the two passes.
• Pass 1 usually writes an intermediate file that
contains each source statement together with
its assigned address, error indicators etc.
• This file is used as input to Pass 2
• This working copy of the source
program (the intermediate file) can also be used to
retain the results of certain operations that may
be performed during Pass 1 (such as scanning
the operand field for symbols and addressing
flags), so these need not be performed again
during Pass 2
• Similarly, pointers into OPTAB and SYMTAB may
be retained for each operation code and symbol
used.
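
One possible shape for a line of the intermediate file is sketched below in Python. The field names and the tab-separated text layout are assumptions made for illustration; the slides do not prescribe a format.

```python
from dataclasses import dataclass, field


@dataclass
class IntermediateLine:
    loc: int                  # address assigned in Pass 1
    label: str
    opcode: str
    operand: str
    errors: list = field(default_factory=list)   # error indicators from Pass 1


def to_text(line: IntermediateLine) -> str:
    """Serialise one line as tab-separated fields (written by Pass 1)."""
    return "\t".join(
        [f"{line.loc:04X}", line.label, line.opcode, line.operand,
         ",".join(line.errors)]
    )


def from_text(text: str) -> IntermediateLine:
    """Parse a line back into its fields (read by Pass 2)."""
    loc, label, opcode, operand, errors = text.split("\t")
    return IntermediateLine(int(loc, 16), label, opcode, operand,
                            errors.split(",") if errors else [])


line = IntermediateLine(0x1000, "FIRST", "STL", "RETADR")
print(from_text(to_text(line)) == line)
```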
• The algorithm explains the logic flow of the two
passes of the assembler

• Apply the algorithm to the source program
(assembly language) to generate the object
program

• For simplicity, we assume that source lines are
written in a fixed format with the fields:

LABEL OPCODE OPERAND

• If one of these fields contains a character
string that represents a number, we denote its
numeric value with the prefix #
(eg: #(OPERAND))

Previous Year Questions
1. Explain program relocation with an example. (3
marks)
2. Describe the data structures used in the two
pass SIC assembler algorithm. (3 marks)
3. Give the algorithm for pass 1 of a two pass SIC
assembler. (5 marks)
4. Describe the format of the object program
generated by the two-pass SIC assembler
algorithm. (4 marks)
5. Explain the syntax of the records in the Object
Program File. (3 marks)
6. Explain the different data structures used in the
implementation of Assemblers. (3 marks)
7. Explain the two passes of the assembler
algorithm with proper example. (9 marks)
8. a) With suitable example, explain the concept of
Program Relocation. (5 marks)
b) List out the basic functions of Assemblers with
proper examples. (4 marks)
9. What is meant by a forward reference? How is it
resolved by a two-pass assembler? (3 marks)
10. Write down the format of the Modification
record. Describe each field with the help of
an example. (3 marks)
11. With the aid of an algorithm, explain the
second pass of a two-pass assembler. (6
marks)

12. Briefly describe the format of the object program
generated by the SIC assembler. (3 marks)
13. What are the uses of OPTAB and SYMTAB during
the assembling process? Specify the uses of each
during Pass 1 and Pass 2 of a two-pass assembler.
(3 marks)
14. Design an algorithm for performing the Pass 1
operations of a two-pass assembler. (5 marks)

15. Explain program relocation with examples. Is there a need to
use modification records for the given SIC/XE program
segment? Explain your answer. If yes, show the contents of
the modification record. (6 marks)

0000  COPY    START  0
      ...
0006          +JSUB  RDREC
000A          LDA    LENGTH
      ...
0033  LENGTH  RESW   1
      ...
1036  RDREC   CLEAR  X

16. Explain the format of the object program generated
by a two-pass SIC assembler, highlighting the
contents of each record type. (3 marks)
17. Explain the data structures used and their purposes
in a two-pass assembler. (3 marks)
18. Explain the concept of program relocation with an
example. (4 marks)
19. Write the algorithms for Pass 1 and Pass 2 of a two-
pass assembler. (9 marks)
