SSCD Mod 1

System Software 15CS63
System Software
Semester : VI Course Code : 15CS63
Course Title : System Software AND Compiler Design
Faculty : Niranjan Murthy C
Dept : Computer Science & engineering
Prerequisites: Basic concepts of microprocessors (10CS45)
Description: This course gives an introduction to the design and implementation of
various types of system software. A central theme of the course is the
relationship between machine architecture and system software. The
design of an assembler or an operating system is greatly influenced by
the architecture of the machine on which it runs. These influences are
emphasized and demonstrated through the discussion of actual pieces
of system software for a variety of real machines.
Outcomes
The students should be able to:
1. Define system software such as assemblers and macroprocessors.
2. Define system software such as loaders and linkers.
3. Explain lexical analysis and syntax analysis, and become familiar with source, object and
executable file structures and libraries.
4. Describe the front-end and back-end phases of a compiler and their importance.
1 GMIT, Davangere Deepak D J

MODULE 1 (10 Hours)
• Introduction to System Software
• Machine Architecture of SIC and SIC/XE
• Assemblers: basic assembler functions, machine-dependent assembler features, machine-independent assembler features, assembler design options
• Macroprocessors: basic macro processor functions
MACHINE ARCHITECTURE
System Software:
• System software consists of a variety of programs that support the operation of a computer.
• Application software focuses on an application or problem to be solved.
• System software is machine dependent. It allows the user to focus on the
application or problem to be solved without worrying about the details of how the
machine works internally.
Examples: operating system, compiler, assembler, macroprocessor, loader or linker, debugger, text
editor, database management system, etc.
Difference between System Software and Application Software

System Software                                     | Application Software
System software is machine dependent.               | Application software is not dependent on the underlying hardware.
System software focuses on the computing system.    | Application software provides a solution to a problem.
Examples: operating system, compiler, assembler     | Examples: antivirus, Microsoft Office
SIC – Simplified Instructional Computer

Simplified Instructional Computer (SIC) is a hypothetical computer that includes the hardware
features most often found on real machines. There are two versions of SIC: the
standard model (SIC) and an extended version, SIC/XE (XE stands for "extra equipment" or "extra expensive").
SIC Machine Architecture:
We discuss here the SIC machine architecture with respect to its Memory and Registers,
Data Formats, Instruction Formats, Addressing Modes, Instruction Set, Input and Output.
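Memory:
There are 2^15 bytes in the computer memory, that is, 32,768 bytes. Each memory location holds an 8-bit byte, and
any 3 consecutive bytes form a word (24 bits). A word is addressed by the location of its lowest-numbered byte.
The object code later in these notes (for example, WORD 3 assembled as 000003) stores the most significant byte first.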
Registers:
There are five registers, each 24 bits in length. Their mnemonic, number and use are given in the
following table.

Mnemonic  Number  Use
A         0       Accumulator; used for arithmetic operations
X         1       Index register; used for addressing
L         2       Linkage register; JSUB stores the return address here
PC        8       Program counter
SW        9       Status word, including the condition code (CC)
Data Formats:
Integers are stored as 24-bit binary numbers, with 2's complement representation used for negative
values. Characters are stored using their 8-bit ASCII codes. There is no floating-point hardware on the
standard version of SIC.
Instruction Formats:
All machine instructions on the standard version of SIC have the following 24-bit format:

| opcode (8 bits) | x (1 bit) | address (15 bits) |

The flag bit x is used to indicate indexed-addressing mode.
Addressing Modes:
Only two modes are supported: direct and indexed.

Mode     Indication  Target address calculation
Direct   x = 0       TA = address
Indexed  x = 1       TA = address + (X)

Parentheses indicate the contents of a register, so (X) is the contents of register X.
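
The field layout and the two addressing modes above can be sketched in a few lines of Python. This is only an illustration: the function name decode_sic and the value chosen for register X are assumptions, while the two instruction words are taken from the COPY object code shown later in these notes.

```python
def decode_sic(word, x_reg=0):
    """Split a 24-bit SIC instruction into opcode, x flag and address,
    then compute the target address (TA)."""
    if not 0 <= word < (1 << 24):
        raise ValueError("a SIC instruction is exactly 24 bits")
    opcode = (word >> 16) & 0xFF     # top 8 bits
    x_flag = (word >> 15) & 0x1      # 1 bit: indexed addressing
    address = word & 0x7FFF          # low 15 bits
    # Direct: TA = address. Indexed: TA = address + (X).
    target = address + x_reg if x_flag else address
    return {"opcode": opcode, "x": x_flag, "address": address, "ta": target}


direct = decode_sic(0x141033)             # x = 0, TA = 1033 (hex)
indexed = decode_sic(0x549039, x_reg=5)   # x = 1, TA = 1039 + 5 (hex), X = 5 assumed
print(hex(direct["ta"]), hex(indexed["ta"]))
```

Note that in the second word the hex digit 9 carries both the x bit and the top three address bits, which is why the address field reads 1039 rather than 9039.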
Instruction Set
• Load and store registers (LDA, LDX, STA, STX, etc.)
• Integer arithmetic (ADD, SUB, MUL, DIV); all involve register A and a word in memory.
• Comparison (COMP); involves register A and a word in memory and sets the condition code.
• Conditional jump (JLT, JEQ, JGT); tests the condition code.
• Subroutine linkage (JSUB, RSUB)
Input and Output
• Data is transferred one byte at a time to or from the rightmost 8 bits of register A.
• Each device has a unique 8-bit ID code.
• Test device (TD): tests whether a device is ready to send or receive a byte of data.
• Read data (RD): reads a byte from the device into register A.
• Write data (WD): writes a byte from register A to the device.
SIC/XE Machine Architecture:
Memory
• Maximum memory available on a SIC/XE system is 1 megabyte (2^20 bytes).
Registers
• SIC/XE provides the additional registers B, S, T, and F, in addition to the registers of SIC.

Mnemonic  Number  Special use
B         3       Base register
S         4       General working register
T         5       General working register
F         6       Floating-point accumulator (48 bits)
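Floating-point data type:
• There is a 48-bit floating-point data type with a 1-bit sign s, an 11-bit exponent e and a 36-bit fraction f.
The fraction is a value between 0 and 1, and the absolute value of the number is f × 2^(e−1024).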
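
As a rough illustration of the formula f × 2^(e−1024), the sketch below decodes a 48-bit pattern into an exact fraction. The field widths (1-bit sign, 11-bit exponent, 36-bit fraction) follow the usual description of SIC/XE; the function name and the sample bit patterns are assumptions made for this example.

```python
from fractions import Fraction


def decode_sicxe_float(bits):
    """Decode a 48-bit SIC/XE floating-point pattern into an exact Fraction."""
    if not 0 <= bits < (1 << 48):
        raise ValueError("a SIC/XE float is exactly 48 bits")
    if bits == 0:
        return Fraction(0)                 # all-zero pattern represents zero
    sign = bits >> 47                      # 1 bit: 0 = positive, 1 = negative
    e = (bits >> 36) & 0x7FF               # 11-bit exponent
    f = Fraction(bits & ((1 << 36) - 1), 1 << 36)   # fraction in [0, 1)
    value = f * Fraction(2) ** (e - 1024)
    return -value if sign else value


# 1.0 = 0.5 * 2^1: the fraction has only its top bit set, and e = 1025.
one = (1025 << 36) | (1 << 35)
minus_one = one | (1 << 47)
print(decode_sicxe_float(one), decode_sicxe_float(minus_one))
```

Fraction is used instead of float so that the arithmetic stays exact for a 36-bit fraction.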
Instruction Formats:
The instruction formats for the SIC/XE machine architecture are as follows.
Format 1 (1 byte): contains only the operation code (taken directly from the opcode table).
Format 2 (2 bytes): the first eight bits hold the operation code, the next four bits register 1 and the
last four bits register 2. Registers are encoded by the numbers given in the Registers
section (e.g., register T is encoded as hex 5 and F as hex 6).
Format 3 (3 bytes): the first 6 bits contain the operation code, the next 6 bits contain flags, and the last 12 bits contain
the displacement for the address of the operand. Because the operation code uses only 6 bits, the second hex
digit of the instruction is affected by the values of the first two flags (n and i). The flags, in order, are n, i, x, b, p,
and e. Their functions are explained in the next section. The last flag, e, indicates the instruction format
(0 for format 3 and 1 for format 4).
Format 4 (4 bytes): same as format 3 but with an extra 2 hex digits (8 bits), giving a 20-bit address field for addresses that
cannot be represented in 12 bits.
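
The format 2 layout (8-bit opcode, two 4-bit register numbers) can be illustrated with a short sketch. The register numbers come from the tables in these notes; the opcode values 90 (ADDR) and B8 (TIXR) are taken from the usual SIC/XE opcode table, which is not reproduced here, so treat them as assumptions. The function name is hypothetical.

```python
from typing import Optional

# Register numbers as listed in the SIC and SIC/XE register tables.
REGISTER_NUMBERS = {"A": 0, "X": 1, "L": 2, "B": 3, "S": 4,
                    "T": 5, "F": 6, "PC": 8, "SW": 9}


def encode_format2(opcode: int, r1: str, r2: Optional[str] = None) -> str:
    """Return the 2-byte object code (4 hex digits) of a format 2 instruction.
    A missing second register is encoded as 0."""
    n1 = REGISTER_NUMBERS[r1]
    n2 = REGISTER_NUMBERS[r2] if r2 is not None else 0
    return f"{opcode:02X}{n1:X}{n2:X}"


addr_code = encode_format2(0x90, "S", "A")   # ADDR S,A (opcode assumed to be 90)
tixr_code = encode_format2(0xB8, "T")        # TIXR T   (opcode assumed to be B8)
print(addr_code, tixr_code)
```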
Addressing Modes:
There are five basic addressing modes, which can also be combined:
1. Direct (x, b, and p all set to 0): the operand address is used as it is. n and i are both set to the same
value, either 0 or 1. That value is normally 1; if both are 0 in a format 3 instruction, the
remaining flag bits (b, p, and e) are treated as part of the operand address, which keeps the format
compatible with the standard SIC format.
2. Relative (either b or p equal to 1 and the other equal to 0): the displacement is
added to the current value of the B register (if b = 1) or of the PC register
(if p = 1).
3. Immediate (i = 1, n = 0): the operand value is contained in the instruction itself (i.e., in the
last 12 or 20 bits of the instruction).
4. Indirect (i = 0, n = 1): the target address points to a location that holds the address of the
operand value.
5. Indexed (x = 1): the contents of register X are added in computing the address of
the operand. Indexing can be combined with the direct and relative modes, but not with immediate or indirect addressing.
The flag bits used in the above formats have the following meanings:
e: e = 0 means format 3, e = 1 means format 4.
Bits x, b, p: used to calculate the target address (TA) using relative, direct, and indexed addressing modes.
  b = 0, p = 0: the disp field of a format 3 instruction is taken as the target address. In format 4, b and p are
  normally set to 0 and the 20-bit address field is the target address.
  x = 1: the contents of register X are added in the target address calculation.
Bits i and n: specify how the target address is used.
  i = 1, n = 0: immediate addressing, TA. The target address itself is used as the operand value; no memory reference is made.
  i = 0, n = 1: indirect addressing, ((TA)). The word at TA is fetched, and its value is taken as the address
  of the operand value.
  i = 0, n = 0 or i = 1, n = 1: simple addressing, (TA). TA is taken as the address of the operand value.
Two relative addressing modes are available for instructions assembled using format 3:
  Base relative: b = 1, p = 0, TA = (B) + disp, with 0 <= disp <= 4095
  PC relative:   b = 0, p = 1, TA = (PC) + disp, with -2048 <= disp <= 2047
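
The flag rules above can be traced with a small decoder for format 3 instructions. This is a sketch under stated assumptions: the function names are hypothetical, format 4 is not handled, and the sample words and register values follow the widely used SIC/XE COPY example rather than anything listed in these notes.

```python
def sign_extend_12(disp):
    """Interpret a 12-bit field as a signed value (used for PC-relative)."""
    return disp - 0x1000 if disp & 0x800 else disp


def target_address(word, pc, base=0, x_reg=0):
    """Decode the flags of a 24-bit format 3 instruction.
    pc is the address of the NEXT instruction. Returns (TA, mode)."""
    n = (word >> 17) & 1
    i = (word >> 16) & 1
    x = (word >> 15) & 1
    b = (word >> 14) & 1
    p = (word >> 13) & 1
    disp = word & 0xFFF
    if n == 0 and i == 0:
        ta = word & 0x7FFF            # SIC-compatible: b, p, e join the address
    elif b and p:
        raise ValueError("b and p cannot both be 1")
    elif b:
        ta = base + disp              # base relative, disp is unsigned
    elif p:
        ta = pc + sign_extend_12(disp)  # PC relative, disp is signed
    else:
        ta = disp                     # direct
    if x:
        ta += x_reg                   # indexed
    if n == 1 and i == 0:
        mode = "indirect"
    elif n == 0 and i == 1:
        mode = "immediate"
    else:
        mode = "simple"
    return ta & 0xFFFFF, mode


forward = target_address(0x17202D, pc=0x0003)                # PC relative, forward
backward = target_address(0x3F2FEC, pc=0x001A)               # PC relative, backward
based = target_address(0x57C003, pc=0x1051, base=0x33, x_reg=2)  # base + indexed
immediate = target_address(0x010003, pc=0x0023)              # operand value is 3
print(forward, backward, based, immediate)
```

For the immediate case the returned TA is itself the operand value, as the table above states.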
Instruction Set:
SIC/XE provides all of the instructions that are available on the standard version. In addition, it has:
• Instructions to load and store the new registers: LDB, STB, etc.
• Floating-point arithmetic operations: ADDF, SUBF, MULF, DIVF
• Register move instruction: RMO
• Register-to-register arithmetic operations: ADDR, SUBR, MULR, DIVR
• Supervisor call instruction: SVC
Input and Output:
There are I/O channels that can be used to perform input and output while the CPU is executing
other instructions. This allows computing and I/O to overlap, resulting in more efficient system
operation. The instructions SIO, TIO, and HIO are used to start, test and halt the operation of I/O
channels.
Example programs SIC:

Example 1: Simple data and character movement operation
        LDA     FIVE
        STA     ALPHA
        LDCH    CHARZ
        STCH    C1
ALPHA   RESW    1
FIVE    WORD    5
CHARZ   BYTE    C'Z'
C1      RESB    1
Example 2: Arithmetic operations

        LDA     ALPHA
        ADD     INCR
        SUB     ONE
        STA     BETA
        ....
        ....
        ....
ONE     WORD    1
ALPHA   RESW    1
BETA    RESW    1
INCR    RESW    1
Example 3: Looping and Indexing operation

        LDX     ZERO        ; X = 0
MOVECH  LDCH    STR1,X      ; LOAD A FROM STR1
        STCH    STR2,X      ; STORE A TO STR2
        TIX     ELEVEN      ; ADD 1 TO X, COMPARE X WITH 11
        JLT     MOVECH      ; LOOP IF X < 11
        ....
        ....
        ....
STR1    BYTE    C'HELLO WORLD'
STR2    RESB    11
ZERO    WORD    0
ELEVEN  WORD    11
Example 4: Input and Output operation

INLOOP  TD      INDEV       ; TEST INPUT DEVICE
        JEQ     INLOOP      ; LOOP UNTIL DEVICE IS READY
        RD      INDEV       ; READ ONE BYTE INTO A
        STCH    DATA        ; STORE A TO DATA
        .
        .
OUTLP   TD      OUTDEV      ; TEST OUTPUT DEVICE
        JEQ     OUTLP       ; LOOP UNTIL DEVICE IS READY
        LDCH    DATA        ; LOAD DATA INTO A
        WD      OUTDEV      ; WRITE A TO OUTPUT DEVICE
        .
        .
INDEV   BYTE    X'F5'       ; INPUT DEVICE NUMBER
OUTDEV  BYTE    X'08'       ; OUTPUT DEVICE NUMBER
DATA    RESB    1           ; ONE-BYTE VARIABLE
Example 5: To transfer two hundred bytes of data from input device to memory
        LDX     ZERO
CLOOP   TD      INDEV
        JEQ     CLOOP
        RD      INDEV
        STCH    RECORD,X
        TIX     B200
        JLT     CLOOP
        .
        .
INDEV   BYTE    X'F5'
RECORD  RESB    200
ZERO    WORD    0
B200    WORD    200

Example Programs (SIC/XE)

Example 1: Simple data and character movement operation
        LDA     #5
        STA     ALPHA
        LDA     #90
        .
        .
        .
ALPHA   RESW    1
C1      RESB    1
Example 2: Arithmetic operations

        LDS     INCR
        LDA     ALPHA
        ADDR    S,A
        SUB     #1
        STA     BETA
        ....
        ....
ALPHA   RESW    1
BETA    RESW    1
INCR    RESW    1
Example 3: Looping and Indexing operation

        LDT     #11
        LDX     #0          ; X = 0
MOVECH  LDCH    STR1,X      ; LOAD A FROM STR1
        STCH    STR2,X      ; STORE A TO STR2
        TIXR    T           ; ADD 1 TO X, COMPARE X WITH T
        JLT     MOVECH
        .
        .
STR1    BYTE    C'HELLO WORLD'
STR2    RESB    11
Assemblers - 1
A Simple Two-Pass Assembler
Main Functions
• Translate mnemonic operation codes to their machine language equivalents.
• Assign machine addresses to the symbolic labels used by the programmer.
• An assembler depends heavily on the source language it translates and the machine language it produces,
e.g., the instruction formats and addressing modes.
Basic Functions of an Assembler

• The example program (COPY) is a copy routine that reads records from a specified input device and copies
them to a specified output device. It:
– Reads a record from the input device (code F1)
– Copies the record to the output device (code 05)
– Repeats the above steps until EOF is encountered
– Then writes EOF to the output device
– Then executes RSUB to return to the caller
RDREC and WRREC
• Data transfer
– A record is a stream of bytes with a null character (hexadecimal 00) at the end.
– If a record is longer than 4096 bytes, only the first 4096 bytes are copied.
– EOF is indicated by a zero-length record (i.e., a byte stream containing only a null
character).
– Because the speeds of the input and output devices may differ, a buffer is used
to store the record temporarily.
• Subroutine call and return
– On line 10, "STL RETADR" saves the return address that JSUB has already stored
in register L.
– Otherwise, after COPY calls RDREC or WRREC (which overwrites L), it could not return to its own caller.
Assembler Directives
• Assembler directives are pseudo-instructions.
– They are not translated into machine instructions.
– They only provide instructions, directions or information to the assembler.
• Basic assembler directives:
o START : Specify the name and starting address of the program
o END : Indicate the end of the source program and (optionally) the first executable
instruction in the program
o BYTE : Generate a character or hexadecimal constant, occupying as many bytes as
needed to represent the constant
o WORD : Generate a one-word integer constant
o RESB : Reserve the indicated number of bytes for a data area
o RESW : Reserve the indicated number of words for a data area
An Assembler's Job
• Convert mnemonic operation codes to their machine language codes
• Convert symbolic operands (e.g., jump labels, variable names) to their machine addresses
• Use proper addressing modes and formats to build efficient machine instructions
• Translate data constants into internal machine representations
• Output the object program and provide other information (e.g., for the linker and loader)
Object Program Format
• Header record
Col. 1       H
Col. 2-7     Program name
Col. 8-13    Starting address of object program (hex)
Col. 14-19   Length of object program in bytes (hex)
• Text record
Col. 1       T
Col. 2-7     Starting address for object code in this record (hex)
Col. 8-9     Length of object code in this record in bytes (hex)
Col. 10-69   Object code, represented in hex (2 columns per byte)
• End record
Col. 1       E
Col. 2-7     Address of first executable instruction in object program (hex)
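
The column layout above maps directly onto fixed-width string formatting. The following sketch builds the three record types; the function names are assumptions, and the sample values are the header and first Text record of the COPY program shown in these notes. The fields are contiguous in a real record, so the sketch emits them without separating spaces.

```python
def header_record(name, start, length):
    """Col. 1 'H', col. 2-7 name (padded or cut to 6), col. 8-13 start, col. 14-19 length."""
    return f"H{name:<6.6}{start:06X}{length:06X}"


def text_record(start, object_codes):
    """Col. 1 'T', col. 2-7 start address, col. 8-9 byte count, col. 10-69 object code."""
    body = "".join(object_codes)
    nbytes = len(body) // 2              # 2 hex columns per byte
    if nbytes > 30:
        raise ValueError("columns 10-69 hold at most 30 bytes")
    return f"T{start:06X}{nbytes:02X}{body}"


def end_record(first_exec):
    """Col. 1 'E', col. 2-7 address of the first executable instruction."""
    return f"E{first_exec:06X}"


first_text = ["141033", "482039", "001036", "281030", "301015",
              "482061", "3C1003", "00102A", "0C1039", "00102D"]
print(header_record("COPY", 0x1000, 0x107A))
print(text_record(0x1000, first_text))
print(end_record(0x1000))
```

Ten 3-byte instructions give a length field of 1E (30 bytes), which is also the largest Text record the 60 object-code columns can hold.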
The Object Code for COPY

H COPY 001000 00107A
T 001000 1E 141033 482039 001036 281030 301015 482061 3C1003 00102A 0C1039 00102D
T 00101E 15 0C1036 482061 081033 4C0000 454F46 000003 000000
T 002039 1E 041030 001030 E0205D 30203F D8205D 281030 302057 549039 2C205E 38203F
T 002057 1C 101036 4C0000 F1 001000 041030 E02079 302064 509039 DC2079 2C1036
T 002073 07 382064 4C0000 05
E 001000
NOTE: There is no object code corresponding to addresses 1033-2038. This storage is simply
reserved by the loader for use by the program during execution.
Two Pass Assembler
• Pass 1
– Assign addresses to all statements in the program
– Save the values (addresses) assigned to all labels (including statement labels and variable
names) for use in Pass 2 (this deals with forward references)
– Perform some processing of assembler directives (e.g., BYTE, RESW; these can affect
address assignment)
• Pass 2
– Assemble instructions (generate opcodes and look up addresses)
– Generate data values defined by BYTE and WORD
– Perform processing of assembler directives not done in Pass 1
– Write the object program and the assembly listing
A Simple Two Pass Assembler Implementation
Algorithms and Data Structures
Three Main Data Structures

• Operation Code Table (OPTAB)
• Location Counter (LOCCTR)
• Symbol Table (SYMTAB)
OPTAB (operation code table)
• Content
– The mapping between mnemonic and machine code. It also includes the instruction
format, available addressing modes, and length information.
• Characteristic
– Static table. The content never changes.
• Implementation
– Array or hash table. Because the content never changes, the table can be organized to optimize
search speed.
• In Pass 1, OPTAB is used to look up and validate mnemonics in the source program.
• In Pass 2, OPTAB is used to translate mnemonics to machine instructions.
Location Counter (LOCCTR)

• This variable helps in the assignment of addresses.
• It is initialized to the beginning address specified in the START statement.
• After each source statement is processed, the length of the assembled instruction or data
area to be generated is added to LOCCTR.
• Thus, when we reach a label in the source program, the current value of LOCCTR gives the
address to be associated with that label.
Symbol Table (SYMTAB)

• Content
– The label name and value (address) for each label in the source program
– Type and length information (e.g., int64)
– A flag to indicate errors (e.g., a symbol defined in two places)
• Characteristic
– Dynamic table (i.e., symbols may be inserted, deleted, or searched in the table)
• Implementation
– A hash table can be used to speed up searching. Because variable names may be very similar
(e.g., LOOP1, LOOP2), the selected hash function must perform well with such non-random
keys.
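
A minimal sketch of such a table, assuming Python's built-in dict as the hash table. The class and method names are hypothetical; each entry keeps the address and an error flag for duplicate definitions, as described above.

```python
class SymbolTable:
    """SYMTAB sketch: label -> address, with a duplicate-definition flag."""

    def __init__(self):
        self._entries = {}            # dict is a hash table

    def insert(self, label, address):
        """Insert a label. Returns False and flags the entry if already defined."""
        if label in self._entries:
            self._entries[label]["duplicate"] = True
            return False
        self._entries[label] = {"address": address, "duplicate": False}
        return True

    def lookup(self, label):
        """Return the address of a label, or None if it is undefined."""
        entry = self._entries.get(label)
        return None if entry is None else entry["address"]

    def is_duplicate(self, label):
        entry = self._entries.get(label)
        return bool(entry and entry["duplicate"])


symtab = SymbolTable()
symtab.insert("LOOP1", 0x1003)
symtab.insert("LOOP2", 0x1012)
second_try = symtab.insert("LOOP1", 0x2000)   # defined twice: flagged, not overwritten
print(symtab.lookup("LOOP1"), second_try, symtab.is_duplicate("LOOP1"))
```

The first definition is kept when a duplicate appears, so later lookups still return a usable address while the error is reported.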
The Pseudo Code for Pass 1

begin
  read first input line
  if OPCODE = 'START' then
    begin
      save #[OPERAND] as starting address
      initialize LOCCTR to starting address
      write line to intermediate file
      read next input line
    end {if START}
  else
    initialize LOCCTR to 0
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          if there is a symbol in the LABEL field then
            begin
              search SYMTAB for LABEL
              if found then
                set error flag (duplicate symbol)
              else
                insert (LABEL, LOCCTR) into SYMTAB
            end {if symbol}
          search OPTAB for OPCODE
          if found then
            add 3 (instruction length) to LOCCTR
          else if OPCODE = 'WORD' then
            add 3 to LOCCTR
          else if OPCODE = 'RESW' then
            add 3 * #[OPERAND] to LOCCTR
          else if OPCODE = 'RESB' then
            add #[OPERAND] to LOCCTR
          else if OPCODE = 'BYTE' then
            begin
              find length of constant in bytes
              add length to LOCCTR
            end {if BYTE}
          else
            set error flag (invalid operation code)
        end {if not a comment}
      write line to intermediate file
      read next input line
    end {while not END}
  write last line to intermediate file
  save (LOCCTR - starting address) as program length
end {Pass 1}
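
The Pass 1 logic above can be sketched in Python as follows. This is an illustration, not a complete assembler: source lines are assumed to be already split into (label, opcode, operand) tuples with comments removed, OPTAB holds only a small subset of SIC mnemonics, and the program name ARITH and label FIRST are hypothetical.

```python
# Small assumed subset of the SIC operation code table (mnemonic -> opcode).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "SUB": 0x1C, "RSUB": 0x4C}


def byte_length(operand):
    """Length in bytes of a BYTE constant such as C'EOF' or X'F1'."""
    body = operand[2:-1]
    if operand.startswith("C'"):
        return len(body)              # one byte per character
    if operand.startswith("X'"):
        return (len(body) + 1) // 2   # two hex digits per byte
    raise ValueError(f"bad BYTE operand: {operand}")


def pass1(lines):
    """Assign addresses and build SYMTAB.
    Returns (symtab, intermediate, program_length, errors)."""
    symtab, intermediate, errors = {}, [], []
    start = locctr = 0
    lines = list(lines)
    if lines and lines[0][1] == "START":
        start = locctr = int(lines[0][2], 16)
        intermediate.append((locctr,) + tuple(lines[0]))
        lines = lines[1:]
    for label, opcode, operand in lines:
        intermediate.append((locctr, label, opcode, operand))
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol {label}")
            else:
                symtab[label] = locctr
        if opcode in OPTAB or opcode == "WORD":
            locctr += 3
        elif opcode == "RESW":
            locctr += 3 * int(operand)
        elif opcode == "RESB":
            locctr += int(operand)
        elif opcode == "BYTE":
            locctr += byte_length(operand)
        else:
            errors.append(f"invalid operation code {opcode}")
    return symtab, intermediate, locctr - start, errors


program = [
    ("ARITH", "START", "1000"),
    ("FIRST", "LDA", "ALPHA"),
    (None, "ADD", "INCR"),
    (None, "SUB", "ONE"),
    (None, "STA", "BETA"),
    (None, "RSUB", None),
    ("ONE", "WORD", "1"),
    ("ALPHA", "RESW", "1"),
    ("BETA", "RESW", "1"),
    ("INCR", "RESW", "1"),
    (None, "END", "FIRST"),
]
symtab, intermediate, length, errors = pass1(program)
print({k: hex(v) for k, v in symtab.items()}, hex(length), errors)
```

The forward references to ALPHA, INCR, ONE and BETA cause no trouble here because Pass 1 only records where each label is defined; the operands are resolved in Pass 2.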
The Pseudo Code for Pass 2

begin
  read first input line {from intermediate file}
  if OPCODE = 'START' then
    begin
      write listing line
      read next input line
    end {if START}
  write Header record to object program
  initialize first Text record
  while OPCODE != 'END' do
    begin
      if this is not a comment line then
        begin
          search OPTAB for OPCODE
          if found then
            begin
              if there is a symbol in OPERAND field then
                begin
                  search SYMTAB for OPERAND
                  if found then
                    store symbol value as operand address
                  else
                    begin
                      store 0 as operand address
                      set error flag (undefined symbol)
                    end
                end {if symbol}
              else
                store 0 as operand address
              assemble the object code instruction
            end {if opcode found}
          else if OPCODE = 'BYTE' or 'WORD' then
            convert constant to object code
          if object code will not fit into the current Text record then
            begin
              write Text record to object program
              initialize new Text record
            end
          add object code to Text record
        end {if not comment}
      write listing line
      read next input line
    end {while not END}
  write last Text record to object program
  write End record to object program
  write last listing line
end {Pass 2}
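
A matching sketch of Pass 2 for standard SIC is given below. It assumes the intermediate file is a list of (address, label, opcode, operand) tuples and that SYMTAB is a plain dict; both are hard-coded here so that the block runs on its own. The OPTAB subset, the program name ARITH and the label FIRST are assumptions made for the example, and listing output is omitted.

```python
# Small assumed subset of the SIC operation code table (mnemonic -> opcode).
OPTAB = {"LDA": 0x00, "STA": 0x0C, "ADD": 0x18, "SUB": 0x1C, "RSUB": 0x4C}


def constant_code(opcode, operand):
    """Object code for a WORD or BYTE constant."""
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    if operand.startswith("C'"):
        return operand[2:-1].encode("ascii").hex().upper()
    return operand[2:-1].upper()          # X'..' constant


def pass2(name, start, length, intermediate, symtab):
    """Return (object program records, errors)."""
    records = [f"H{name:<6.6}{start:06X}{length:06X}"]
    errors = []
    text_start, text = None, ""
    first_exec = start

    def flush():
        nonlocal text_start, text
        if text:
            records.append(f"T{text_start:06X}{len(text) // 2:02X}{text}")
        text_start, text = None, ""

    for loc, label, opcode, operand in intermediate:
        if opcode == "START":
            continue
        if opcode == "END":
            first_exec = symtab.get(operand, start)
            break
        if opcode in OPTAB:
            address = 0
            if operand:
                symbol, _, index = operand.partition(",")
                if symbol in symtab:
                    address = symtab[symbol]
                else:
                    errors.append(f"undefined symbol {symbol}")
                if index.strip() == "X":
                    address |= 0x8000     # set the x bit for indexed addressing
            code = f"{OPTAB[opcode]:02X}{address:04X}"
        elif opcode in ("BYTE", "WORD"):
            code = constant_code(opcode, operand)
        else:
            flush()                       # RESW/RESB: no object code, start a new record
            continue
        if text and len(text) + len(code) > 60:
            flush()                       # a Text record holds at most 30 bytes
        if text_start is None:
            text_start = loc
        text += code
    flush()
    records.append(f"E{first_exec:06X}")
    return records, errors


symtab = {"FIRST": 0x1000, "ONE": 0x100F, "ALPHA": 0x1012,
          "BETA": 0x1015, "INCR": 0x1018}
intermediate = [
    (0x1000, "ARITH", "START", "1000"),
    (0x1000, "FIRST", "LDA", "ALPHA"),
    (0x1003, None, "ADD", "INCR"),
    (0x1006, None, "SUB", "ONE"),
    (0x1009, None, "STA", "BETA"),
    (0x100C, None, "RSUB", None),
    (0x100F, "ONE", "WORD", "1"),
    (0x1012, "ALPHA", "RESW", "1"),
    (0x1015, "BETA", "RESW", "1"),
    (0x1018, "INCR", "RESW", "1"),
    (0x101B, None, "END", "FIRST"),
]
records, errors = pass2("ARITH", 0x1000, 0x1B, intermediate, symtab)
print("\n".join(records))
```

Closing the current Text record at each RESW or RESB is what leaves the address gaps that the loader simply reserves, as noted for the COPY object program.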
Machine-Dependent Assembler Features

Assembler Features
• Machine-Dependent Assembler Features
– Instruction formats and addressing modes (SIC/XE)
– Program relocation
• Machine-Independent Assembler Features
– Literals
– Symbol-defining statements
– Expressions
– Program blocks
– Control sections and program linking
A SIC/XE Program

SIC/XE Instruction Formats and Addressing Modes
• PC-relative or base-relative addressing (the BASE directive is needed for base-relative): op m
• Indirect addressing: op @m
• Immediate addressing: op #c
• Extended format (4 bytes): +op m
• Indexed addressing: op m,X
• Register-to-register instructions
Relative Addressing Modes
• PC-relative or base-relative addressing mode is preferred over direct addressing mode.
– It saves one byte per instruction by using format 3 rather than format 4. This reduces
program storage space and instruction fetch time.
– Relocation is easier.
The Differences Between the SIC and SIC/XE Programs
• Register-to-register instructions are used whenever possible to improve execution speed.
– Fetching a value stored in a register is much faster than fetching it from
memory.
• Immediate addressing mode is used whenever possible.
– The operand is already included in the fetched instruction, so there is no need to
fetch it from memory.
• Indirect addressing mode is used whenever possible.
– One instruction is enough where two would otherwise be needed.
The Object Code

Generate Relocatable Programs
• Let the assembled program start at address 0 so that it can later be moved easily to
any place in physical memory.
• In fact, as seen with virtual memory, every process
(executing program) now has a separate address space starting from 0.
• Assembling register-to-register instructions presents no problems (e.g., lines 125 and 150).
– Register mnemonics need to be converted to their corresponding register numbers.
– This can be done easily by looking them up in a name table.
PC or Base-Relative Modes
• Format 3: 12-bit displacement field (3 bytes in total)
– Base-relative: 0 to 4095
– PC-relative: -2048 to 2047
• Format 4: 20-bit address field (4 bytes in total)
• The displacement must be calculated so that, when it is added to the
PC (which points to the next instruction once the current instruction has been fetched)
or to the base register (B), the resulting value is the target address.
• If the displacement cannot fit into 12 bits, format 4 must be used (e.g., lines
15 and 125).
– Bit e is set to indicate format 4.
– The programmer must request format 4 by putting a + before the
instruction. Otherwise, the out-of-range displacement is treated as an error.
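
The displacement calculation can be sketched as follows. The sketch assumes the assembler tries PC-relative first and falls back to base-relative, which is a common convention but is not stated in these notes. The function names are hypothetical, and the sample addresses and opcodes follow the widely used SIC/XE COPY example.

```python
def choose_displacement(target, pc, base=None):
    """Return (mode, 12-bit displacement field).
    pc is the address of the NEXT instruction; base is the value declared
    with the BASE directive, or None if no BASE is in effect."""
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF          # stored in 2's complement
    if base is not None:
        disp = target - base
        if 0 <= disp <= 4095:
            return "base", disp
    raise ValueError("displacement does not fit in 12 bits; format 4 (+op) is needed")


def assemble_format3(opcode, target, pc, base=None, indexed=False):
    """Assemble a format 3 instruction using simple addressing (n = 1, i = 1)."""
    mode, disp = choose_displacement(target, pc, base)
    flags = 0b110000                       # n i x b p e, with n = i = 1
    if indexed:
        flags |= 0b001000                  # x = 1
    flags |= 0b000100 if mode == "base" else 0b000010   # b = 1 or p = 1
    word = ((opcode & 0xFC) << 16) | (flags << 12) | disp
    return f"{word:06X}"


stl = assemble_format3(0x14, target=0x0030, pc=0x0003)             # forward reference
jump = assemble_format3(0x3C, target=0x0006, pc=0x001A)            # backward jump
stch = assemble_format3(0x54, target=0x0036, pc=0x1051, base=0x0033, indexed=True)
print(stl, jump, stch)
```

The third case is out of PC-relative range, so the base register value is used instead; with no BASE in effect the same call would raise the error that tells the programmer to use format 4.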

Base-Relative vs. PC-Relative
• The difference between PC-relative and base-relative addressing is that the assembler
knows the value of the PC when it uses PC-relative mode to assemble an
instruction. However, when it uses base-relative mode to assemble an
instruction, the assembler does not know the value of the base register.
– Therefore, the programmer must tell the assembler the value of register B.
– This is done through the BASE directive (line 13).
– The programmer must also load the appropriate value into register B
explicitly in the program.
– Another BASE directive can appear later; it tells the assembler to change
its notion of the current value of B.
– NOBASE can be used to tell the assembler that base-relative
addressing should no longer be used.


Relocatable Programs Are Desired
• The program in Fig. 2.1 specifies that it must be loaded at address 1000 for correct
execution. This restriction is too inflexible for the loader.
• If the program is loaded at a different address, say 2000, its memory references will
access the wrong data. For example:
– 55 101B LDA THREE 00102D
The address field 102D is absolute, so after loading at 2000 it would no longer refer to THREE.
• Thus, we want to make programs relocatable so that they can be loaded and executed
correctly at any place in memory.
Address Modification Is Required
If a hardware relocation register (MMU) is available, software relocation can be avoided
here. However, when multiple object programs are linked together, software relocation is still
needed.

Which Instructions Need to Be Modified?
• Only those instructions that use absolute (direct) addresses to reference symbols.
• The following need not be modified:
– Immediate addressing (no memory references)
– PC-relative or base-relative addressing (being relocatable is one of the advantages of relative
addressing)
– Register-to-register instructions (no memory references)
The Modification Record
• When the assembler generates an address for a symbol, the address inserted
into the instruction is relative to the start of the program.
• The assembler also produces a modification record, which stores the address and length
of the address field that needs to be modified.
• The loader, on seeing the record, adds the beginning address of the
loaded program to the address field identified by the record.
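
The assembler and loader sides of this scheme can be sketched together. The record layout used here (M, then a 6-digit starting address of the field, then a 2-digit field length in half-bytes) follows the usual SIC/XE convention and is an assumption, since these notes do not give the columns. The function names, the sample instruction 4B101036 at relative address 0006 and the load address 5000 are also illustrative.

```python
def modification_record(instr_addr):
    """Record for a format 4 instruction at instr_addr: its 20-bit address
    field starts one byte later and is 5 half-bytes long."""
    return f"M{instr_addr + 1:06X}05"


def relocate(memory, record, load_addr):
    """Add load_addr to the address field named by a modification record.
    memory is indexed by program-relative address."""
    field_start = int(record[1:7], 16)
    half_bytes = int(record[7:9], 16)
    nbytes = (half_bytes + 1) // 2
    value = int.from_bytes(memory[field_start:field_start + nbytes], "big")
    mask = (1 << (4 * half_bytes)) - 1     # low-order half-bytes hold the address
    field = (value & mask) + load_addr
    value = (value & ~mask) | (field & mask)   # keep the flag bits above the field
    memory[field_start:field_start + nbytes] = value.to_bytes(nbytes, "big")


memory = bytearray(10)
memory[6:10] = bytes.fromhex("4B101036")   # format 4 instruction at relative address 0006
record = modification_record(0x0006)
relocate(memory, record, load_addr=0x5000)
print(record, memory[6:10].hex().upper())
```

Only the 20-bit address changes (01036 becomes 06036); the opcode and the flag half-byte that shares a byte with the address field are left untouched.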

The Relocatable Object Code
