Computer Science Notes On System Software
SYSTEM SOFTWARE
BY: SOLOMON KIPNGETICH
0727091924
System Software is a set of programs that manage the resources of a computer system. System
Software is a collection of system programs that perform a variety of functions, such as:
• File editing
• Resource accounting
• I/O management
• Storage and memory management
• Access management
Application Software: It performs specific tasks for the computer user. Application software is
a program written for, or by, a user to perform a particular job.
System Software controls the execution of the application software & provides other support
functions such as data storage. E.g. when you use an electronic spreadsheet on the computer,
MS-DOS, the computer’s Operating System, handles the storage of the worksheet files on disk.
The language translators and the operating system are themselves programs. Their function is
to get the user's program, which is written in a programming language, to run on the computer
system.
The collection of such system programs is the “System Software” of a particular computer system. Most
computer systems have support software, called Utility Programs, which perform routine tasks.
These programs sort data, copy data from one storage medium to another, output data from a storage
medium to a printer and perform other tasks.
Program Execution
Two popular models for program execution are translation and interpretation.
Program translation
The program translation model bridges the execution gap by translating a program written in a
programming language (PL), called the source program (SP), into an equivalent program in the
machine or assembly language of the computer system, called the target program (TP).
Characteristics of the program translation model are:
• A program must be translated before it can be executed.
• The translated program may be saved in a file, and the saved program may be executed repeatedly.
• A program must be retranslated following modifications.
Program interpretation
The interpreter reads the source program and stores it in its memory. During interpretation it
takes a source statement, determines its meaning and performs actions which implement it. This
includes computational and input-output actions.
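To make the interpretation model concrete, here is a minimal sketch of an interpreter loop in C for a made-up two-statement language (the statements, names and sample program are purely illustrative, not taken from any real interpreter):

/* Minimal sketch of an interpreter loop for a hypothetical two-statement
 * language: "ADD n" adds n to an accumulator, "PRINT" prints it. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* The "source program", already read into the interpreter's memory. */
    const char *program[] = { "ADD 5", "ADD 7", "PRINT", "ADD 1", "PRINT" };
    int n_stmts = sizeof program / sizeof program[0];
    long acc = 0;                                   /* interpreter state */

    for (int pc = 0; pc < n_stmts; pc++) {          /* take the next source statement */
        const char *stmt = program[pc];
        long operand;
        if (sscanf(stmt, "ADD %ld", &operand) == 1) /* determine its meaning ... */
            acc += operand;                         /* ... and perform the action */
        else if (strcmp(stmt, "PRINT") == 0)
            printf("%ld\n", acc);
        else
            fprintf(stderr, "unknown statement: %s\n", stmt);
    }
    return 0;
}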
LANGUAGE TRANSLATORS
It is a program that takes an input program in one language and produces an output program in another
language.
[Figure: Source Program → Language Translator → Object Program]
Compilers
A compiler translates a program written in a high-level language into an equivalent machine
language program.
[Figure: High-level language program → Compiler → Machine language program]
Interpreters
[Figure: An interpreter keeps the source program in memory and uses a program counter to execute
it statement by statement.]
Assemblers
An assembler translates an assembly language program into machine language. Its fundamental
functions are to translate mnemonic operation codes into their machine language equivalents and to
assign machine addresses to the symbolic labels used by the programmer.
OPERATING SYSTEM
Definition
An operating system is a program that acts as an interface between the user and the computer
hardware and controls the execution of all kinds of programs.
An operating system performs the following important functions.
Memory Management
Keeps track of primary memory, i.e. what parts of it are in use and by whom, and what parts are not in
use.
In multiprogramming, OS decides which process will get memory when and how much.
Allocates the memory when the process requests it to do so.
De-allocates the memory when the process no longer needs it or has been terminated.
Processor Management
In multiprogramming environment, OS decides which process gets the processor when and
how much time. This function is called process scheduling. Operating System does the
following activities for processor management.
Keeps track of the processor and the status of processes. The program responsible for this task is known
as the traffic controller.
Allocates the processor (CPU) to a process.
De-allocates the processor when it is no longer required.
Device Management
OS manages device communication via their respective drivers. Operating System does the
following activities for device management.
Keeps track of all devices. The program responsible for this task is known as the I/O controller.
Decides which process gets the device, when, and for how much time.
Allocates devices in an efficient way.
De-allocates devices.
File Management
A file system is normally organized into directories for easy navigation and usage. These
directories may contain files and other directories. Operating System does the following
activities for file management.
Keeps track of information, location, uses, status etc. The collective facilities are often known
as the file system.
Decides who gets the resources.
Allocates the resources.
De-allocates the resources.
Operating systems have existed since the very first computer generation, and they keep evolving
over time. Following are a few of the important functions and services that an operating system
provides.
A) Resource allocation:
If more than one user or job is running at the same time, then resources must be
allocated to each of them. The operating system manages different types of resources. Some
resources require special allocation code, e.g. main memory, CPU cycles and file storage.
• There are some resources which require only general request and release code. For
allocating CPU, CPU scheduling algorithms are used for better utilization of CPU. CPU
scheduling routines consider the speed of the CPU, number of available registers and other
required factors.
B) Accounting:
• Logs of each user must be kept. It is also necessary to keep record of which user uses how
much and what kinds of computer resources. This log is used for accounting purposes.
• The accounting data may be used for statistics or for billing. It is also used to improve
system efficiency.
C) Protection and System Calls:
• Protection involves ensuring that all access to system resources is controlled.
Security starts with each user having to authenticate to the system, usually by means of a
password. External I/O devices must be also protected from invalid access attempts.
• In protection, all access to the resources is controlled. In a multiprocess environment, it
is possible for one process to interfere with another, or with the operating system, so
protection is required.
• Modern processors provide instructions that can be used as system calls. System calls
provide the interface between a process and the operating system. A system call
instruction is an instruction that generates an interrupt that causes the operating system to
gain control of the processor.
Types of System Call:
A system call is made using the system call machine language instruction. System calls can be
grouped into five major categories.
1. File management
2. Interprocess communication
3. Process management
4. I/O device management
5. Information maintenance.
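As a hedged illustration of the first and third categories (assuming a Unix-like system with the POSIX API; the file name is invented), the following sketch issues file-management and process-management system calls:

/* Sketch of user code invoking system calls on a Unix-like system:
 * open/write/close for file management, fork/waitpid for process management. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* File management system calls. */
    int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        write(fd, "hello\n", 6);
        close(fd);
    }

    /* Process management system calls. */
    pid_t pid = fork();                    /* create a child process */
    if (pid == 0) {
        printf("child: pid %d\n", (int)getpid());
        _exit(0);
    } else if (pid > 0) {
        int status;
        waitpid(pid, &status, 0);          /* parent waits for the child */
    }
    return 0;
}

Each of these library calls ultimately executes a system call instruction that traps into the operating system.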
Program execution
Operating system handles many kinds of activities from user programs to system programs like
printer spooler, name servers, file server etc. Each of these activities is encapsulated as a
process. A process includes the complete execution context (code to execute, data to
manipulate, registers, OS resources in use). Following are the major activities of an operating
system with respect to program management.
I/O Operation
The I/O subsystem comprises I/O devices and their corresponding driver software. Drivers hide
the peculiarities of specific hardware devices from the user, since the device driver knows the
peculiarities of the specific device. Operating System manages the communication between
user and device drivers. Following are the major activities of an operating system with respect
to I/O Operation.
I/O operation means read or write operation with any file or any specific I/O device.
Program may require any I/O device while running.
Operating system provides the access to the required I/O device when required.
File System manipulation
A file represents a collection of related information. Computers can store files on disk
(secondary storage) for long term storage purposes. A few examples of storage media are
magnetic tape, magnetic disk and optical disk drives like CD, DVD. Each of these media has its
own properties like speed, capacity, data transfer rate and data access methods. A file system is
normally organized into directories for easy navigation and usage. These directories may
contain files and other directories. Following are the major activities of an operating system
with respect to file management.
Communication
In case of distributed systems which are a collection of processors that do not share memory,
peripheral devices, or a clock, operating system manages communications between processes.
Multiple processes communicate with one another through communication lines in the network. The OS handles
routing and connection strategies, and the problems of contention and security. Following are
the major activities of an operating system with respect to communication.
Error Handling
Errors can occur anytime and anywhere. An error may occur in the CPU, in I/O devices or in the
memory hardware. Following are the major activities of an operating system with respect to
error handling.
OS constantly remains aware of possible errors.
OS takes the appropriate action to ensure correct and consistent computing.
Resource Management
In a multi-user or multi-tasking environment, resources such as main memory, CPU
cycles and file storage are to be allocated to each user or job. Following are the major
activities of an operating system with respect to resource management.
OS manages all kinds of resources using schedulers.
CPU scheduling algorithms are used for better utilization of CPU.
Protection
Considering computer systems with multiple users and the concurrent execution of multiple
processes, the various processes must be protected from one another's activities.
Protection refers to a mechanism or a way to control the access of programs, processes, or users
to the resources defined by computer systems. Following are the major activities of an
operating system with respect to protection.
OS ensures that all access to system resources is controlled.
OS ensures that external I/O devices are protected from invalid access attempts.
OS provides authentication feature for each user by means of a password.
Following are few of very important tasks that Operating System handles.
Batch processing
Batch processing is a technique in which the Operating System collects programs and data
together in a batch before processing starts. Operating system does the following activities
related to batch processing.
OS defines a job which has predefined sequence of commands, programs and data as a single
unit.
OS keeps a number of jobs in memory and executes them without any manual intervention.
Jobs are processed in the order of submission i.e. first come first served fashion.
When a job completes its execution, its memory is released and the output for the job gets copied
into an output spool for later printing or processing.
Advantages
Batch processing shifts much of the work of the operator to the computer.
Increased performance, as a new job gets started as soon as the previous job finishes without
any manual intervention.
Disadvantages
Programs are difficult to debug.
A job could enter an infinite loop.
Due to lack of protection scheme, one batch job can affect pending jobs.
Multitasking
Multitasking refers to the technique in which multiple jobs are executed by the CPU simultaneously by
switching between them. Switches occur so frequently that the users may interact with each
program while it is running. Operating system does the following activities related to
multitasking.
The user gives instructions to the operating system or to a program directly, and receives an
immediate response.
Operating System handles multitasking in the way that it can handle multiple operations
/ executes multiple programs at a time.
Multitasking Operating Systems are also known as Time-sharing systems.
These Operating Systems were developed to provide interactive use of a computer system at a
reasonable cost.
A time-shared operating system uses concept of CPU scheduling and multiprogramming to
provide each user with a small portion of a time-shared CPU.
Each user has at least one separate program in memory.
A program that is loaded into memory and is executing is commonly referred to as a process.
When a process executes, it typically executes for only a very short time before it either
finishes or needs to perform I/O.
Since interactive I/O typically runs at people speeds, it may take a long time to complete.
During this time a CPU can be utilized by another process.
Operating system allows the users to share the computer simultaneously. Since each action or
command in a time-shared system tends to be short, only a little CPU time is needed for each
user.
As the system switches CPU rapidly from one user/program to the next, each user is given the
impression that he/she has his/her own CPU, whereas actually one CPU is being shared among
many users.
When two or more programs reside in memory at the same time, sharing the
processor is referred to as multiprogramming. Multiprogramming assumes a single shared
processor. Multiprogramming increases CPU utilization by organizing jobs so that the CPU
always has one to execute. Following figure shows the memory layout for a multiprogramming
system.
Operating system does the following activities related to multiprogramming.
The operating system keeps several jobs in memory at a time.
This set of jobs is a subset of the jobs kept in the job pool.
The operating system picks and begins to execute one of the jobs in memory.
Multiprogramming operating system monitors the state of all active programs and system
resources using memory management programs to ensure that the CPU is never idle unless
there are no jobs to process.
Advantages
High and efficient CPU utilization.
User feels that many programs are allotted CPU almost simultaneously.
Disadvantages
CPU scheduling is required.
To accommodate many jobs in memory, memory management is required.
Interactivity
Interactivity refers to the ability of users to interact with the computer system. Operating system
does the following activities related to interactivity.
OS provides user an interface to interact with system.
OS manages input devices to take inputs from the user. For example, keyboard.
OS manages output devices to show outputs to the user. For example, Monitor.
The response time of the OS needs to be short, since the user submits a request and waits for the result.
Distributed Environment
Distributed environment refers to multiple independent CPUs or processors in a computer
system. Operating system does the following activities related to distributed environment.
OS distributes computation logic among several physical processors.
The processors do not share memory or a clock.
Instead, each processor has its own local memory.
OS manages the communications between the processors. They communicate with
each other through various communication lines.
Spooling
Spooling is an acronym for simultaneous peripheral operations on line. Spooling refers to
putting data of various I/O jobs in a buffer. This buffer is a special area in memory or hard disk
which is accessible to I/O devices. Operating system does the following activities related to
spooling.
OS handles I/O device data spooling as devices have different data access rates.
OS maintains the spooling buffer which provides a waiting station where data can rest while the
slower device catches up.
Spooling makes parallel computation possible, because the computer can perform I/O in a parallel
fashion: it becomes possible to have the computer read data from a tape, write data to disk and
write data out to a printer while it is doing its computing task.
Advantages
The spooling operation uses a disk as a very large buffer.
Spooling is capable of overlapping I/O operation for one job with processor operations for
another job.
Operating System Processes
This section describes process, process states and process control block (PCB).
Process
A process is a program in execution. The execution of a process must progress in a sequential
fashion. Definition of process is following.
A process is defined as an entity which represents the basic unit of work to be implemented in
the system.
The components of a process are the following.
1. Object Program: Code to be executed.
2. Data: Data to be used for executing the program.
3. Resources: While executing the program, it may require some resources.
4. Status: Verifies the status of the process execution. A process can run to completion
only when all requested resources have been allocated to it. Two or more processes could be
executing the same program, each using their own data and resources.
Program
A program by itself is not a process. It is a static entity made up of program statements, while a
process is a dynamic entity. A program contains the instructions to be executed by the processor. A
program takes a space at single place in main memory and continues to stay there. A program
does not perform any action by itself.
Process States
As a process executes, it changes state. The state of a process is defined as the current activity of
the process. Process can have one of the following five states at a time.
1. New/Start: The process is being created.
2. Ready: The process is waiting to be assigned to a processor. Ready processes are waiting to
have the processor allocated to them by the operating system so that they can run.
3. Running: Process instructions are being executed (i.e. the process that is currently being
executed).
4. Waiting: The process is waiting for some event to occur (such as the completion of an I/O
operation).
5. Terminated: The process has finished execution.
Process Control Block (PCB)
A process control block (PCB) is a data structure maintained by the operating system for every
process. It includes CPU scheduling, I/O resource management and file management information,
etc. The PCB serves as the repository for any information which can vary from
process to process. Loader/linker sets flags and registers when a process is created. If that
process gets suspended, the contents of the registers are saved on a stack and the pointer to the
particular stack frame is stored in the PCB. By this technique, the hardware state can be restored
so that the process can be scheduled to run again.
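As a rough sketch of what such a repository can look like (the field names and sizes here are invented for illustration and are not taken from any particular kernel):

/* Illustrative sketch of a process control block; real kernels store far more. */
typedef enum { NEW, READY, RUNNING, WAITING, TERMINATED } proc_state_t;

struct pcb {
    int           pid;              /* process identifier */
    proc_state_t  state;            /* current process state */
    unsigned long program_counter;  /* saved program counter */
    unsigned long registers[16];    /* saved general-purpose registers */
    int           priority;        /* CPU scheduling information */
    unsigned long base, limit;      /* memory-management registers */
    int           open_files[16];   /* file management / I/O status information */
    unsigned long cpu_time_used;    /* accounting information */
    struct pcb   *next;             /* link for ready and device queues */
};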
Operating System Process Scheduling
This section describes process scheduling, scheduling queues and various types of process
schedulers.
Definition
The process scheduling is the activity of the process manager that handles the removal of the
running process from the CPU and the selection of another process on the basis of a particular
strategy. Process scheduling is an essential part of a Multiprogramming operating system. Such
operating systems allow more than one process to be loaded into the executable memory at a
time and loaded process shares the CPU using time multiplexing.
Scheduling Queues
Scheduling queues refer to queues of processes or devices. When a process enters the
system, it is put into a job queue. This queue consists of all processes in the
system. The operating system also maintains other queues such as device queues. A device queue is
a queue of processes waiting for a particular I/O device. Each device has its
own device queue.
Two important queues are the ready queue and the device queues.
Ready queue
A newly arrived process is put in the ready queue. Processes wait in the ready queue for the CPU to be
allocated to them. Once the CPU is assigned to a process, that process will execute. While the process
is executing, any one of the following events can occur.
The process could issue an I/O request and then it would be placed in an I/O queue.
The process could create a new sub-process and wait for its termination.
The process could be removed forcibly from the CPU as a result of an interrupt, and be put back in the
ready queue.
The two-state process model refers to the running and non-running states, which are described below.
Running: the process is currently being executed on the CPU.
Not running: the process is not currently being executed; it is kept in a queue, waiting for its turn to execute.
Schedulers
Schedulers are special system software which handles process scheduling in various ways.
Their main task is to select the jobs to be submitted into the system and to decide which process
to run. Schedulers are of three types, which compare as follows.
Long Term Scheduler: It is a job scheduler. Its speed is lower than that of the short term
scheduler. It controls the degree of multiprogramming. It is almost absent or minimal in time
sharing systems. It selects processes from the job pool and loads them into memory for execution.
Short Term Scheduler: It is a CPU scheduler. Its speed is the fastest among the three. It provides
lesser control over the degree of multiprogramming. It is also minimal in time sharing systems. It
selects those processes which are ready to execute.
Medium Term Scheduler: It is a process swapping scheduler. Its speed is in between that of the
short term and long term schedulers. It reduces the degree of multiprogramming. It is a part of time
sharing systems. It can re-introduce a process into memory so that its execution can be continued.
A context switch is the mechanism to store and restore the state or context of a CPU in Process
Control block so that a process execution can be resumed from the same point at a later time.
Using this technique a context switcher enables multiple processes to share a single CPU.
Context switching is an essential feature of a multitasking operating system.
When the scheduler switches the CPU from executing one process to execute another, the
context switcher saves the content of all processor registers for the process being removed from
the CPU, in its process descriptor. The context of a process is represented in the process control
block of a process. Context switch time is pure overhead. Context switching can significantly
affect performance, as modern computers have a lot of general and status registers to be saved.
Context switching times are highly dependent on hardware support. For example, saving the state of a
processor with n general registers and m status registers requires (n + m) × b × K time units, assuming
b store operations are needed per register and each store instruction requires K time units.
Context switching
Some hardware systems employ two or more sets of processor registers to reduce the amount of
context switching time. When the process is switched, the following information is stored.
Program Counter
Scheduling Information
Base and limit register value
Currently used register
Changed State
I/O State
Accounting
Problems: The Dining Philosophers Problem
In the dining philosophers problem, five philosophers sit around a circular table with a single fork
between each pair of neighbours; each philosopher alternately thinks and eats, and needs both the
left and the right fork in order to eat. The problem was designed to illustrate the challenges of
avoiding deadlock, a system state in which no progress is possible. To see that a proper solution to
this problem is not obvious, consider a proposal in which each philosopher is instructed to behave
as follows:
think until the left fork is available; when it is, pick it up;
think until the right fork is available; when it is, pick it up;
when both forks are held, eat for a fixed amount of time;
then, put the right fork down;
then, put the left fork down;
Repeat from the beginning.
This attempted solution fails because it allows the system to reach a deadlock state, in which no
progress is possible. This is a state in which each philosopher has picked up the fork to the left,
and is waiting for the fork to the right to become available. With the given instructions, this state
can be reached, and when it is reached, the philosophers will eternally wait for each other to
release a fork. The following monitor-based solution avoids this deadlock by allowing a philosopher
to pick up forks only when both are available:
monitor DP
{
    enum { THINKING, HUNGRY, EATING } state[5];
    condition self[5];

    void pickup(int i) {
        state[i] = HUNGRY;
        test(i);
        if (state[i] != EATING) self[i].wait();   /* wait until both forks are free */
    }

    void putdown(int i) {
        state[i] = THINKING;
        /* test left and right neighbors */
        test((i + 4) % 5);
        test((i + 1) % 5);
    }

    void test(int i) {
        if ((state[(i + 4) % 5] != EATING) &&
            (state[i] == HUNGRY) &&
            (state[(i + 1) % 5] != EATING)) {
            state[i] = EATING;
            self[i].signal();
        }
    }

    initialization_code() {
        for (int i = 0; i < 5; i++)
            state[i] = THINKING;
    }
}
Each philosopher i invokes the operations pickup() and putdown() in the following
sequence:
dp.pickup (i)
EAT
dp.putdown (i)
Operating System Multi-Threading
This section describes thread, types of threads and various thread models.
What is Thread?
A thread is a flow of execution through the process code, with its own program counter, system
registers and stack. A thread is also called a lightweight process. Threads provide a way to
improve application performance through parallelism. Threads represent a software approach to
improving the performance of an operating system by reducing overhead; in other respects a thread is
equivalent to a classical process. Each thread belongs to exactly one process and no thread can exist
outside a process. Each thread represents a separate flow of control. Threads have been successfully
used in implementing network servers and web servers. They also provide a suitable foundation for
parallel execution of applications on shared memory multiprocessors. The differences between a
process and a thread are summarized below.
1. A process is heavy weight or resource intensive, whereas a thread is light weight, taking fewer
resources than a process.
2. Process switching needs interaction with the operating system, whereas thread switching does not
need to interact with the operating system.
3. In multiple processing environments, each process executes the same code but has its own memory
and file resources, whereas all threads of a process can share the same set of open files and child
processes.
4. If one process is blocked, then no other process can execute until the first process is unblocked,
whereas while one thread is blocked and waiting, a second thread in the same task can run.
5. Multiple processes without using threads use more resources, whereas multithreaded processes use
fewer resources.
6. In multiple processes, each process operates independently of the others, whereas one thread can
read, write or change another thread's data.
User Level Threads
In user level threads, thread management is done in user space by a thread library; the kernel is not
aware of the existence of the threads.
Advantages
Thread switching does not require Kernel mode privileges.
User level thread can run on any operating system.
Scheduling can be application specific in the user level thread.
User level threads are fast to create and manage.
Disadvantages
In a typical operating system, most system calls are blocking.
Multithreaded application cannot take advantage of multiprocessing.
Kernel Level Threads
In kernel level threads, thread management is done by the kernel. The Kernel maintains context
information for the process as a whole and for individual threads within the process. Scheduling by
the Kernel is done on a thread basis. The Kernel performs thread creation, scheduling and management
in Kernel space. Kernel threads are generally slower to create and manage than user threads.
Advantages
Kernel can simultaneously schedule multiple threads from the same process on multiple
processors.
If one thread in a process is blocked, the Kernel can schedule another thread of the same process.
Kernel routines themselves can be multithreaded.
Disadvantages
Kernel threads are generally slower to create and manage than the user threads.
Transfer of control from one thread to another within same process requires a mode switch to the
Kernel.
Some operating systems provide a combined user level thread and Kernel level thread facility.
Solaris is a good example of this combined approach. In a combined system, multiple threads
within the same application can run in parallel on multiple processors, and a blocking system call
need not block the entire process. There are three types of multithreading models: many-to-many,
many-to-one and one-to-one.
What is a Race Condition?
A race condition is an undesirable situation that occurs when a device or system attempts to
perform two or more operations at the same time, but because of the nature of the device or
system, the operations must be done in the proper sequence to be done correctly.
A race condition occurs when two threads access a shared variable at the same time. For example, the
first thread reads the variable, and the second thread reads the same value from the variable. Then the
first thread and the second thread perform their operations on the value, and they race to see which
thread writes its value to the shared variable last. The value of the thread that writes last is
preserved, because that thread overwrites the value that the other thread wrote.
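A minimal sketch of such a race using POSIX threads (the counter, iteration count and build command are illustrative): two threads increment a shared variable without synchronization, so some updates are lost and the final value is usually less than expected.

/* Two threads race on a shared counter. Because "counter++" is a
 * read-modify-write sequence, interleaved updates can be lost.
 * Build with something like: cc race.c -lpthread */
#include <pthread.h>
#include <stdio.h>

#define ITERS 1000000
static long counter = 0;                  /* shared variable, no lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++)
        counter++;                        /* unsynchronized update: the race */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("expected %d, got %ld\n", 2 * ITERS, counter);
    return 0;
}

Guarding the increment with a pthread_mutex_t (or using an atomic type) serializes the updates and removes the race.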
Memory Management
This section describes memory management techniques, logical vs. physical address space and various
paging techniques.
Memory management is the functionality of an operating system which handles or manages
primary memory. Memory management keeps track of each and every memory location, whether it
is allocated to some process or free. It checks how much memory is to be allocated to
processes. It decides which process will get memory at what time. It tracks whenever some
memory gets freed or unallocated and correspondingly it updates the status. Memory
management provides protection by using two registers, a base register and a limit register. The
base register holds the smallest legal physical memory address and the limit register specifies the
size of the range. For example, if the base register holds 300000 and the limit register holds
120900, then the program can legally access all addresses from 300000 through 420899.
Binding of instructions and data to memory addresses can be done in the following ways.
Compile time -- When it is known at compile time where the process will reside, compile time
binding is used to generate the absolute code.
Load time -- When it is not known at compile time where the process will reside in memory,
then the compiler generates re-locatable code.
Execution time -- If the process can be moved during its execution from one memory segment
to another, then binding must be delayed until run time.
MEMORY ALLOCATION
Dynamic loading
In dynamic loading, a routine of a program is not loaded until it is called by the program. All
routines are kept on disk in a re-locatable load format. The main program is loaded into memory
and is executed. Other routines, methods or modules are loaded on request. Dynamic loading
gives better memory-space utilization, and unused routines are never loaded.
Dynamic Linking
Linking is the process of collecting and combining various modules of code and data into an
executable file that can be loaded into memory and executed. The operating system can link system
level libraries into a program. When it combines the libraries at load time, the linking is called
static linking, and when this linking is done at the time of execution, it is called dynamic
linking. In static linking, libraries are linked at compile time, so the program code size becomes
bigger, whereas in dynamic linking libraries are linked at execution time, so the program code size
remains smaller.
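A minimal sketch of dynamic linking at execution time on a Unix-like system using the POSIX dlopen interface (the library name is Linux-specific and error handling is pared down):

/* Load the math library at run time and call cos() through a function
 * pointer. On Linux, link with: cc dyn.c -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    void *handle = dlopen("libm.so.6", RTLD_LAZY);    /* link at execution time */
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cosine)
        printf("cos(0.0) = %f\n", cosine(0.0));
    dlclose(handle);                                   /* unload when done */
    return 0;
}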
Swapping
Swapping is a mechanism in which a process can be swapped temporarily out of main memory
to a backing store, and then brought back into memory for continued execution. The backing store is
usually a hard disk drive or other secondary storage which is fast in access and large enough
to accommodate copies of all memory images for all users. It must be capable of providing
direct access to these memory images.
External fragmentation can be reduced by compaction, i.e. shuffling memory contents to place all
free memory together in one large block. To make compaction feasible, relocation should be
dynamic. External fragmentation is avoided by using paging technique. Paging is a technique in
which physical memory is broken into blocks of the same size called pages (size is power of 2,
between 512 bytes and 8192 bytes). When a process is to be executed, its corresponding pages
are loaded into any available memory frames.
The logical address space of a process can be non-contiguous, and a process is allocated physical
memory whenever a free memory frame is available. The operating system keeps track of all free
frames; it needs n free frames to run a program of size n pages. A logical address generated by the
CPU is divided into a page number (p), which is used as an index into the page table containing the
base address of each frame, and a page offset (d), which is combined with the frame base address to
define the physical memory address.
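Because the page size is a power of two, the split of a logical address into page number and offset is simple integer arithmetic. A small sketch (the 8-entry page table, 4 KB page size and sample address are invented for illustration):

/* Translate a logical address to a physical address with a one-level
 * page table. All values here are illustrative. */
#include <stdio.h>

#define PAGE_SIZE 4096u                       /* 2^12 bytes per page */

int main(void)
{
    /* page_table[p] = frame number that currently holds page p */
    unsigned page_table[8] = { 5, 9, 2, 7, 0, 3, 6, 1 };

    unsigned logical = 0x3A7C;                /* sample logical address */
    unsigned p = logical / PAGE_SIZE;         /* page number: index into the table */
    unsigned d = logical % PAGE_SIZE;         /* page offset */
    unsigned physical = page_table[p] * PAGE_SIZE + d;

    printf("logical 0x%X -> page %u, offset 0x%X -> physical 0x%X\n",
           logical, p, d, physical);
    return 0;
}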
Segmentation
Segmentation is a technique to break memory into logical pieces where each piece represents a
group of related information. For example, data segments or code segment for each process, data
segment for operating system and so on. Segmentation can be implemented using or without
using paging.
Segment number (s) -- segment number is used as an index into a segment table which contains
base address of each segment in physical memory and a limit of segment.
Segment offset (o) -- segment offset is first checked against limit and then is combined with
base address to define the physical memory address.
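A corresponding sketch of the segment-table lookup, including the limit check (the segment table contents and the sample address are invented):

/* Translate a (segment, offset) pair using a segment table with a limit
 * check; an out-of-range offset would trap to the operating system. */
#include <stdio.h>

struct segment { unsigned base, limit; };

int main(void)
{
    struct segment seg_table[3] = {
        { 1000, 400 },    /* segment 0: code  */
        { 5000, 200 },    /* segment 1: data  */
        { 7000, 100 },    /* segment 2: stack */
    };

    unsigned s = 1, o = 150;                     /* sample segment number and offset */
    if (o >= seg_table[s].limit) {               /* offset checked against the limit first */
        fprintf(stderr, "addressing error: trap to OS\n");
        return 1;
    }
    unsigned physical = seg_table[s].base + o;   /* then combined with the base address */
    printf("segment %u offset %u -> physical %u\n", s, o, physical);
    return 0;
}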
I/O Buffering
Buffering of I/O is performed for (at least) three major reasons:
1. Speed differences between two devices. A slow device may write data into a buffer, and
when the buffer is full, the entire buffer is sent to the fast device all at once. So that the
slow device still has somewhere to write while this is going on, a second buffer is used,
and the two buffers alternate as each becomes full. This is known as double buffering; a
small sketch of the buffer-swapping mechanics follows this list. (Double buffering is often
used in animated graphics, so that one screen image can be generated in a buffer while the
other, completed buffer is displayed on the screen. This prevents the user from ever seeing
any half-finished screen images.)
2. Data transfer size differences. Buffers are used in particular in networking systems to
break messages up into smaller packets for transfer, and then for re-assembly at the
receiving side.
3. To support copy semantics. For example, when an application makes a request for a disk
write, the data is copied from the user's memory area into a kernel buffer. Now the
application can change its copy of the data, but the data which eventually gets written
out to disk is the version of the data at the time the write request was made.
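The sketch below shows only the buffer-swapping mechanics behind double buffering; in a real system the slow fill and the fast flush would proceed concurrently on different devices, whereas here both are simulated in a single thread and the block contents are made up.

/* Double buffering sketch: while one buffer is handed over for flushing,
 * the "slow" producer keeps filling the other, and the two swap roles. */
#include <stdio.h>

#define BUF_SIZE 16

static int fill_slow(char *buf, int round)        /* stands in for a slow device */
{
    return snprintf(buf, BUF_SIZE, "block-%03d", round);
}

static void flush_fast(const char *buf, int n)    /* stands in for a fast device */
{
    fwrite(buf, 1, (size_t)n, stdout);
    fputc('\n', stdout);
}

int main(void)
{
    char a[BUF_SIZE], b[BUF_SIZE];
    char *filling = a, *flushing = b;
    int pending = 0;                              /* bytes waiting in 'flushing' */

    for (int round = 0; round < 4; round++) {
        int n = fill_slow(filling, round);        /* slow device writes into one buffer */
        if (pending)
            flush_fast(flushing, pending);        /* the other buffer is drained meanwhile */
        char *tmp = filling;                      /* swap roles: the just-filled buffer is */
        filling = flushing;                       /* flushed next, the emptied buffer is   */
        flushing = tmp;                           /* filled again                          */
        pending = n;
    }
    if (pending)
        flush_fast(flushing, pending);
    return 0;
}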
Virtual Memory
This section describes concepts of virtual memory, demand paging and various page
replacement algorithms.
Virtual memory is a technique that allows the execution of processes which are not
completely available in memory. The main visible advantage of this scheme is that programs can
be larger than physical memory. Virtual memory is the separation of user logical memory from
physical memory. This separation allows an extremely large virtual memory to be provided for
programmers when only a smaller physical memory is available. Following are the situations,
when entire program is not required to be loaded fully in main memory.
User-written error handling routines are used only when an error occurs in the data or
computation.
Certain options and features of a program may be used rarely.
Many tables are assigned a fixed amount of address space even though only a small amount of
the table is actually used.
The ability to execute a program that is only partially in memory would confer many benefits:
Fewer I/O operations would be needed to load or swap each user program into memory.
A program would no longer be constrained by the amount of physical memory that is available.
Each user program could take less physical memory, so more programs could be run at the same time,
with a corresponding increase in CPU utilization and throughput.
Virtual memory is commonly implemented by demand paging. It can also be implemented in a
segmentation system. Demand segmentation can also be used to provide virtual memory.
Reference String
The string of memory references is called a reference string. Reference strings are generated
artificially or by tracing a given system and recording the address of each memory reference.
The latter choice produces a large amount of data, about which we note two things: for a given page
size we need consider only the page numbers, not the entire addresses; and if we have a reference to a
page p, then any references to page p that immediately follow will never cause a page fault.
A translation look-aside buffer (TLB): A translation lookaside buffer (TLB) is a memory cache
that stores recent translations of virtual memory to physical addresses for faster retrieval. When a
virtual memory address is referenced by a program, the search starts in the CPU. First, instruction
caches are checked. If the required memory is not in these very fast caches, the system has to look
up the memory’s physical address. At this point, TLB is checked for a quick reference to the
location in physical memory.
When an address is searched in the TLB and not found, the page tables in physical memory must be
searched with a page-table walk. As virtual memory addresses are translated, the values
referenced are added to the TLB. When a value can be retrieved from the TLB, speed is enhanced
because the memory address is stored in the TLB on the processor. Most processors include TLBs to
increase the speed of virtual memory operations through the inherent latency-reducing proximity
as well as the high clock frequencies of current CPUs.
TLBs also add the support required for multi-user computers to keep memory separate, by having
a user and a supervisor mode as well as using permissions on read and write bits to enable sharing.
TLBs can suffer performance issues from multitasking and code errors. This performance
degradation is called a cache thrash. Cache thrash is caused by an ongoing computer activity that
fails to progress due to excessive use of resources or conflicts in the caching system.
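A toy sketch of the lookup order described above (check the TLB, fall back to walking the page table on a miss, then cache the translation); the tiny fully-associative TLB, the page-table contents and the round-robin replacement are all invented for illustration:

/* Toy TLB in front of a page table. */
#include <stdio.h>

#define TLB_ENTRIES 4
#define PAGES       16

struct tlb_entry { int valid; unsigned page, frame; };

static struct tlb_entry tlb[TLB_ENTRIES];
static unsigned page_table[PAGES];               /* page number -> frame number */
static int next_victim = 0;                      /* round-robin replacement index */

static unsigned translate(unsigned page)
{
    for (int i = 0; i < TLB_ENTRIES; i++)        /* 1. TLB lookup */
        if (tlb[i].valid && tlb[i].page == page) {
            printf("page %2u: TLB hit\n", page);
            return tlb[i].frame;
        }
    unsigned frame = page_table[page];           /* 2. miss: walk the page table */
    printf("page %2u: TLB miss, page-table walk\n", page);
    tlb[next_victim].valid = 1;                  /* 3. cache the translation */
    tlb[next_victim].page = page;
    tlb[next_victim].frame = frame;
    next_victim = (next_victim + 1) % TLB_ENTRIES;
    return frame;
}

int main(void)
{
    for (unsigned p = 0; p < PAGES; p++)
        page_table[p] = PAGES - 1 - p;           /* arbitrary page-to-frame mapping */

    unsigned refs[] = { 3, 3, 7, 3, 7, 1 };      /* sample reference string */
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        translate(refs[i]);
    return 0;
}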
PAGE REPLACEMENT ALGORITHMS
FIFO Page replacement algorithm
Oldest page in main memory is the one which will be selected for replacement.
Easy to implement, keep a list, replace pages from the tail and add new pages at the head.
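A short simulation of FIFO replacement that counts page faults (the number of frames and the reference string are arbitrary examples):

/* FIFO page replacement: on a fault, the oldest resident page (tracked by
 * a circular index) is replaced by the newly referenced page. */
#include <stdio.h>

#define FRAMES 3

int main(void)
{
    int refs[] = { 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2 };
    int n = sizeof refs / sizeof refs[0];
    int frames[FRAMES];
    int oldest = 0, used = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (!hit) {
            faults++;
            if (used < FRAMES)
                frames[used++] = refs[i];        /* a free frame is still available */
            else {
                frames[oldest] = refs[i];        /* replace the oldest page */
                oldest = (oldest + 1) % FRAMES;
            }
        }
    }
    printf("%d page faults for %d references with %d frames\n", faults, n, FRAMES);
    return 0;
}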
Operating System Security
This section describes various security related aspects like authentication, one time password,
threats and security classifications.
Security refers to providing a protection system to computer system resources such as CPU,
memory, disk, software programs and most importantly data/information stored in the computer
system. If a computer program is run by an unauthorized user, then he/she may cause severe damage
to the computer or the data stored in it. So a computer system must be protected against unauthorized
access, malicious access to system memory, viruses, worms etc. We're going to discuss the
following topics in this section.
Authentication
One Time passwords
Program Threats
System Threats
Computer Security Classifications
Authentication refers to identifying each user of the system and associating the executing
programs with those users. It is the responsibility of the Operating System to create a protection
system which ensures that a user who is running a particular program is authentic. Operating
Systems generally identify/authenticate users in the following three ways:
Username / Password - The user needs to enter a registered username and password with the Operating
System to log in to the system.
User card/key - The user needs to punch a card into a card slot, or enter a key generated by a key
generator, in an option provided by the operating system to log in to the system.
User attribute - fingerprint / eye retina pattern / signature - The user needs to pass his/her attribute
via a designated input device used by the operating system to log in to the system.
One time passwords provide additional security along with normal authentication. In a One-
Time Password system, a unique password is required every time a user tries to log in to the
system. Once a one-time password is used, it cannot be used again. One time passwords are
implemented in various ways.
Random numbers - Users are provided cards having numbers printed along with corresponding
alphabets. System asks for numbers corresponding to few alphabets randomly chosen.
Secret key - Users are provided a hardware device which can create a secret id mapped with the user
id. The system asks for such a secret id, which is to be generated every time prior to login.
Network password - Some commercial applications send one time password to user on
registered mobile/ email which is required to be entered prior to login.
The operating system's processes and kernel do the designated tasks as instructed. If a user
program makes these processes do malicious tasks, then it is known as a program threat. One of the
common examples of program threat is a program installed in a computer which can store and
send user credentials via network to some hacker. Following is the list of some well-known
program threats.
Trojan horse - Such a program traps user login credentials and stores them to send to a malicious
user, who can later log in to the computer and access system resources.
Trap Door - If a program which is designed to work as required has a security hole in its code
and performs illegal actions without the knowledge of the user, then it is said to have a trap door.
Logic Bomb - A logic bomb is a situation when a program misbehaves only when certain
conditions are met; otherwise it works as a genuine program. It is harder to detect.
Virus - A virus, as the name suggests, can replicate itself on a computer system. Viruses are highly
dangerous and can modify or delete user files and crash systems. A virus is generally a small piece of
code embedded in a program. As the user accesses the program, the virus starts getting embedded in
other files and programs and can make the system unusable for the user.
System threats refer to the misuse of system services and network connections to put the user in
trouble. System threats can be used to launch program threats on a complete network, called a
program attack. System threats create an environment in which
operating system resources and user files are misused. Following is the list of some well-known
system threats.
Worm - A worm is a process which can choke down system performance by using system
resources to extreme levels. A worm process generates multiple copies of itself, where each copy uses
system resources and prevents all other processes from getting the required resources. Worm processes
can even shut down an entire network.
Port Scanning - Port scanning is a mechanism or means by which a hacker can detect system
vulnerabilities in order to make an attack on the system.
Denial of Service - Denial of service attacks normally prevent users from making legitimate use of
the system.
CHAPTER 2: FUNDAMENTALS OF LANGUAGE
PROCESSING
Definition
Language Processing = Analysis of SP + Synthesis of TP.
This definition motivates a generic model of language processing activities. We refer to the collection
of language processor components engaged in analyzing a source program as the analysis phase of
the language processor. Components engaged in synthesizing a target program constitute the
synthesis phase.
a) Preprocessor: A preprocessor produces input to compilers. It may perform the following
functions.
1. Macro processing: A preprocessor may allow a user to define macros that are short hands for
longer constructs.
2. File inclusion: A preprocessor may include header files into the program text.
3. Rational preprocessor: these preprocessors augment older languages with more modern flow-of-
control and data structuring facilities.
4. Language Extensions: These preprocessors attempt to add capabilities to the language by
providing what amounts to built-in macros.
b) COMPILER: A compiler is a translator program that translates a program written in a high-level
language (HLL), the source program, into an equivalent program in machine-level language (MLL), the
target program. An important part of a compiler is showing errors to the programmer.
What is an assembler?
A tool called an assembler translates assembly language into binary instructions. Assemblers
provide a friendlier representation than a computer’s 0s and 1s that simplifies writing and reading
programs. Symbolic names for operations and locations are one facet of this representation. Another
facet is programming facilities that increase a program’s clarity.
An assembler reads a single assembly language source file and produces an object file containing
machine instructions and bookkeeping information that helps combine several object files into a
program. Figure (1) illustrates how a program is built. Most programs consist of several files, also
called modules, that are written, compiled, and assembled independently. A program may also use
prewritten routines supplied in a program library. A module typically contains references to
subroutines and data defined in other modules and in libraries. The code in a module cannot be
executed while it contains unresolved references to labels in other object files or libraries. Another
tool, called a linker, combines a collection of object and library files into an executable file, which a
computer can run.
An assembler maintains several tables: OPTAB (the operation code table), SYMTAB (the symbol table)
and LITTAB (the literal table). For efficiency reasons SYMTAB must remain in main memory throughout
passes I and II of the assembler. LITTAB is not accessed as frequently as SYMTAB; however,
it may be accessed frequently enough to justify its presence in memory. If memory is at a premium,
only a part of LITTAB can be kept in memory. OPTAB should be in memory during pass I.
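As a rough sketch of how SYMTAB can be organized (a small fixed-size hash table with linear probing; the table size, hash function and sample labels are illustrative, not taken from any particular assembler):

/* Symbol table (SYMTAB) sketch: pass I inserts each label with the address
 * assigned to it, pass II looks labels up to resolve operand addresses. */
#include <stdio.h>
#include <string.h>

#define SYMTAB_SIZE 64

struct symbol { char name[16]; unsigned address; int used; };
static struct symbol symtab[SYMTAB_SIZE];

static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return h % SYMTAB_SIZE;
}

static void symtab_insert(const char *name, unsigned address)   /* pass I */
{
    unsigned i = hash(name);
    while (symtab[i].used) i = (i + 1) % SYMTAB_SIZE;            /* linear probing */
    strncpy(symtab[i].name, name, sizeof symtab[i].name - 1);
    symtab[i].address = address;
    symtab[i].used = 1;
}

static long symtab_lookup(const char *name)                      /* pass II */
{
    unsigned i = hash(name);
    while (symtab[i].used) {
        if (strcmp(symtab[i].name, name) == 0) return symtab[i].address;
        i = (i + 1) % SYMTAB_SIZE;
    }
    return -1;                                                   /* undefined symbol */
}

int main(void)
{
    symtab_insert("LOOP", 0x1003);
    symtab_insert("BUFFER", 0x1039);
    printf("LOOP   -> %lx\n", (unsigned long)symtab_lookup("LOOP"));
    printf("EXIT   -> %ld (undefined)\n", symtab_lookup("EXIT"));
    return 0;
}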
d) Interpreter: An interpreter executes a source program directly rather than producing a target
program. Its phases are:
1. Lexical analysis
2. Syntax analysis
3. Semantic analysis
4. Direct Execution
e) Loader and Link-editor: Once the assembler produces an object program, that program must be
placed into memory and executed. The assembler could place the object program directly in
memory and transfer control to it, thereby causing the machine language program to be executed.
However, this would waste core (memory) by leaving the assembler in memory while the user's program
was being executed. Also, the programmer would have to retranslate his program with each execution,
thus wasting translation time. To overcome these problems of wasted translation time and memory,
system programmers developed another component called a loader. “A loader is a program that
places programs into memory and prepares them for execution.” It would be more efficient if
subroutines could be translated into an object form that the loader could “relocate” directly behind
the user's program.
PHASES OF A COMPILER:
A compiler can broadly be divided into two phases based on the way they compile.
Analysis Phase
Known as the front-end of the compiler, the analysis phase of the compiler reads the source
program, divides it into core parts and then checks for lexical, grammar and syntax errors. The
analysis phase generates an intermediate representation of the source program and the symbol table,
which should be fed to the synthesis phase as input.
Synthesis Phase
Known as the back-end of the compiler, the synthesis phase generates the target program with the
help of intermediate source code representation and symbol table.
Pass : A pass refers to the traversal of a compiler through the entire program.
Phase : A phase of a compiler is a distinguishable stage, which takes input from the previous stage,
processes and yields output that can be used as input for the next stage. A pass can have more than
one phase.
Syntax analysis This phase takes the list of tokens produced by the lexical analysis and arranges these in a tree-structure (called
the syntax tree) that reflects the structure of the program. This phase is often called parsing.
Type checking This phase analyses the syntax tree to determine if the program violates certain consistency requirements, e.g., if
a variable is used but not declared or if it is used in a context that does not make sense given the type of the variable, such as
trying to use a boolean value as a function pointer.
Intermediate code generation The program is translated to a simple machine-independent intermediate
language.
Register allocation The symbolic variable names used in the intermediate code are translated to numbers, each of which
corresponds to a register in the target machine code.
Lexical Analysis
OVER VIEW OF LEXICAL ANALYSIS
The word “lexical” in the traditional sense means “pertaining to words”. In terms of programming
languages, words are objects like variable names, numbers, keywords etc. Lexical analysis is the first
phase of a compiler. It takes the modified source code from language preprocessors, written in
the form of sentences. The lexical analyzer breaks this source code into a series of tokens, removing
any whitespace or comments in the source code.
If the lexical analyzer finds a token invalid, it generates an error. The lexical analyzer works closely
with the syntax analyzer. It reads character streams from the source code, checks for legal tokens, and
passes the data to the syntax analyzer when demanded.
Tokens
A lexeme is a sequence of characters (alphanumeric) in the source program that is matched by the
pattern for a token. There are some
predefined rules for every lexeme to be identified as a valid token. These rules are defined by
grammar rules, by means of a pattern. A pattern explains what can be a token, and these patterns
are defined by means of regular expressions.
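To make the token / lexeme / pattern distinction concrete, here is a small hand-written scanner sketch for a made-up language with identifiers, integer numbers and single-character operators (the token set and the sample input are invented for illustration):

/* A tiny lexical analyzer: it skips whitespace and groups characters into
 * IDENTIFIER, NUMBER and OPERATOR tokens according to simple patterns. */
#include <ctype.h>
#include <stdio.h>

int main(void)
{
    const char *src = "count = count + 42;";        /* sample source text */
    const char *p = src;

    while (*p) {
        if (isspace((unsigned char)*p)) { p++; continue; }
        const char *start = p;
        if (isalpha((unsigned char)*p)) {           /* pattern: letter (letter|digit)* */
            while (isalnum((unsigned char)*p)) p++;
            printf("IDENTIFIER  %.*s\n", (int)(p - start), start);
        } else if (isdigit((unsigned char)*p)) {    /* pattern: digit+ */
            while (isdigit((unsigned char)*p)) p++;
            printf("NUMBER      %.*s\n", (int)(p - start), start);
        } else {                                    /* anything else: one-character operator */
            p++;
            printf("OPERATOR    %.*s\n", (int)(p - start), start);
        }
    }
    return 0;
}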
Syntax Analysis
Introduction
Syntax analysis or parsing is the second phase of a compiler. In this chapter, we shall learn the basic
concepts used in the construction of a parser.
We have seen that a lexical analyzer can identify tokens with the help of regular expressions and pattern
rules. But a lexical analyzer cannot check the syntax of a given sentence due to the limitations of the regular
expressions. Regular expressions cannot check balancing tokens, such as parentheses. Therefore, this phase
uses context-free grammar (CFG), which is recognized by push-down automata.
Syntax Analyzers
A syntax analyzer or parser takes the input from a lexical analyzer in the form of token streams.
The parser analyzes the source code (token stream) against the production rules to detect any errors
in the code. The output of this phase is a parse tree.
This way, the parser accomplishes two tasks, i.e., parsing the code, looking for errors and
generating a parse tree as the output of the phase.
Parsers are expected to parse the whole code even if some errors exist in the program. Parsers use
error recovering strategies, which we will learn later in this chapter.
Parse Tree
A parse tree is a graphical depiction of a derivation. It is convenient to see how strings are derived
from the start symbol. The start symbol of the derivation becomes the root of the parse tree. Let us
see this by an example from the last topic.
Types of Parsing
Syntax analyzers follow production rules defined by means of context-free grammar. The way the
production rules are implemented (derivation) divides parsing into two types: top-down parsing
and bottom-up parsing.
Top-down Parsing
When the parser starts constructing the parse tree from the start symbol and then tries to transform
the start symbol to the input, it is called top-down parsing.
Recursive descent parsing : It is a common form of top-down parsing. It is called
recursive as it uses recursive procedures to process the input. Recursive descent parsing
suffers from backtracking.
Backtracking : It means, if one derivation of a production fails, the syntax analyzer
restarts the process using different rules of same production. This technique may process
the input string more than once to determine the right production.
Recursive Descent Parsing
Recursive descent is a top-down parsing technique that constructs the parse tree from the top and
the input is read from left to right. It uses procedures for every terminal and non-terminal entity.
This parsing technique recursively parses the input to make a parse tree, which may or may not
require back-tracking. But the grammar associated with it (if not left factored) cannot avoid back-
tracking. A form of recursive-descent parsing that does not require any back-tracking is known
as predictive parsing.
This parsing technique is regarded as recursive as it uses context-free grammar, which is recursive in
nature.
Back-tracking
Top-down parsers start from the root node (start symbol) and match the input string against the
production rules to replace them (if matched). To understand this, take the following example of
CFG:
S → rXd | rZd
X → oa | ea
Z → ai
For the input string 'read', a top-down parser will behave like this:
It will start with S from the production rules and will match its yield to the left-most letter of the
input, i.e. 'r'. The very first production of S (S → rXd) matches with it. So the top-down parser
advances to the next input letter (i.e. 'e'). The parser tries to expand the non-terminal 'X' and checks
its production from the left (X → oa). It does not match with the next input symbol. So the top-
down parser backtracks to obtain the next production rule of X, (X → ea).
Now the parser matches all the input letters in an ordered manner. The string is accepted.
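A small sketch of a backtracking recursive-descent parser for exactly this grammar (S → rXd | rZd, X → oa | ea, Z → ai); the coding style is just one of several possibilities:

/* Backtracking recursive-descent parser for
 *   S -> r X d | r Z d,   X -> o a | e a,   Z -> a i
 * Each nonterminal is a function returning 1 on success; on failure it
 * restores the input position so that another alternative can be tried. */
#include <stdio.h>
#include <string.h>

static const char *input;
static int pos;

static int term(char c) { if (input[pos] == c) { pos++; return 1; } return 0; }

static int X(void)
{
    int save = pos;
    if (term('o') && term('a')) return 1;
    pos = save;                                    /* backtrack: try X -> e a */
    if (term('e') && term('a')) return 1;
    pos = save;
    return 0;
}

static int Z(void)
{
    int save = pos;
    if (term('a') && term('i')) return 1;
    pos = save;
    return 0;
}

static int S(void)
{
    int save = pos;
    if (term('r') && X() && term('d')) return 1;
    pos = save;                                    /* backtrack: try S -> r Z d */
    if (term('r') && Z() && term('d')) return 1;
    pos = save;
    return 0;
}

int main(void)
{
    input = "read";
    pos = 0;
    printf("%s\n", (S() && pos == (int)strlen(input)) ? "accepted" : "rejected");
    return 0;
}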
Predictive Parser
Predictive parser is a recursive descent parser, which has the capability to predict which
production is to be used to replace the input string. The predictive parser does not suffer from
backtracking.
To accomplish its tasks, the predictive parser uses a look-ahead pointer, which points to the next
input symbols. To make the parser back-tracking free, the predictive parser puts some constraints
on the grammar and accepts only a class of grammar known as LL(k) grammar.
Predictive parsing uses a stack and a parsing table to parse the input and generate a parse tree.
Both the stack and the input contain an end symbol $ to denote that the stack is empty and the
input is consumed. The parser refers to the parsing table to take any decision on the input and
stack element combination.
In recursive descent parsing, the parser may have more than one production to choose from for a
single instance of input, whereas in predictive parser, each step has at most one production to
choose. There might be instances where there is no production matching the input string, causing
the parsing procedure to fail.
LL Parser
An LL Parser accepts LL grammar. LL grammar is a subset of context-free grammar but with
some restrictions to get the simplified version, in order to achieve easy implementation. LL
grammar can be implemented by means of both algorithms namely, recursive-descent or table-
driven.
An LL parser is denoted LL(k). The first L in LL(k) stands for parsing the input from left to right, the
second L in LL(k) stands for left-most derivation and k itself represents the number of look-aheads.
Generally k = 1, so LL(k) may also be written as LL(1).
Bottom-up Parsing
As the name suggests, bottom-up parsing starts with the input symbols and tries to construct the
parse tree up to the start symbol.
Bottom-up parsing starts from the leaf nodes of a tree and works in upward direction till it reaches the
root node. Here, we start from a sentence and then apply production rules in reverse manner in order to
reach the start symbol
Shift-Reduce Parsing
Shift-reduce parsing uses two unique steps for bottom-up parsing. These steps are known as shift-
step and reduce-step.
Shift step: The shift step refers to the advancement of the input pointer to the next input
symbol, which is called the shifted symbol. This symbol is pushed onto the stack. The
shifted symbol is treated as a single node of the parse tree.
Reduce step: When the parser finds a complete grammar rule (RHS) and replaces it with the
corresponding (LHS), it is known as the reduce-step. This occurs when the top of the stack contains a handle.
To reduce, a POP function is performed on the stack which pops off the handle and
replaces it with LHS non-terminal symbol.
LR Parser
The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide class of context-
free grammar which makes it the most efficient syntax analysis technique. LR parsers are also
known as LR(k) parsers, where L stands for left-to-right scanning of the input stream; R stands for
the construction of right-most derivation in reverse, and k denotes the number of lookahead
symbols to make decisions.
There are three widely used algorithms available for constructing an LR parser:
SLR(1) – Simple LR Parser:
o Works on the smallest class of grammars
o Has the fewest states, hence a very small table
o Simple and fast construction
LR(1) – LR Parser:
o Works on complete set of LR(1) Grammar
o Generates large table and large number of states
o Slow construction
LALR(1) – Look-Ahead LR Parser:
o Works on intermediate size of grammar
o Number of states are same as in SLR(1)
LL vs. LR
LL: Starts with the root nonterminal on the stack.
LR: Ends with the root nonterminal on the stack.
LL: Uses the stack for designating what is still to be expected.
LR: Uses the stack for designating what is already seen.
LL: Builds the parse tree top-down.
LR: Builds the parse tree bottom-up.
LL: Continuously pops a nonterminal off the stack, and pushes the corresponding right hand side.
LR: Tries to recognize a right hand side on the stack, pops it, and pushes the corresponding nonterminal.
LL: Reads the terminals when it pops one off the stack.
LR: Reads the terminals while it pushes them on the stack.
LL: Performs a pre-order traversal of the parse tree.
LR: Performs a post-order traversal of the parse tree.
All operating systems that support program loading have loaders, apart from highly specialized
computer systems that only have a fixed set of specialized programs. Embedded systems typically
do not have loaders, and instead the code executes directly from ROM. In order to load the operating
system itself, as part of booting, a specialized boot loader is used. In many operating systems the
loader is permanently resident in memory, although some operating systems that support virtual
memory may allow the loader to be located in a region of memory that is pageable.
• Part of the OS that brings an executable file residing on disk into memory and starts it running
• Steps
– Read executable file’s header to determine the size of text and data segments
– Create a new address space for the program
– Copies instructions and data into address space
– Copies arguments passed to the program on the stack
– Initializes the machine registers including the stack ptr
– Jumps to a startup routine that copies the program’s arguments from the stack to registers
and calls the program’s main routine
Macro Definition
Two new assembler directives are used in macro definition:
MACRO: identify the beginning of a macro definition
MEND: identify the end of a macro definition
• The general form of a macro definition is:
  label     op        operands
  name      MACRO     parameters
            :
            body
            :
            MEND
• Parameters: the entries in the operand field identify the parameters of the macro
instruction. We require that each parameter begin with '&'.
• Body: the statements that will be generated as the expansion of the macro.
• Prototype for the macro:
The macro name and parameters define a pattern or prototype for the macro instructions used
by the programmer
Macro Expansion
Each macro invocation statement will be expanded into the statements that form the body of the
macro.
• Arguments from the macro invocation are substituted for the parameters in the macro
prototype.
The arguments and parameters are associated with one another according to their positions.
• The first argument in the macro invocation corresponds to the first parameter in the macro
prototype, etc.
Macro call leads to macro expansion. During macro expansion, the macro call statement is
replaced by a sequence of assembly statements. Two key notions concerning macro expansion
are:
a. Expansion time control flow- this determines the order in which model statements are
visited during macro expansion.
b. Lexical substitution: Lexical substitution is used to generate an assembly statement
from a model statement.
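The same two notions can be illustrated, by analogy, with the C preprocessor rather than an assembler macro facility (the macro, its parameters and the arguments below are invented): the invocation is replaced by the body, with arguments lexically substituted for parameters by position.

/* Macro expansion via lexical substitution, shown with the C preprocessor. */
#include <stdio.h>

/* Definition: SWAP is the prototype; T, a and b are the parameters. */
#define SWAP(T, a, b) do { T tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

int main(void)
{
    int x = 1, y = 2;
    /* Invocation: expands, by positional lexical substitution, to
     *   do { int tmp_ = (x); (x) = (y); (y) = tmp_; } while (0);        */
    SWAP(int, x, y);
    printf("x=%d y=%d\n", x, y);
    return 0;
}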
CHAPTER 5: DEADLOCKS IN OPERATING SYSTEMS
What is a Deadlock?
A deadlock is a situation in which a set of blocked processes each hold a resource and wait to acquire
a resource held by another process in the set.
Handling Deadlock
Deadlock prevention focuses on making sure that a deadlock never arises. But what can be done once
a deadlock has occurred? The following three strategies can be used to remove a deadlock after its
occurrence.
• Preemption: We can take a resource from one process and give it to another. This will resolve the
deadlock situation, but sometimes it causes problems.
• Rollback: In situations where deadlock is a real possibility, the system can periodically make a
record of the state of each process and when deadlock occurs, roll everything back to the last
checkpoint, and restart, but allocating resources differently so that deadlock does not occur.
• Kill one or more processes: This is the simplest way, but it works.
If the necessary conditions for a deadlock are in place, it is still possible to avoid deadlock by being
careful when resources are allocated. Perhaps the most famous deadlock avoidance algorithm, due
to Dijkstra [1965], is the Banker's algorithm, so named because the process is analogous to that
used by a banker in deciding if a loan can be safely made.
[Figure 1: four customers, A, B, C and D, each granted a line of credit by a banker who has
reserved 10 units to service them.]
In the above figure, we see four customers, each of whom has been granted a number of credit units.
The banker reserved only 10 units rather than 22 units to service them. At a certain moment, the
situation becomes:
Customer   Used   Max
A          1      6
B          1      5
C          2      4
D          4      7
Available units = 2
(Fig. 2)
Safe State: The key to a state being safe is that there is at least one way for all users to finish. In
our analogy, the state of figure 2 is safe because with the 2 units left the banker can delay any
request except C's, thus letting C finish and release all four of its units. With four units in hand, the
banker can let either D or B have the necessary units, and so on.
Unsafe State: Consider what would happen if a request from B for one more unit were granted in
figure 2 above.
[Figure 3: the state after granting B one more unit; only one unit remains available.]
If all the customers namely A, B, C, and D asked for their maximum loans, then banker could not
satisfy any of them and we would have a deadlock.
Important Note: It is important to note that an unsafe state does not imply the existence, or even
the eventual existence, of a deadlock. What an unsafe state does imply is simply that some unfortunate
sequence of events might lead to a deadlock.
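A compact sketch of the safety check at the heart of the Banker's algorithm, using the single-resource state of figure 2 (maximum needs 6, 5, 4, 7; current loans 1, 1, 2, 4; 2 units available): a state is safe if the customers can all finish in some order, so we repeatedly pick anyone whose remaining need fits in the available units and reclaim their loan.

/* Banker's algorithm safety check for one resource type (state of fig. 2). */
#include <stdio.h>

#define N 4

int main(void)
{
    int max[N]  = { 6, 5, 4, 7 };        /* maximum credit of A, B, C, D */
    int used[N] = { 1, 1, 2, 4 };        /* units currently on loan      */
    int available = 2;
    int finished[N] = { 0 };
    int done = 0, progress = 1;

    while (progress) {
        progress = 0;
        for (int i = 0; i < N; i++)
            if (!finished[i] && max[i] - used[i] <= available) {
                available += used[i];    /* customer i can finish and repay the loan */
                finished[i] = 1;
                done++;
                progress = 1;
            }
    }
    printf("state is %s (%d of %d customers can finish)\n",
           done == N ? "SAFE" : "UNSAFE", done, N);
    return 0;
}

Re-running the check with used = {1, 2, 2, 4} and available = 1 (the state of figure 3) reports UNSAFE, matching the discussion above.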