CS1-OperatingSystemsLectureNotesComplete
Basically, an OS is a:
Resource manager, with the objective of efficient resource use and user
convenience
Control program that controls the execution of user programs to
prevent errors and improper use of the computer
Several jobs are kept in main memory at the same time, and the
CPU is multiplexed among them.
Hard real-time:
Secondary storage limited or absent; data stored in short-term
memory or read-only memory (ROM)
Conflicts with time-sharing systems; virtual memory is not supported.
Soft real-time:
Limited utility in industrial control and robotics
Useful in applications (multimedia, virtual reality, undersea
exploration, planetary rovers) requiring advanced operating-system
features.
Ex. RT Linux
Bootstrap: (1) Initializes hardware - CPU registers, device controllers, memory controllers, etc.
(2) Loads the OS kernel into primary memory and then transfers control to the OS
Magnetic tape:
Slow
Good for sequential access and not good for random access
Generally used as backup media
Dual-Mode Operation
I/O Protection
Memory Protection
CPU Protection
[Figure: dual-mode operation - an interrupt or fault switches the machine to monitor mode; "set user mode" switches back before user code runs]
System Components
Operating System Services
System Calls
System Programs
System Structure
Virtual Machines
System Design and Implementation
System Generation
Process Management
Main Memory Management
File Management
I/O System Management
Secondary Storage Management
Networking
Protection System
Command-Interpreter System
command-line interpreter
shell (in UNIX)
Additional functions exist not for helping the user, but rather
for ensuring efficient system operations.
• Resource allocation – allocating resources to multiple users
or multiple jobs running at the same time.
• Accounting – keep track of and record which users use how
much and what kinds of computer resources for account
billing or for accumulating usage statistics.
• Protection – ensuring that all access to system resources is
controlled.
Note: UNIX process-related system calls: fork, exec, exit (exit returns
an error code: 0 means no error, any positive number is an error number)
Process Concept
Process Scheduling
Operations on Processes
Cooperating Processes
Interprocess Communication
Note : The process could be removed forcibly from the CPU, as a result of an
interrupt and put back in the ready queue.
[Queueing diagram: processes migrate among the various QUEUES; each queue is served by the resources associated with it]
OS Notes by Dr. Naveen Choudhary
Schedulers
[Figure: jobs wait in the job queue; ready processes wait in the ready queue for the CPU]
The long-term scheduler (L.T.S.) should select a good mix of I/O-bound and CPU-bound
processes so as to make efficient and optimal use of all the I/O devices and the CPU.
The key idea behind the medium-term scheduler is that sometimes it can be
advantageous to remove a process from memory (and from active contention for
the CPU) and thus to decrease the degree of multiprogramming. At some later time
the process can be reintroduced into memory and its execution can be continued
where it left off. This scheme is called swapping. Swapping (the medium-term
scheduler) may be necessary to improve the process mix, or because a change in
memory requirements has overcommitted available memory, requiring memory to
be freed.
/* Producer (shared circular buffer: buffer[], in, out) */
item nextProduced;
while (1) {
while (((in + 1) % BUFFER_SIZE) == out)
; /* buffer full: do nothing */
buffer[in] = nextProduced;
in = (in + 1) % BUFFER_SIZE;
}
/* Consumer */
item nextConsumed;
while (1) {
while (in == out)
; /* buffer empty: do nothing */
nextConsumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
}
Operations
create a new mailbox
send and receive messages through mailbox
destroy a mailbox
Primitives are defined as:
send(A, message) – send a message to mailbox A
receive(A, message) – receive a message from mailbox A
Disadvantages:
If the kernel is single-threaded, then any user-level thread
executing a system call will cause the entire task to wait until
the system call returns, because the kernel schedules only processes,
and a process waiting for I/O (a system call) is put in the wait queue
and cannot be allotted the CPU.
If process A has one thread (t1a) and process B has 100 threads (t1b … t100b),
and the kernel schedules threads rather than processes, then process B could
receive 100 times the CPU time that process A receives.
Now if a thread makes an I/O system call, the whole process need not be
blocked (only that thread is blocked), and thus another thread in the same
process can run during this time.
Examples
- Windows 95/98/NT/2000
- Solaris
- Tru64 UNIX
- BeOS
- Linux
Multithreading Models
Many-to-One
One-to-One
Many-to-Many
Example :- Unix
Examples
- Windows 95/98/NT/2000
- OS/2
Basic Concepts
Scheduling Criteria
Scheduling Algorithms
Multiple-Processor Scheduling
Real-Time Scheduling
Algorithm Evaluation
Note: the figure shows a large number of short CPU bursts and a small
number of long CPU bursts. An I/O-bound program typically has many
very short CPU bursts; a CPU-bound program may have a few very long
CPU bursts. This distribution can be important in the selection of an
appropriate CPU-scheduling algorithm.
CPU Scheduler
Gantt chart (arrival order P1, P2, P3; burst times 24, 3, 3): P1 runs 0–24, P2 runs 24–27, P3 runs 27–30
Gantt chart (arrival order P2, P3, P1): P2 runs 0–3, P3 runs 3–6, P1 runs 6–30
Waiting time for P1 = 6; P2 = 0; P3 = 3
Average waiting time: (6 + 0 + 3)/3 = 3
Much better than the previous case.
Convoy effect: all the other processes wait for one big process
to get off the CPU, resulting in lower CPU and device utilization.
Letting the big process go first does not decrease its own waiting
time, but it increases the waiting time of all the other processes.
Solution: schedule short processes first and the long process later to
avoid the convoy effect.
Non-preemptive SJF Gantt chart: P1 0–7, P3 7–8, P2 8–12, P4 12–16
Preemptive SJF (SRTF) Gantt chart: P1 0–2, P2 2–4, P3 4–5, P2 5–7, P4 7–11, P1 11–16
τ(n+1) = α tn + (1 − α) τn
α = 0:
τ(n+1) = τn
Recent history has no effect; the last prediction has all the weight.
α = 1:
τ(n+1) = tn
Only the actual last CPU burst counts.
If we expand the formula, we get:
τ(n+1) = α tn + (1 − α) α tn−1 + …
+ (1 − α)^j α tn−j + …
+ (1 − α)^(n+1) τ0
Round-Robin Gantt chart order: P1, P2, P3, P4, P1, P3, P4, P1, P3, P3
Three queues:
Q0 – time quantum 8 milliseconds
Q1 – time quantum 16 milliseconds
Q2 – FCFS
Scheduling
A new job enters queue Q0, which is served in FCFS order.
When it gains the CPU, the job receives 8 milliseconds. If it does
not finish in 8 milliseconds, the job is moved to queue Q1.
At Q1 the job is again served in FCFS order and receives
16 additional milliseconds. If it still does not complete, it is
preempted and moved to queue Q2.
Background
The Critical-Section Problem
Synchronization Hardware
Semaphores
Classical Problems of Synchronization
Critical Regions
Monitors
Synchronization in Solaris 2 & Windows 2000
#define BUFFER_SIZE 10
typedef struct {
...
} item;
item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;
int counter = 0;
Producer process
item nextProduced;
while (1) {
while (counter == BUFFER_SIZE)
; /* do nothing */
buffer[in] = nextProduced;
in = (in + 1) % BUFFER_SIZE;
counter++;
}
Consumer process
item nextConsumed;
while (1) {
while (counter == 0)
; /* do nothing */
nextConsumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
counter--;
}
The statements
counter++;
counter--;
must be performed atomically. In machine language they may be implemented as:
counter++:
register1 = counter
register1 = register1 + 1
counter = register1
counter--:
register2 = counter
register2 = register2 - 1
counter = register2
Note: in a uniprocessor system, solving the critical-section
problem is easy: just disallow interrupts while a shared variable
is being modified. That is not so in a multiprocessing environment.
Shared data:
boolean lock = false;
Process Pi
do {
while (TestAndSet(lock))
; /* do nothing */
critical section
lock = false;
remainder section
} while (1);
Process Pi
do {
key = true;
while (key == true)
Swap(lock, key);
critical section
lock = false;
remainder section
} while (1);
Shared data:
semaphore mutex; //initially mutex = 1
Process Pi:
do {
wait(mutex);
critical section
signal(mutex);
remainder section
} while (1);
Data structures:
binary-semaphore S1, S2;
int C;
Initialization:
S1 = 1
S2 = 0
C = initial value of semaphore S
signal operation
wait(S1);
C ++;
if (C <= 0)
{ signal(S2); }
signal(S1);
Bounded-Buffer Problem
Readers and Writers Problem
Dining-Philosophers Problem
Initially: full = 0, empty = n, mutex = 1
do {
…
produce an item in nextp
…
wait(empty);
wait(mutex);
…
add nextp to buffer
…
signal(mutex);
signal(full);
} while (1);
do {
wait(full);
wait(mutex);
…
remove an item from buffer to nextc
…
signal(mutex);
signal(empty);
…
consume the item in nextc
…
} while (1);
First readers-writers problem :: no reader will be kept waiting unless a writer has
already obtained permission to use the shared object. In other words, no reader
should wait for other readers to finish simply because a writer is waiting (problem:
the writer can starve).
Second readers-writers problem :: once a writer is ready, that writer performs its write
as soon as possible. In other words, if a writer is waiting to access the object, no
new reader may start reading (problem: readers can starve).
The algorithm below is for the first readers-writers problem.
Shared data: semaphore mutex, wrt; int readcount;
Initially: mutex = 1, wrt = 1, readcount = 0
Writer process:
wait(wrt);
…
writing is performed
…
signal(wrt);
Reader process:
wait(mutex);
readcount++;
if (readcount == 1)
wait(wrt);
signal(mutex);
…
reading is performed
…
wait(mutex);
readcount--;
if (readcount == 0)
signal(wrt);
signal(mutex);
mutex :: ensures mutual exclusion when the variable readcount is updated
wrt :: used by writers, and also by the first and last reader that enters or exits the
C.S. (it is not used by readers who enter or exit while other readers are in
their C.S.)
Note that if a writer is in the C.S. and n readers are waiting, then one reader is
queued on wrt and n−1 readers are queued on mutex. Also observe that,
when a writer executes signal(wrt), we may resume the execution of either
the waiting readers or a single waiting writer. The selection is made by the
scheduler.
monitor monitor-name
{
shared variable declarations
procedure body P1 (…) {
...
}
procedure body P2 (…) {
...
}
procedure body Pn (…) {
...
}
{
initialization code
}
}
Monitors …..contd
System Model
Deadlock Characterization
Methods for Handling Deadlocks
Deadlock Prevention
Deadlock Avoidance
Deadlock Detection
Recovery from Deadlock
Combined Approach to Deadlock Handling
P0: wait(A); (1) then wait(B); (3)
P1: wait(B); (2) then wait(A); (4)
With this interleaving P0 holds A and waits for B, while P1 holds B and waits for A: deadlock.
Pi requests an instance of Rj: edge Pi → Rj
Pi is holding an instance of Rj: edge Rj → Pi
No Preemption –
If a process (say p1) that is holding some resources
requests another resource that cannot be immediately
allocated to it, then all resources currently being held by p1
are released.
Preempted resources are added to the list of resources for
which the p1 process is waiting.
Process (p1) will be restarted only when it can regain its old
resources, as well as the new ones that it is requesting.
Note: this protocol is often applied to resources whose
state can easily be saved and later restored, such as
CPU registers and memory space. It cannot generally be
applied to resources such as printers and tape drives.
Sequence <P1, P2, …, Pn> is safe if, for each Pi, the resources
that Pi can still request can be satisfied by the currently available
resources plus the resources held by all the Pj, with j < i.
If Pi resource needs are not immediately available, then Pi can wait
until all Pj have finished.
When Pj is finished, Pi can obtain needed resources, execute,
return allocated resources, and terminate.
When Pi terminates, Pi+1 can obtain its needed resources, and so
on.
Multiple instances.
Detection algorithm
Recovery scheme
Process Termination ::
Background
Swapping
Contiguous Allocation
Paging
Segmentation
Segmentation with Paging
Questions: a CPU with, say, 16 address lines can access 2^16 memory locations, i.e. 64 Kbytes
-q1: what can be the maximum size of a single program?
-q2: can we have 8 KB of memory (i.e. memory less than 64 KB)?
Dynamic Loading
7. A job being swapped out should not have any pending I/O; otherwise the pending I/O will
ultimately write into the wrong (by then swapped-in) process area.
Solutions: never swap a process with pending I/O, or
execute I/O operations only into operating-system buffers.
Schematic View of Swapping
Single-partition allocation
Relocation-register & limit register are used to protect user
processes from each other, and from changing operating-system
code and data.
Relocation register contains value of smallest physical address;
limit register contains range of logical addresses – each logical
address must be less than the limit register.
When the CPU scheduler selects a process for execution, the
dispatcher loads the relocation & limit registers with the correct
values as part of the context switch.
Hardware Support for Relocation and Limit Registers
[Figures: memory snapshots as jobs enter and leave (e.g. process 8 swapped out, process 10 swapped in), and a compaction example in which the OS (first 400 K), P1, P2 and P3 are moved so that the scattered holes merge into one free block. Selecting an optimal compaction strategy is quite difficult (optimal = minimum movement of program & data).]
Fragmentation …..contd
Page offset (d) – combined with base address to define the physical
memory address that is sent to the memory unit.
Internal fragmentation in paging
No external fragmentation, but there will be internal fragmentation if the
program size is more than n pages but less than n+1 pages (if the
process size is n pages and 1 byte, then almost 1 whole page is internal
fragmentation).
On average, one half page per process is internal fragmentation, so a
small page size is better.
But with big pages there are fewer entries in the page table, so less
overhead for the OS. Large pages are also good for disk I/O, as one large
transfer is more efficient than doing many small I/Os (I/O is efficient when
the data being transferred is large).
The trend is towards large page sizes (2 or 4 kilobytes).
Physical address = frame no. × no. of bytes per page + offset (page size 4 bytes):
frame 5, offset 0 → 5 × 4 + 0 = 20
frame 5, offset 2 → 5 × 4 + 2 = 22
frame 6, offset 2 → 6 × 4 + 2 = 26
Hierarchical Paging
Logical address: p1 (10 bits) | p2 (10 bits) | d (12 bits)
where p1 is an index into the outer page table, and p2 is the displacement
within the page of the outer page table.
For 3-level paging we need p1, p2 & p3.
As the number of levels increases, the memory access time increases, but
caching (associative memory / TLB) with a high hit ratio can help greatly.
Shared code
One copy of read-only (reentrant) code shared among
processes (i.e., text editors, compilers, window systems).
Shared code must appear in same location in the logical
address space of all processes.
Relocation.
dynamic
by segment table
Sharing.
shared segments
same segment number
Allocation.
first fit/best fit
external fragmentation
•There can be jump instructions within the shared segment. What will this address be
(segment no., offset)? That is, how will the segment refer to itself? Because there is only one
physical copy of sqrt, it must refer to itself in the same way for all users, i.e. it must have
a unique segment no.
•So if a large no. of processes are sharing a segment, then having a unique segment no.
for the shared segment in all the processes will be a problem.
Segmentation with Paging – MULTICS
To extend the scheme further (let s = 18 bits; then we can have 2^18 = 262,144
segments, requiring an excessively large segment table), we can page the
segment table itself. So now our logical address will look like:
Segment no. | offset
s1 (8 bits) | s2 (10 bits) | d1 (6 bits) | d2 (10 bits)
Chapter 10: Virtual Memory
Background
Demand Paging
Process Creation
Page Replacement
Allocation of Frames
Thrashing
Operating System Examples
During address translation, if the valid–invalid bit in the page
table entry is 0, a page fault occurs.
Page Table When Some Pages Are Not in Main Memory
Page Fault
50% of the time the page that is being replaced has been modified and
therefore needs to be swapped out.
Swap page time (page-fault service time) = 10 msec = 10,000 microsec; memory access time = 1 microsec
EAT = (1 − p) × 1 + p × 10,000
= 1 − p + 10,000p
= 1 + 9,999p microsec
To improve page-fault service time, use swap space: access to swap
space is generally faster than access through the file system, because
swap space is allocated in much larger blocks, and file lookups and
indirect allocation methods are not used.
FIFO example (reference string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5):
3 frames → 9 page faults
4 frames → 10 page faults (Belady's anomaly: more frames can give more page faults)
FIFO on the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 (3 frames) = 15 page faults
Optimal algorithm (same reference string, 4 frames): 6 page faults
(4 initial faults + 2 page faults)
LRU (same reference string, 4 frames): 8 page faults
LRU on the reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 (3 frames) = 12 page faults
Second chance (give pages whose reference bit is set a second chance):
Goal: keep regularly referenced pages in memory. Data structures
used: a circular queue (with head and tail pointers) and a reference bit per page.
Need reference bit.
Clock replacement.
If page to be replaced (in clock order) has reference bit = 1.
then:
set reference bit 0.
leave page in memory.
replace next page (in clock order), subject to same rules.
The working set strategy prevents Thrashing while keeping the degree of
multi programming as high as possible thus optimizing the CPU utilization.
The difficulty with the working set model is keeping track of the working
set. The working set window is a moving window. At each memory
references, a new reference appears at one end & the oldest reference
drops off the other end.
Consider I/O: pages being used to copy a file from an I/O device
must be locked so they cannot be selected for eviction by the
page-replacement algorithm.
Suppose I/O is being done from an I/O device into a memory page; if this
page is replaced with a page of some other process, the information from
the I/O device will be wrongly written into the new page (the I/O is
carried out by a separate I/O processor, which is unaware of the page
change).
Solution 1: do I/O only through kernel/system buffers, so a process
wishing to do I/O first writes/reads the information in a kernel buffer,
and the I/O processor reads the data from there (but this extra copying
increases overhead).
Solution 2: associate a lock bit with every frame in memory; while doing I/O,
set the lock bit for the frame. A frame with the lock bit set cannot be replaced.
File Concept
Access Methods
Directory Structure
File Sharing
Protection
Create
Write
Read
Reposition within file – file seek
Delete
Truncate (the contents of the file are deleted, i.e. the file size
becomes 0, but the other attributes of the file are not changed)
Open(Fi) – search the directory structure on disk for entry
Fi, and move the content of the entry to memory, so that we
need not waste time reading the directory from
secondary storage every time we access the file
Close (Fi) – move the content of entry Fi in memory to
directory structure on disk.
The OS in some cases (.exe, .com, etc.) or the application program in other
cases (.java, .c, etc.) uses the file extension to recognize the file type
(and in some cases, e.g. Windows and Macintosh, to open the appropriate
application for the given file).
UNIX uses a crude magic number stored at the beginning of some files
to indicate roughly the type of file: executable, batch, postscript, etc.
{not all files have magic numbers}
Access Methods
Sequential Access
read next
write next
reset
no read after last write
(rewrite)
Direct Access
read n
write n
position to n
read next
write next
rewrite n
n = relative block number
(Offset = 0 )
(reset)
[Figure: a directory maps names to the files F1, F2, F3, F4, …, Fn]
Name
Type
Address
Current length
Maximum length
Date last accessed (for archival)
Date last updated (for dump)
Owner ID (who pays)
Protection information (discuss later)
Naming problem
(only a single root directory, so we can't have two files with the
same name)
Grouping problem
Grouping Capability
File structure
Logical storage unit
Collection of related information
File system resides on secondary storage (disks).
File system organized into layers.
File control block – storage structure consisting of
information about a file.
Contiguous allocation
Linked allocation
Indexed allocation
Use best fit or first fit but this can lead to external
fragmentation
Files cannot grow (we need to give the maximum
anticipated file size at the time of creation of the file, and
this can lead to internal fragmentation if the maximum
anticipated file size is not fully used).
[Figure: linked allocation - each block holds data plus a pointer to the next block of the file; free blocks are kept on their own list]
The FAT can reduce random-access time; generally the FAT will
be available at the start of the partition.
Indexed Allocation
index table
Data blocks
Q = displacement into the index table; R = displacement into the block
Example: logical address 600 with 512-word blocks → Q = 1 (the 2nd index
entry, as 600 > 1 × 512), R = 88
[Figure: two-level index - an outer index block of 512 words (entries 1, 2, …, 512) points to index blocks (index blk1, index blk2, …), each of which points to file data blocks; the UNIX inode uses a similar multilevel scheme]
Free-Space Management
Bit vector (n blocks, numbered 0, 1, 2, …, n−1), e.g. 1 1 1 …:
bit[i] = 1 → block[i] free
bit[i] = 0 → block[i] occupied
(a run of 1 bits marks contiguous free blocks)
Grouping
A modification of the free-list approach is to store the
addresses of n free blocks in the first free block. The first n−1
of these blocks are actually free; the last block contains the
addresses of another n free blocks, and so on.
Advantage: the addresses of a large number of free blocks can be
found quickly, unlike with the linked-list method.
Counting
Generally several contiguous blocks may be allocated or
freed simultaneously
So each entry in the free space list ( some blocks may be
reserved to keep this list ) can be made to contain the
address of the first free block and the number n of free
contiguous blocks that follow the first free block
RAM disk (virtual disk) :: a section of primary memory is set aside and
treated as a virtual disk.
The RAM-disk device driver accepts all the standard disk operations but
performs them on the memory section instead of on a disk.
The difference between a RAM disk and a disk cache is that the contents
of the RAM disk are totally user controlled, whereas those of the disk
cache are under the control of the OS. For instance, a RAM disk will stay
empty until the user creates files there.
Routine I/O through the file system uses the buffer (disk) cache
(I/O read(file), I/O write(file) first get the block from disk into the cache).
I/O Hardware
Application I/O Interface
Kernel I/O Subsystem
Transforming I/O Requests to Hardware Operations
Performance
[Figure: the device controller signals the CPU over the IRQ (interrupt request line)]
Interrupt-Driven I/O Cycle
There are a variety of I/O devices, and new devices are launched
every now and then. Each device has its own set of capabilities,
control-bit definitions and protocol for interfacing with the host
(processor), and they are all different.
The issue is how we can design an OS such that new devices
can be attached to the computer without the OS being rewritten
I/O system calls encapsulate device behaviors in generic
classes
Device-driver layer hides differences among I/O controllers from
kernel
Devices vary in many dimensions
Character-stream or block
Sequential or random-access
Sharable or dedicated
Speed of operation
read-write, read only, or write only
Disk Structure
Disk Scheduling
Disk Management
Swap-Space Management
RAID Structure
Disk Attachment
Stable-Storage Implementation
Tertiary Storage Devices
Operating System Issues
Performance Issues
Head pointer 53
Selects the request with the minimum seek time from the
current head position.
This algorithm is not fair, but it is efficient. It is also not
optimal; a generic optimal algorithm for disk scheduling does not exist.
SSTF scheduling is a form of SJF scheduling; may cause
starvation of some requests.
Illustration shows total head movement of 236 cylinders.
The disk arm starts at one end of the disk, and moves
toward the other end, servicing requests until it gets to the
other end of the disk, where the head movement is
reversed and servicing continues.
Sometimes called the elevator algorithm.
Illustration shows total head movement of 208 cylinders.
Version of C-SCAN
Arm only goes as far as the last request in each direction,
then reverses direction immediately, without first going all
the way to the end of the disk.
The data-segment swap map is more complicated, because the data segment can grow
over time. The map is of fixed size, but contains swap addresses for blocks of varying
size: given index i, the block pointed to by swap-map entry i is of size 2^i × 16K, to a
maximum of 2 megabytes. Using this scheme, the blocks of large processes can be found
quickly, and the swap map remains small. Moreover, for small processes only small
blocks are needed, minimizing fragmentation.
RAID Structure
• Raid Level 3 - This level is similar to level 2, except that it takes advantage of
the fact that each disk is still doing its own error-detection, so that when an
error occurs, there is no question about which disk in the array has the bad
data. As a result a single parity bit is all that is needed to recover the lost data
from an array of disks. Level 3 also includes striping, which improves
performance.
• Raid Level 5 - This level is similar to level 4, except the parity blocks are
distributed over all disks, thereby more evenly balancing the load on the
system. For any given block on the disk(s), one of the disks will hold the parity
information for that block and the other N-1 disks will hold the data. Note that
the same disk cannot hold both data and parity for the same block, as both
would be lost in the event of a disk crash.
• Raid Level 6 - This level extends RAID level 5 by storing multiple bits of
error-recovery codes (such as Reed-Solomon codes) for each bit position of
data, rather than a single parity bit. In the example shown below, 2 bits of ECC
are stored for every 4 bits of data, allowing data recovery in the face of up to
two simultaneous disk failures. Note that this still involves only a 50% increase in
storage needs, as opposed to 100% for simple mirroring, which could only
tolerate a single disk failure.