Unit - V - Principles of Operating Systems - Ece - Iii - Ii
Overview of Mass Storage Structure - Disk Structure - Disk Scheduling and Management-
File System Interface: File Concept - Access Methods -Directory and Disk Structure
I/O Systems: I/O Hardware- Application I/O Interface - Kernel I/O Subsystem.
--------------------------------
PART 1
TOPIC - 5.1 OVERVIEW OF MASS-STORAGE STRUCTURE
5.1.1 Magnetic Disks
o One or more platters in the form of disks covered with magnetic media. Hard
disk platters are made of rigid metal, while "floppy" disks are made of more flexible
plastic.
o Each platter has two working surfaces. Older hard disk drives would sometimes not
use the very top or bottom surface of a stack of platters, as these surfaces were
more susceptible to potential damage.
o Each working surface is divided into a number of concentric rings called tracks. The
collection of all tracks that are the same distance from the edge of the platter, ( i.e.
all tracks immediately above one another in the following diagram ) is called
a cylinder.
o Each track is further divided into sectors, traditionally containing 512 bytes of data
each, although some modern disks occasionally use larger sector sizes. ( Sectors also
include a header and a trailer, including checksum information among other things.
Larger sector sizes reduce the fraction of the disk consumed by headers and trailers,
but increase internal fragmentation and the amount of disk that must be marked
bad in the case of errors. )
o The data on a hard drive is read by read-write heads. The standard configuration (
shown below ) uses one head per surface, each on a separate arm, and controlled by
a common arm assembly which moves all heads simultaneously from one cylinder to
another. ( Other configurations, including independent read-write heads, may speed
up disk access, but involve serious technical difficulties. )
o The storage capacity of a traditional disk drive is equal to the number of heads ( i.e.
the number of working surfaces ), times the number of tracks per surface, times the
number of sectors per track, times the number of bytes per sector. A particular
physical block of data is specified by providing the head-sector-cylinder number at
which it is located.
Figure 5.1 - Moving-head disk mechanism.
• In operation the disk rotates at high speed, such as 7200 rpm ( 120 revolutions per second ).
The time needed to transfer data from the disk to the computer is the sum of
several steps:
o The positioning time, a.k.a. the seek time or random access time is the time
required to move the heads from one cylinder to another, and for the heads to settle
down after the move. This is typically the slowest step in the process and the
predominant bottleneck to overall transfer rates.
o The rotational latency is the amount of time required for the desired sector to
rotate around and come under the read-write head. This can range anywhere from
zero to one full revolution, and on the average will equal one-half revolution. This is
another physical step and is usually the second slowest step behind seek time. ( For a
disk rotating at 7200 rpm, the average rotational latency would be 1/2 revolution /
120 revolutions per second, or just over 4 milliseconds, a long time by computer
standards. )
o The transfer time, which is the time required to move the data electronically from
the disk to the computer, and is set by the drive's transfer rate. ( Some authors may
also use the term transfer rate to refer to the overall transfer rate, including seek
time and rotational latency as well as the electronic data transfer rate. )
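As a rough worked example of the three steps above, the following Python sketch adds seek time, average rotational latency, and transfer time for a single request. Only the 7200 rpm speed comes from these notes; the 9 ms seek time, 100 MB/s transfer rate, and 4 KB request size are illustrative assumptions.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    revolutions_per_second = rpm / 60.0
    return 0.5 / revolutions_per_second * 1000.0


def average_access_time_ms(seek_ms, rpm, transfer_mb_per_s, request_kb):
    """Seek time + average rotational latency + electronic transfer time."""
    transfer_ms = (request_kb / 1024.0) / transfer_mb_per_s * 1000.0
    return seek_ms + average_rotational_latency_ms(rpm) + transfer_ms


# 7200 rpm is the figure used in the text; the other values are assumed.
print(f"{average_rotational_latency_ms(7200):.2f} ms")  # 4.17 ms
print(f"{average_access_time_ms(9.0, 7200, 100.0, 4.0):.2f} ms")
```

With these assumed numbers the two mechanical steps account for nearly all of the total, which is why seek time and rotational latency are treated as the bottlenecks.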
• Disk heads "fly" over the surface on a very thin cushion of air. If they should accidentally
contact the disk, then a head crash occurs, which may or may not permanently damage the
disk or even destroy it completely. For this reason it is normal to park the disk heads when
turning a computer off, which means to move the heads off the disk or to an area of the disk
where there is no data stored.
• Floppy disks are normally removable. Hard drives can also be removable, and some are
even hot-swappable, meaning they can be removed while the computer is running, and a
new hard drive inserted in their place.
• Disk drives are connected to the computer via a cable known as the I/O bus. Some of the
common interface formats include Enhanced Integrated Drive Electronics, EIDE; Advanced
Technology Attachment, ATA; Serial ATA, SATA; Universal Serial Bus, USB; Fibre Channel, FC;
and Small Computer System Interface, SCSI.
• The host controller is at the computer end of the I/O bus, and the disk controller is built into
the disk itself. The CPU issues commands to the host controller via I/O ports. Data is
transferred between the magnetic surface and onboard cache by the disk controller, and
then the data is transferred from that cache to the host controller and the motherboard
memory at electronic speeds.
• As technologies improve and economics change, old technologies are often used in different
ways. One example of this is the increasing use of solid state disks, or SSDs.
• SSDs use memory technology as a small fast hard disk. Specific implementations may use
either flash memory or DRAM chips protected by a battery to sustain the information
through power cycles.
• Because SSDs have no moving parts they are much faster than traditional hard drives, and
certain problems such as the scheduling of disk accesses simply do not apply.
• However SSDs also have their weaknesses: They are more expensive than hard drives,
generally not as large, and may have shorter life spans.
• SSDs are especially useful as a high-speed cache of hard-disk information that must be
accessed quickly. One example is to store filesystem meta-data, e.g. directory and inode
information, that must be accessed quickly and often. Another variation is a boot disk
containing the OS and some application executables, but no vital user data. SSDs are also
used in laptops to make them smaller, faster, and lighter.
• Because SSDs are so much faster than traditional hard disks, the throughput of the bus can
become a limiting factor, causing some SSDs to be connected directly to the system PCI bus
for example.
• Magnetic tapes were once used for common secondary storage before the days of hard disk
drives, but today are used primarily for backups.
• Accessing a particular spot on a magnetic tape can be slow, but once reading or writing
commences, access speeds are comparable to disk drives.
• Capacities of tape drives can range from 20 to 200 GB, and compression can double that
capacity.
TOPIC - 5.2 DISK STRUCTURE
• The traditional head-sector-cylinder, HSC numbers are mapped to linear block addresses by
numbering the first sector on the first head on the outermost track as sector 0. Numbering
proceeds with the rest of the sectors on that same track, and then the rest of the tracks on
the same cylinder before proceeding through the rest of the cylinders to the center of the
disk. In modern practice these linear block addresses are used in place of the HSC numbers
for a variety of reasons:
1. The linear length of tracks near the outer edge of the disk is much longer than for
those tracks located near the center, and therefore it is possible to squeeze many
more sectors onto outer tracks than onto inner ones.
2. All disks have some bad sectors, and therefore disks maintain a few spare sectors
that can be used in place of the bad ones. The mapping of spare sectors to bad
sectors is managed internally by the disk controller.
3. Modern hard drives can have thousands of cylinders, and hundreds of sectors per
track on their outermost tracks. These numbers exceed the range of HSC numbers
for many ( older ) operating systems, and therefore disks can be configured for any
convenient combination of HSC values that falls within the total number of sectors
physically on the drive.
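The numbering scheme described above can be sketched as a pair of conversion functions. This is a minimal illustration that assumes an idealized disk with the same number of sectors on every track and zero-based head, sector, and cylinder numbers as in the description above; the geometry of 16 heads and 63 sectors per track is a hypothetical example. ( The BIOS-era convention numbers sectors from 1, which would add an offset of one. )

```python
def hsc_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Map zero-based head/sector/cylinder numbers to a linear block address."""
    if cylinder < 0 or not 0 <= head < heads_per_cylinder:
        raise ValueError("cylinder or head out of range")
    if not 0 <= sector < sectors_per_track:
        raise ValueError("sector out of range")
    # Sectors on one track first, then tracks of one cylinder, then cylinders.
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + sector


def lba_to_hsc(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping: linear block address back to (cylinder, head, sector)."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector


# Hypothetical geometry: 16 heads, 63 sectors per track.
print(hsc_to_lba(0, 0, 0, 16, 63))   # 0: first sector, first head, outermost track
print(hsc_to_lba(0, 1, 0, 16, 63))   # 63: next track of the same cylinder
print(hsc_to_lba(1, 0, 0, 16, 63))   # 1008: first sector of the next cylinder
```

Real drives break the uniform-geometry assumption for the reasons listed above, which is why the controller exposes only the linear addresses.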
• There is a limit to how closely individual bits can be packed on a physical medium, but
that limit keeps rising as technological advances are made.
• Modern disks pack many more sectors into outer cylinders than inner ones, using one of two
approaches:
o With Constant Linear Velocity, CLV, the density of bits is uniform from cylinder to
cylinder. Because there are more sectors in outer cylinders, the disk spins slower
when reading those cylinders, causing the rate of bits passing under the read-write
head to remain constant. This is the approach used by modern CDs and DVDs.
o With Constant Angular Velocity, CAV, the disk rotates at a constant angular speed,
with the bit density decreasing on outer cylinders. ( These disks would have a
constant number of sectors per track on all cylinders. )
• Shortest Seek Time First scheduling is more efficient, but may lead
to starvation if a constant stream of requests arrives for the same
general area of the disk.
• SSTF reduces the total head movement to 236 cylinders, down from the
640 required for the same set of requests under FCFS. Note, however,
that the distance could be reduced still further to 208 by servicing
37 and then 14 first before processing the rest of the requests.
Figure 5.5 - SSTF disk scheduling.
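The figures quoted above can be reproduced with a short simulation. The request queue is not listed in these notes, so the sketch below assumes the common textbook example: requests for cylinders 98, 183, 37, 122, 14, 124, 65, 67 with the head starting at cylinder 53. The function names are illustrative only.

```python
def head_movement(start, service_order):
    """Total cylinders travelled when requests are serviced in the given order."""
    total, position = 0, start
    for cylinder in service_order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def sstf_order(start, requests):
    """Shortest Seek Time First: always service the pending request nearest the head."""
    pending = list(requests)
    position = start
    order = []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


# Assumed textbook queue and starting head position (not given in these notes).
QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
HEAD = 53

print(head_movement(HEAD, QUEUE))                                  # FCFS: 640
print(head_movement(HEAD, sstf_order(HEAD, QUEUE)))                # SSTF: 236
print(head_movement(HEAD, [37, 14, 65, 67, 98, 122, 124, 183]))    # 208
```

The last line shows that SSTF is greedy rather than optimal: a slightly longer first move to 37 saves distance overall.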
• The SCAN algorithm, a.k.a. the elevator algorithm, moves back and
forth from one end of the disk to the other, much like an elevator
processing requests in a tall building.
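A minimal sketch of the sweep, under stated assumptions: the queue ( 98, 183, 37, 122, 14, 124, 65, 67 ), the starting head position ( 53 ), and the initial direction ( toward cylinder 0 ) are the usual textbook example values and are not given in these notes. Passing no end stop gives the LOOK variant, which reverses at the last pending request instead of travelling to the physical end of the disk.

```python
def sweep_path(start, requests, end_stop=None, toward_zero=True):
    """Service order for one elevator sweep.

    With end_stop set (0, or the highest cylinder when sweeping upward) the
    head travels to the physical end before reversing, as in SCAN. With
    end_stop=None it reverses at the last request, as in LOOK.
    """
    at_head = [c for c in requests if c == start]
    lower = sorted((c for c in requests if c < start), reverse=True)
    upper = sorted(c for c in requests if c > start)
    first, second = (lower, upper) if toward_zero else (upper, lower)
    path = at_head + first
    if end_stop is not None:
        path.append(end_stop)
    return path + second


def head_movement(start, path):
    """Total cylinders travelled along the given path."""
    total, position = 0, start
    for cylinder in path:
        total += abs(cylinder - position)
        position = cylinder
    return total


# Assumed textbook queue and starting head position.
QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]
HEAD = 53

print(head_movement(HEAD, sweep_path(HEAD, QUEUE, end_stop=0)))  # SCAN: 236
print(head_movement(HEAD, sweep_path(HEAD, QUEUE)))              # LOOK: 208
```

Because every pending request is reached within one pass in each direction, no request can be postponed indefinitely by newer arrivals elsewhere on the disk.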
• With very low loads all algorithms are equal, since there will
normally only be one request to process at a time.
• For slightly larger loads, SSTF offers better performance than FCFS,
but may lead to starvation when loads become heavy enough.
• For busier systems, SCAN and LOOK algorithms eliminate starvation
problems.
• The actual optimal algorithm may be something even more complex
than those discussed here, but the incremental improvements are
generally not worth the additional overhead.
• Some improvement to overall filesystem access times can be made by
intelligent placement of directory and/or inode information. If those
structures are placed in the middle of the disk instead of at the
beginning of the disk, then the maximum distance from those
structures to data blocks is reduced to only one-half of the disk size.
If those structures can be further distributed and furthermore have
their data blocks stored as close as possible to the corresponding
directory structures, then that reduces still further the overall time to
find the disk block numbers and then access the corresponding data
blocks.
• On modern disks the rotational latency can be almost as significant as
the seek time. However, it is not within the OS's control to account for
that, because modern disks do not reveal their internal sector mapping
schemes ( particularly when bad blocks have been remapped to spare
sectors ).
o Some disk manufacturers provide for disk scheduling
algorithms directly on their disk controllers, ( which do know
the actual geometry of the disk as well as any remapping ), so
that if a series of requests are sent from the computer to the
controller then those requests can be processed in an optimal
order.
o Unfortunately there are some considerations that the OS must
take into account that are beyond the abilities of the on-board
disk-scheduling algorithms, such as priorities of some requests
over others, or the need to process certain requests in a
particular order. For this reason OSes may elect to spoon-feed
requests to the disk controller one at a time in certain
situations.
----------------------------------------------------------------------------------------------------------
PART 2
File System Interface: File Concept - Access Methods -Directory and Disk Structure
• Disk files are accessed in units of physical blocks, typically 512 bytes
or some power-of-two multiple thereof. ( Larger physical disks use
larger block sizes, to keep the range of block numbers within the
range of a 32-bit integer. )
• Internally files are organized in logical units, which may be
as small as a single byte, or may be a larger size corresponding to
some data record or structure size.
• The number of logical units which fit into one physical block
determines its packing, and has an impact on the amount of internal
fragmentation ( wasted space ) that occurs.
• As a general rule, an average of half a physical block is wasted for
each file ( in its final, partially filled block ), and the larger the
block size the more space is lost to internal fragmentation.
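The half-a-block rule of thumb can be checked with a little arithmetic. The sketch below is only an illustration: the 1300-byte file and the two block sizes are assumed example values, and file sizes are taken to be evenly spread over one block's worth of lengths.

```python
def blocks_needed(file_size, block_size):
    """Number of whole physical blocks required to hold file_size bytes."""
    return -(-file_size // block_size)  # ceiling division on integers


def wasted_bytes(file_size, block_size):
    """Internal fragmentation: unused bytes in the file's last block."""
    return blocks_needed(file_size, block_size) * block_size - file_size


def average_waste(block_size):
    """Mean waste over file sizes 1..block_size (assumed equally likely)."""
    total = sum(wasted_bytes(size, block_size) for size in range(1, block_size + 1))
    return total / block_size


print(wasted_bytes(1300, 512))    # 236 bytes wasted in the third block
print(wasted_bytes(1300, 4096))   # 2796 bytes wasted in a single large block
print(average_waste(512))         # 255.5, roughly half a block
```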
• When the same files need to be accessed in more than one place in
the directory structure ( e.g. because they are being shared by more
than one user / process ), it can be useful to provide an acyclic-graph
structure. ( Note the directed arcs from parent to child. )
• UNIX provides two types of links for implementing the acyclic-graph
structure. ( See "man ln" for more details. )
o A hard link ( usually just called a link ) involves multiple
directory entries that all refer to the same file. Hard links are
only valid for ordinary files in the same filesystem.
o A symbolic link involves a special file, containing
information about where to find the linked file. Symbolic links
may be used to link directories and/or files in other filesystems,
as well as ordinary files in the current filesystem.
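• Windows shortcuts behave much like symbolic links, although they are
interpreted by the shell rather than by the filesystem. ( NTFS itself
also supports hard links and true symbolic links. )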
• Hard links require a reference count, or link count for each file,
keeping track of how many directory entries are currently referring to
this file. Whenever one of the references is removed the link count is
reduced, and when it reaches zero, the disk space can be reclaimed.
• For symbolic links there is some question as to what to do with the
symbolic links when the original file is moved or deleted:
o One option is to find all the symbolic links and adjust them
also.
o Another is to leave the symbolic links dangling, and discover
that they are no longer valid the next time they are used.
o What if the original file is removed, and replaced with another
file having the same name before the symbolic link is next
used?
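The link-count and dangling-link behaviour described above can be observed with a short Python sketch. It assumes a POSIX-style system where ordinary users may create both kinds of link; the file names are hypothetical and everything happens inside a temporary directory.

```python
import os
import tempfile


def demo_links():
    """Create a file with one hard link and one symbolic link, then remove the original name."""
    with tempfile.TemporaryDirectory() as directory:
        original = os.path.join(directory, "original.txt")
        hard = os.path.join(directory, "hard.txt")
        soft = os.path.join(directory, "soft.txt")

        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)      # second directory entry for the same file
        os.symlink(original, soft)   # special file that stores the path

        count_with_two_names = os.stat(original).st_nlink
        os.remove(original)          # drops one reference, the data survives
        count_after_remove = os.stat(hard).st_nlink

        with open(hard) as f:
            hard_content = f.read()

        # The symbolic link still exists but its target name is gone.
        dangling = os.path.islink(soft) and not os.path.exists(soft)
        return count_with_two_names, count_after_remove, hard_content, dangling


print(demo_links())
```

On such a system the link count drops from 2 to 1 when the original name is removed, the data remains readable through the hard link, and the symbolic link is left dangling until it is next used.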
• If cycles are allowed in the graphs, then several problems can arise:
o Search algorithms can go into infinite loops. One solution is to
not follow links in search algorithms. ( Or not to follow
symbolic links, and to only allow symbolic links to refer to
directories. )
o Sub-trees can become disconnected from the rest of the tree
and still not have their reference counts reduced to zero.
Periodic garbage collection is required to detect and resolve
this problem. ( chkdsk in DOS and fsck in UNIX search for
these problems, among others, even though cycles are not
supposed to be allowed in either system. Disconnected disk
blocks that are not marked as free are added back to the file
systems with made-up file names, and can usually be safely
deleted. )
--------------------------------------------------------------------------------------------------------------------------------------
PART 3
Directory Implementation - Allocation Methods
• A linear list is the simplest and easiest directory structure to set up,
but it does have some drawbacks.
• Finding a file ( or verifying one does not already exist upon creation )
requires a linear search.
• Deletions can be done by moving all entries, flagging an entry as
deleted, or by moving the last entry into the newly vacant position.
• Sorting the list makes searches faster, at the expense of more complex
insertions and deletions.
• A linked list makes insertions and deletions into a sorted list easier,
with overhead for the links.
• More complex data structures, such as B-trees, could also be
considered.
• There are three major methods of storing files on disks: contiguous, linked,
and indexed.
• Disk files can be stored as linked lists, with the expense of the storage
space consumed by each link. ( E.g. a block may be 508 bytes instead
of 512. )
• Linked allocation involves no external fragmentation, does not
require pre-known file sizes, and allows files to grow dynamically at
any time.
• Unfortunately linked allocation is only efficient for sequential access
files, as random access requires starting at the beginning of the list for
each new location access.
• Allocating clusters of blocks reduces the space wasted by pointers, at
the cost of internal fragmentation.
• Another big problem with linked allocation is reliability if a pointer is
lost or damaged. Doubly linked lists provide some protection, at the
cost of additional overhead and wasted space.
Figure 5.2 - Linked allocation of disk space.
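To see why random access is costly under linked allocation, the toy model below stores a next-block pointer with every block and counts how many pointers must be followed to reach a given logical block. The class, its method names, and the block numbers are hypothetical; a real filesystem keeps the pointers on disk, so each hop is a disk read.

```python
class LinkedAllocationDisk:
    """Toy disk in which every block holds a payload and a next-block pointer."""

    def __init__(self):
        self.blocks = {}  # block number -> (payload, next block number or None)

    def write_file(self, block_numbers, payloads):
        """Store payloads in the given (possibly scattered) blocks; return the start block."""
        if not block_numbers or len(block_numbers) != len(payloads):
            raise ValueError("need one payload per block and at least one block")
        for i, (number, payload) in enumerate(zip(block_numbers, payloads)):
            next_number = block_numbers[i + 1] if i + 1 < len(block_numbers) else None
            self.blocks[number] = (payload, next_number)
        return block_numbers[0]

    def read_logical_block(self, start_block, index):
        """Return (payload, pointer hops) for the index-th block of a file."""
        if index < 0:
            raise IndexError("logical block index must be non-negative")
        current, hops = start_block, 0
        for _ in range(index):
            current = self.blocks[current][1]  # follow the stored pointer
            if current is None:
                raise IndexError("logical block is past the end of the file")
            hops += 1
        return self.blocks[current][0], hops


disk = LinkedAllocationDisk()
start = disk.write_file([9, 16, 1, 10, 25], ["A", "B", "C", "D", "E"])
print(disk.read_logical_block(start, 3))  # ('D', 3): three pointers followed
```

Reaching logical block n always costs n hops from the start of the file, and a single damaged pointer makes every later block unreachable.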
5.4.4 Performance
5.1 Overview
5.2.1 Polling
5.2.2 Interrupts
• Interrupts allow devices to notify the CPU when they have data to
transfer or when an operation is complete, allowing the CPU to
perform other duties when no I/O transfers need its immediate
attention.
• The CPU has an interrupt-request line that is sensed after every
instruction.
o A device's controller raises an interrupt by asserting a signal
on the interrupt request line.
o The CPU then performs a state save, and transfers control to
the interrupt handler routine at a fixed address in memory.
( The CPU catches the interrupt and dispatches the interrupt
handler. )
o The interrupt handler determines the cause of the interrupt,
performs the necessary processing, performs a state restore,
and executes a return from interrupt instruction to resume
execution from the restored state. ( The interrupt handler clears
the interrupt by servicing the device. )
▪ ( Note that the state restored does not need to be the
same state as the one that was saved when the interrupt
went off. See below for an example involving time-
slicing. )
• Figure 5.3 illustrates the interrupt-driven I/O procedure:
Figure 5.3 - Interrupt-driven I/O cycle.
• With blocking I/O a process is moved to the wait queue when an I/O request
is made, and moved back to the ready queue when the request completes,
allowing other processes to run in the meantime.
• With non-blocking I/O the I/O request returns immediately, whether the
requested I/O operation has ( completely ) occurred or not. This allows the
process to check for available data without getting hung completely if it is
not there.
• One approach for programmers to implement non-blocking I/O is to have a
multi-threaded application, in which one thread makes blocking I/O calls (
say to read a keyboard or mouse ), while other threads continue to update
the screen or perform other tasks.
• A subtle variation of the non-blocking I/O is the asynchronous I/O, in
which the I/O request returns immediately allowing the process to continue
on with other tasks, and then the process is notified ( via changing a process
variable, or a software interrupt, or a callback function ) when the I/O
operation has completed and the data is available for use. ( The regular non-
blocking I/O returns immediately with whatever results are available, but
does not complete the operation and notify the process later. )
Figure 5.8 - Two I/O methods: (a) synchronous and (b) asynchronous.
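The difference between a call that would block and one that returns immediately can be sketched with a pipe. This is a minimal illustration that assumes a POSIX-style system, where Python's os.set_blocking can switch a pipe descriptor to non-blocking mode; the pipe stands in for a device such as a keyboard.

```python
import os


def nonblocking_read_demo():
    """Read from an empty pipe in non-blocking mode, then again after data arrives."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # reads now return at once instead of waiting
    try:
        try:
            os.read(read_fd, 100)
            first = "data"
        except BlockingIOError:
            # No data yet: a blocking read would have put the process to sleep here.
            first = "would block"

        os.write(write_fd, b"key")       # simulate the device producing input
        second = os.read(read_fd, 100)   # data is available, so it is returned
        return first, second
    finally:
        os.close(read_fd)
        os.close(write_fd)


print(nonblocking_read_demo())  # ('would block', b'key')
```

The process stays in control after the first read and is free to do other work, at the price of having to check again later; asynchronous I/O removes that polling by notifying the process on completion.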
5.4.2 Buffering
5.4.3 Caching
• I/O requests can fail for many reasons, either transient ( buffer
overflow ) or permanent ( disk crash ).
• I/O requests usually return an error bit ( or more ) indicating the
problem. UNIX systems also set the global variable errno to one of a
hundred or so well-defined values to indicate the specific error that
has occurred. ( See errno.h for a complete listing, or man errno. )
• Some devices, such as SCSI devices, are capable of providing much
more detailed information about errors, and even keep an on-board
error log that can be requested by the host.
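Python exposes the same errno values through OSError, which makes it easy to see what a failed request reports. The sketch below is illustrative only; the helper name and the path are hypothetical, and the path is assumed not to exist.

```python
import errno
import os


def describe_failure(path):
    """Try to open path; on failure return (errno number, symbolic name, message)."""
    try:
        with open(path, "rb"):
            return None  # the request succeeded, nothing to report
    except OSError as error:
        return error.errno, errno.errorcode[error.errno], os.strerror(error.errno)


# Hypothetical path that is assumed not to exist on the system.
result = describe_failure("/nonexistent-dir-for-errno-demo/file")
print(result[1])  # ENOENT
```

Comparing against the symbolic constant ( errno.ENOENT ) rather than the raw number keeps the check portable, since the numeric values can differ between systems.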
1. What is a file?
A file is an abstract data type defined and implemented by the operating system. It is a sequence
of logical records.
A single-level directory in a multiuser system causes naming problems, since each file must have
a unique name.
The direct access nature of disks allows flexibility in the implementation of files. The main
problem here is how to allocate space to these files so that disk space is utilized effectively and
files can be accessed quickly. Three major methods of allocating disk space are:
Contiguous
Linked
Indexed
Primary memory is the main memory (RAM) where the operating system resides while it
runs. Secondary memory consists of storage devices such as hard disks, CDs, and floppy
disks. Secondary storage cannot be directly accessed by the CPU.
Name
Type
Size
Protection
Protection mechanisms provide controlled access by limiting the types of file access that can
be made. Access is permitted or denied depending on many factors. Several different types
of operations may be controlled:
i. Read ii. Write iii. Execute iv. Append v. Delete vi. List
8. When designing the file structure for an operating system, what attributes are
considered?
The file system provides the mechanism for on-line storage and access to file contents, including
data and programs. The file system resides permanently on secondary storage, which is designed
Since disk space is limited, we should reuse the space from deleted files for new files. To keep
track of free disk space, the system maintains a free space list. The free space list records all free
disk blocks.
1) List the attributes of a file and discuss them.
2) How are operations performed on files? Explain each operation in detail.
b) List the common file extensions associated with the same group of files.
8) Compare and contrast the two-level directory and the tree-structured directory.
9) What issues are associated with file systems? Explain in detail.
10. What is the purpose of I/O system calls and device drivers? How do the devices vary?