Session 5 6 Revision

Operating Systems Design (19CS2106S)
Session-5 & 6
Revision
Session Plan
Design and Implementation of log.c
(10 minutes)
• https://github.com/mit-pdos/xv6-public/blob/master/log.c
Logging: Crash Recovery in File Systems
• Crash recovery is one of the central issues in file
systems.
• The issue arises because many file system
operations involve multiple writes to the disk.
• A crash after only a subset of those writes may
leave the on-disk file system in an inconsistent
state.
• This problem can be solved using log-based
recovery.
• A log is a sequence of records of write activities;
it maintains a history of all update activities.
How does xv6 solve the crash problem?
• Xv6 solves the problem of crashes during file system
operations with a simple form of logging.
• An xv6 system call does not directly write the on-
disk file system data structures.
• Instead, it places a description of all the disk writes it
wishes to make in a log on the disk.
• Once the system call has logged all of its writes, it
writes a special commit record to the disk indicating
that the log contains a complete operation.
• Then the system call copies the writes to the on-disk
file system data structures.
• After those writes have completed, the system call
erases the log on disk.
Log based crash recovery
• If the system should crash and reboot, the file
system code recovers from the crash as follows,
before running any processes.
• If the log is marked as containing a complete
operation, then the recovery code copies the
writes to where they belong in the on-disk file
system.
• If the log is not marked as containing a
complete operation, the recovery code ignores
the log.
• The recovery code finishes by erasing the log.
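The logging and recovery steps above can be sketched as a small simulation. This is only an illustrative model, not the code in log.c: the Disk class, the function names and the crash_after parameter are all made up for this example, and the "disk" is an in-memory dictionary.

```python
class Disk:
    """Toy disk (hypothetical): home blocks, a log area, and a commit flag."""
    def __init__(self):
        self.blocks = {}        # on-disk file system data structures
        self.log = []           # logged (block number, data) pairs
        self.committed = False  # stands in for the commit record


def install(disk):
    """Copy every logged write to where it belongs in the file system."""
    for blockno, data in disk.log:
        disk.blocks[blockno] = data


def erase_log(disk):
    disk.log = []
    disk.committed = False


def log_operation(disk, writes, crash_after=None):
    """Run one operation through the log.

    crash_after ("log" or "commit") simulates a crash right after that step.
    """
    disk.log = list(writes)      # 1. place a description of all writes in the log
    if crash_after == "log":
        return
    disk.committed = True        # 2. write the commit record
    if crash_after == "commit":
        return
    install(disk)                # 3. copy the writes to the file system
    erase_log(disk)              # 4. erase the log


def recover(disk):
    """Run at boot, before any processes run."""
    if disk.committed:           # complete operation: replay it
        install(disk)
    erase_log(disk)              # incomplete operation: ignored, then erased


# Crash before the commit record: the operation disappears entirely.
d1 = Disk()
log_operation(d1, [(10, b"inode"), (20, b"data")], crash_after="log")
recover(d1)
print(d1.blocks)

# Crash after the commit record: recovery replays both writes.
d2 = Disk()
log_operation(d2, [(10, b"inode"), (20, b"data")], crash_after="commit")
recover(d2)
print(d2.blocks)
```

In both cases the file system ends up with either all of the operation's writes or none of them, which is the point of the commit record.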
log.c-Sheet No.47 from xv6 Code Manual
Discussion of ALMs
https://www.cse.iitb.ac.in/~mythili/os/ps/xv6/ps-xv6-file.pdf
Session-5 ALM-1
• Consider two processes in xv6 that both wish to read a
particular disk block, i.e., neither process intends to
modify the data in the block. The first process obtains a
pointer to the struct buf using the function "bread", but
never causes the buffer to become dirty.
• Now, if the second process calls "bread" on the same
block before the first process calls "brelse", will this
second call to "bread" return immediately, or will it
block? Briefly describe what xv6 does in this case, and
justify the design choice.
Session-5 ALM-2
• When the buffer cache in xv6 runs out of slots
in the cache in the bget function, it looks for a
clean LRU block to evict, to make space for the
new incoming block.
• What would break in xv6 if the buffer cache
implementation also evicted dirty blocks (by
directly writing them to their original location
on disk using the bwrite function) to make
space for new blocks?
Session-5 ALM-3
Low-level File System Algorithms
• iget
• iput
• bmap
• namei
Review of inode
• inode stands for index node.
• In Unix-based operating systems, each file is indexed by
a number called the inode number.
• It is, in effect, the name of a file in the form of a number.
• The inode contains the information necessary for a
process to access a file, such as file ownership, access
rights, file size, and location of the file's data in the
file system.
• The inode contains the table of contents used to locate a
file's data on disk.
• The on-disk copy is also called the disk inode.
The inode number can be viewed using the ls -il command.
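As a rough illustration of the inode fields listed above, the same information that ls -il prints can be read from a program. The sketch below uses Python's standard os.stat call on a temporary file; the helper name describe_inode is made up for this example.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return a few inode fields for path, as reported by os.stat."""
    st = os.stat(path)
    return {
        "inode_number": st.st_ino,                  # the number shown by ls -i
        "owner_uid": st.st_uid,                     # file ownership
        "permissions": stat.filemode(st.st_mode),   # access rights, e.g. -rw-------
        "link_count": st.st_nlink,                  # number of directory entries
        "size_bytes": st.st_size,                   # file size
    }


# Create a small temporary file and inspect its inode.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"hello")
    f.flush()
    info = describe_inode(f.name)

print(info)
```

The location of the file's data blocks is also part of the inode, but it is not exposed through this interface.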
Concept of in-core inode
• When a file is opened, the kernel copies the
disk inode into a main-memory area called the
inode cache, which works much like the buffer cache.
• This copy of the disk inode in main memory is
called the in-core inode.
• As the file changes, the in-core inode is updated,
usually more often than the on-disk copy.
• The in-core inode contains up-to-date
information on the state of the file.
• This allocation is done using the iget algorithm.
Accessing of inode
• Accessing an inode is similar to accessing a
buffer block in the buffer cache.
• Here too, the kernel uses hash queues and a free
list.
• The kernel maps the device number and inode
number into a hash queue and searches the
queue for the inode.
• If it cannot find the inode, it allocates one from
the free list and locks it.
• The kernel then prepares to read the disk copy of
the newly accessed inode into the in-core copy.
Allocation of an in-core copy of the actual disk inode:
iget Algorithm
• The kernel removes the in-core inode from the
free list, places it on the correct hash queue,
and sets its in-core reference count to 1.
• It copies the file type, owner fields,
permission settings, link count, file size, and
the table of contents from the disk inode to
the in-core inode, and returns a locked inode.
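The lookup and allocation steps described above can be sketched as follows. This is a simplified, single-threaded model, not the kernel's or xv6's code: the Inode and InodeCache classes are hypothetical, a Python dictionary stands in for the hash queues, and the case where the kernel sleeps on an already locked inode is left out.

```python
class Inode:
    """Hypothetical in-core inode slot."""
    def __init__(self):
        self.dev = None
        self.inum = None
        self.ref = 0          # in-core reference count
        self.locked = False
        self.fields = None    # copy of the disk inode's contents


class InodeCache:
    def __init__(self, disk_inodes, nslots=4):
        self.disk_inodes = disk_inodes   # {(dev, inum): dict of disk inode fields}
        self.hash = {}                   # (dev, inum) -> in-core inode
        self.free_list = [Inode() for _ in range(nslots)]

    def iget(self, dev, inum):
        key = (dev, inum)
        ip = self.hash.get(key)
        if ip is not None:               # found in the cache
            if ip.ref == 0:
                self.free_list.remove(ip)    # cached but unused: take it off the free list
            ip.ref += 1
            ip.locked = True             # a real kernel would sleep here if already locked
            return ip
        if not self.free_list:
            raise RuntimeError("no free in-core inodes")
        ip = self.free_list.pop(0)       # remove a slot from the free list
        if ip.dev is not None:
            del self.hash[(ip.dev, ip.inum)]   # the slot loses its old identity
        ip.dev, ip.inum = dev, inum
        self.hash[key] = ip              # place it on the correct hash queue
        ip.ref = 1                       # set the reference count to 1
        ip.fields = dict(self.disk_inodes[key])   # copy the disk inode
        ip.locked = True                 # return a locked inode
        return ip


# Device 1 holds two made-up disk inodes.
cache = InodeCache({(1, 83): {"type": "dir", "size": 256},
                    (1, 2): {"type": "dir", "size": 128}})
first = cache.iget(1, 83)
first.locked = False                     # caller unlocks when done with it
second = cache.iget(1, 83)
print(second is first, first.ref)        # same in-core copy, two references
```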
Releasing the inodes
• When the kernel releases an inode, it decrements its
in-core reference count.
• If the count drops to 0, the kernel writes the inode to
disk if the in-core copy differs from the disk copy.
• They differ if the file data has changed, if the file
access time has changed, or if the file owner or access
permissions have changed.
• The kernel then places the inode on the free list of
inodes, effectively caching the inode in case it is needed
again soon.
• The kernel may also release all data blocks associated
with the file and free the inode if the number of links to
the file is 0.
Releasing the inodes using iput algorithm
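The release steps above can be sketched as a short function. This is an illustrative model under simplifying assumptions, not the kernel's iput: the Inode dataclass, the dirty flag (standing in for "the in-core copy differs from the disk copy") and the dictionaries and lists used for the disk and free lists are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical in-core inode."""
    inum: int
    ref: int = 1              # in-core reference count
    nlink: int = 1            # number of links to the file
    dirty: bool = False       # in-core copy differs from the disk copy
    blocks: list = field(default_factory=list)   # data block numbers


def iput(ip, disk, free_inodes, free_blocks):
    """Release one reference to ip."""
    ip.ref -= 1
    if ip.ref > 0:
        return                           # still in use elsewhere
    if ip.nlink == 0:
        free_blocks.extend(ip.blocks)    # release the file's data blocks
        ip.blocks = []
        disk.pop(ip.inum, None)          # free the disk inode
    elif ip.dirty:
        disk[ip.inum] = {"nlink": ip.nlink, "blocks": list(ip.blocks)}
        ip.dirty = False                 # disk copy is up to date again
    free_inodes.append(ip)               # keep it cached in case it is needed soon


disk = {7: {"nlink": 1, "blocks": [40]}}
free_inodes, free_blocks = [], []

ip = Inode(inum=7, ref=2, blocks=[40, 41], dirty=True)
iput(ip, disk, free_inodes, free_blocks)   # count drops to 1: nothing else happens
iput(ip, disk, free_inodes, free_blocks)   # count drops to 0: write back and cache
print(disk[7], len(free_inodes))
```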
Structure of Regular File
• In UNIX, the data in files is not stored sequentially
on disk. If it were stored sequentially, the file
size could not change flexibly without causing
large fragmentation.
• With sequential storage, the inode would only
need to store the starting address and size.
• Instead, the inode stores the disk block numbers
on which the data is present.
• But with such a strategy, if a file had data across 1000
blocks, the inode would need to store the numbers
of 1000 blocks, and the size of the inode would
differ according to the size of the file.
Structure of a Regular File in Unix
• The inode has an array of 13 entries for
storing block numbers, although the
number of entries in the array is independent of
the storage strategy.
• The first 10 members of the array are "direct
addresses", meaning that they store the block
numbers of actual data.
• The 11th member is "single indirect": it stores
the block number of a block that holds
"direct addresses".
• The 12th member is "double indirect": it
stores the block number of a block that holds
"single indirect" block numbers.
• The 13th member is "triple indirect": it
stores the block number of a block that holds
"double indirect" block numbers.
• This strategy can be extended to "quadruple"
or "quintuple" indirect addressing.
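If a logical block on the file system holds 1K
bytes and a block number is addressable by
a 32-bit integer, then a block can hold up to 256
block numbers. The maximum file size with a
13-member data array is:
• 10 direct blocks with 1K bytes each = 10K bytes
• 1 single indirect block with 256 direct blocks = 256K bytes
• 1 double indirect block with 256 single indirect blocks = 64M bytes
• 1 triple indirect block with 256 double indirect blocks = 16G bytes
• But because the file size field in the inode is 32 bits,
the size of a file is effectively limited to 4
gigabytes.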
bmap: converting a logical byte offset in a file to a
physical disk block
Example
• To access byte offset 9000: The first 10 blocks
contain 10K bytes, so byte 9000 must be in one of
the first 10 blocks. 9000 / 1024 = 8, so it is in
block 8 (counting from 0). And 9000 % 1024 =
808, so the byte offset into block 8 is 808
bytes (counting from 0). (Block 367 in the
figure.)
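The arithmetic in the example above generalizes to the indirect levels. The sketch below assumes the layout described earlier (1K blocks, 256 block numbers per block, 10 direct entries) and only computes which array entry and which indices inside the indirect blocks would be followed; it does not read any disk blocks, so it cannot produce a physical block number such as 367. The function name bmap_path is made up for this example.

```python
BLOCK_SIZE = 1024        # bytes per logical block
PTRS_PER_BLOCK = 256     # 32-bit block numbers that fit in one 1K block
NDIRECT = 10             # direct entries in the inode's array

LEVELS = ("single indirect", "double indirect", "triple indirect")


def bmap_path(offset):
    """Return (level, indices, byte_in_block) for a logical byte offset.

    indices lists the entry to follow at each step: the direct array index
    for a direct block, or one index per indirect block on the way down.
    """
    block, byte_in_block = divmod(offset, BLOCK_SIZE)
    if block < NDIRECT:
        return ("direct", [block], byte_in_block)
    block -= NDIRECT
    for depth, level in enumerate(LEVELS, start=1):
        span = PTRS_PER_BLOCK ** depth       # data blocks reachable at this level
        if block < span:
            indices = []
            for _ in range(depth):
                block, index = divmod(block, PTRS_PER_BLOCK)
                indices.append(index)
            return (level, indices[::-1], byte_in_block)
        block -= span
    raise ValueError("offset is beyond the triple indirect range")


print(bmap_path(9000))     # direct entry 8, byte 808 within the block
print(bmap_path(350000))   # falls in the double indirect range
```

For offset 350000, logical block 341 lies past the 10 direct and 256 single indirect blocks, so the lookup goes through entry 0 of the double indirect block and then entry 75 of the single indirect block it points to, landing at byte 816 of the data block.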
Directories
• Directories are the files that give the file
system its hierarchical structure.
• A directory is a file whose data is a sequence of
entries, each consisting of an inode number and
the name of a file contained in the directory.
• A path name is a null-terminated character string
divided into separate components by the slash ("/")
character.
• Each component except the last must be the name
of a directory, but the last component may be a
non-directory file.
Directory layout for /etc
• Directory entries may be empty, indicated by
an inode number of 0.
• For instance, the entry at address 224 in "/etc"
is empty, although it once contained an entry
for a file named "crash".
• Every directory contains the file names dot
and dot-dot ("." and "..") whose inode
numbers are those of the directory and its
parent directory, respectively.
• The inode number of "." in "/etc" is located at
offset 0 in the file, and its value is 83.
• The inode number of ".." is located at offset
16, and its value is 2.
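The directory layout above can be modeled with fixed-size entries. The sketch below assumes 16-byte entries made of a 2-byte inode number followed by a 14-byte name, which is consistent with ".." sitting at offset 16; the little-endian byte order, the helper names and the "passwd" entry with inode 1276 are assumptions made for this example. The sample directory is also much shorter than the one in the figure, so the empty "crash" entry lands at offset 48 rather than 224.

```python
import struct

# Assumed entry layout: 2-byte inode number + 14-byte name = 16 bytes.
ENTRY = struct.Struct("<H14s")


def pack_dir(entries):
    """Build the raw bytes of a directory file from (inode number, name) pairs."""
    return b"".join(ENTRY.pack(inum, name.encode()) for inum, name in entries)


def read_dir(data):
    """Return (offset, inode number, name) for every entry, including empty ones."""
    result = []
    for offset in range(0, len(data), ENTRY.size):
        inum, raw = ENTRY.unpack_from(data, offset)
        result.append((offset, inum, raw.rstrip(b"\0").decode()))
    return result


def dir_lookup(data, name):
    """Return the inode number for name, skipping empty entries (inode 0)."""
    for _offset, inum, entry_name in read_dir(data):
        if inum != 0 and entry_name == name:
            return inum
    return None


etc = pack_dir([
    (83, "."),          # the directory itself
    (2, ".."),          # its parent
    (1276, "passwd"),   # hypothetical entry
    (0, "crash"),       # empty entry: the name remains, the inode number is 0
])

for entry in read_dir(etc):
    print(entry)
print(dir_lookup(etc, "crash"))   # None: the entry is empty
```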
Accessing a file in a directory using a path
name
• The namei algorithm converts a
path name to an inode.
• It searches the path name component by
component until a terminal point is found.
• A file in a directory is accessed by its path
name, but the kernel works internally with
inodes rather than with path names.
• Therefore, it converts path names to
inodes to access files.
namei Algorithm-Searching the path
• namei algorithm uses intermediate inodes as it
parses a path name; call them working inodes.
• The inode where the search starts is the first
working inode.
• During each iteration of the namei loop, the kernel
makes sure that the working inode is indeed that of
a directory.
• Otherwise, the system would violate the assertion
that non-directory files can only be leaf nodes of the
file system tree.
• The process must also have permission to search
the directory (read permission is insufficient).
namei Algorithm
• The user ID of the process must match the
owner or group ID of the file, and execute
permission must be granted, or the file must
allow search to all users.
• Otherwise the search fails.
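The search loop described above can be sketched over a small in-memory table of inodes. This is an illustrative model, not the kernel's namei or the xv6 code: the table layout, the "searchable" flag (a stand-in for the owner, group and execute permission check) and the inode number 1276 for "passwd" are assumptions made for this example.

```python
ROOT = 2   # inode number of the root directory in this toy file system

# Hypothetical inode table: inode number -> type, search permission, entries.
INODES = {
    2: {"type": "dir", "searchable": True,
        "entries": {".": 2, "..": 2, "etc": 83, "private": 90}},
    83: {"type": "dir", "searchable": True,
         "entries": {".": 83, "..": 2, "passwd": 1276}},
    90: {"type": "dir", "searchable": False,
         "entries": {".": 90, "..": 2, "secret": 91}},
    91: {"type": "file"},
    1276: {"type": "file"},
}


def namei(path, cwd=ROOT, inodes=INODES):
    """Convert a path name to an inode number, or None if the search fails."""
    # The first working inode is the root for absolute paths, else the cwd.
    working = ROOT if path.startswith("/") else cwd
    for component in path.split("/"):
        if component == "":
            continue                        # skip empty pieces from extra slashes
        node = inodes[working]
        if node["type"] != "dir":
            return None                     # only directories may be interior nodes
        if not node["searchable"]:
            return None                     # no search permission on this directory
        next_inum = node["entries"].get(component)
        if next_inum is None:
            return None                     # component not found
        working = next_inum                 # this becomes the new working inode
    return working


print(namei("/etc/passwd"))
print(namei("passwd", cwd=83))
print(namei("/private/secret"))
```

The three calls show an absolute lookup, a lookup relative to a current directory, and a search that fails because one directory on the path does not allow searching.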
Case Study of fs.c
• fs.c is available on code sheets 49-57 in the
xv6 code sheet manual.
• It presents the xv6 implementation of
the iget, iput, bmap, namei and dirlookup
low-level file system algorithms.
Xv6 Code Sheet No.49
iget
iput
dirlookup
LTC Activities-Session-6
Program to demonstrate the following
• iputtest(): does chdir() call iput(p->cwd) in a
transaction?
• exitiputtest(): does exit() call iput(p->cwd) in a
transaction?
• iref(): tests that iput() is called at the end of
_namei()
• bigdir(): a directory that uses indirect blocks
• fsfull(): what happens when the file system
runs out of blocks? Answer: balloc panics.
LTC Activity-Testing iput
