File System Structure

file system structure - os

Uploaded by

supriya sundaram

Topics covered in this file

1. File system structure
2. File system implementation - overview
3. Directory implementation - linked list & hash table
4. Free space management
5. Recovery

File System Structure


A file system provides efficient access to the disk by allowing data to be stored, located, and
retrieved in a convenient way. A file system must be able to store a file, locate it, and
retrieve it.

Most operating systems use a layering approach for file systems, as they do for many other
tasks. Each layer of the file system is responsible for certain activities.

The layers, and the functionality of each, are described below.
o When an application program asks for a file, the request is first directed to the
logical file system. The logical file system holds the metadata of the file and the
directory structure. If the application program does not have the required
permissions on the file, this layer raises an error. The logical file system also
verifies the path to the file.
o Files are generally divided into logical blocks, but they are stored on and
retrieved from the hard disk, which is divided into tracks and sectors. To store
and retrieve files, the logical blocks must therefore be mapped to physical
blocks. This mapping is done by the file organization module, which is also
responsible for free space management.
o Once the file organization module has decided which physical block the
application program needs, it passes this information to the basic file system.
The basic file system is responsible for issuing commands to the I/O control
layer to fetch those blocks.
o The I/O control layer contains the code used to access the hard disk; this code
is known as the device driver. I/O control is also responsible for handling
interrupts.
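As a rough illustration of the layered request path described above, the sketch below models each layer as a class. All names (LogicalFileSystem, FileOrganizationModule, and so on) are invented for illustration; this is not a real OS API.

```python
class IOControl:
    """Device-driver layer: reads raw physical blocks from the 'disk'."""
    def __init__(self, disk):
        self.disk = disk  # dict: physical block number -> bytes

    def read_block(self, physical_block):
        return self.disk[physical_block]

class BasicFileSystem:
    """Issues generic block-read commands to the I/O control layer."""
    def __init__(self, io):
        self.io = io

    def fetch(self, physical_block):
        return self.io.read_block(physical_block)

class FileOrganizationModule:
    """Maps a file's logical blocks to physical blocks."""
    def __init__(self, basic_fs, block_map):
        self.basic_fs = basic_fs
        self.block_map = block_map  # (filename, logical block) -> physical block

    def read_logical(self, name, logical_block):
        physical = self.block_map[(name, logical_block)]
        return self.basic_fs.fetch(physical)

class LogicalFileSystem:
    """Checks metadata (here, just permissions) before passing the request down."""
    def __init__(self, org_module, permissions):
        self.org = org_module
        self.permissions = permissions  # filename -> set of allowed operations

    def read(self, name, logical_block):
        if "read" not in self.permissions.get(name, set()):
            raise PermissionError(f"no read permission for {name}")
        return self.org.read_logical(name, logical_block)

# Wire the layers together for a one-file, one-block toy disk.
disk = {7: b"hello"}
fs = LogicalFileSystem(
    FileOrganizationModule(BasicFileSystem(IOControl(disk)), {("a.txt", 0): 7}),
    {"a.txt": {"read"}},
)
print(fs.read("a.txt", 0))  # b'hello'
```

Note how the application only ever talks to the top layer; each layer below handles one concern (permissions, logical-to-physical mapping, block commands, device access).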
Different Types of File Systems

There are several types of file systems, each designed for specific purposes
and compatible with different operating systems. Some common file system
types include:
 FAT32 (File Allocation Table 32): Commonly used in older versions of
Windows and compatible with various operating systems.
 NTFS (New Technology File System): Used in modern Windows
operating systems, offering improved performance, reliability, and security
features.
 ext4 (Fourth Extended File System): Used in Linux distributions,
providing features such as journaling, large file support, and extended file
attributes.
 HFS+ (Hierarchical File System Plus): Used in macOS systems prior to
macOS High Sierra, offering support for journaling and case-insensitive
file names.
 APFS (Apple File System): Introduced in macOS High Sierra and the
default file system for macOS and iOS devices, featuring enhanced
performance, security, and snapshot capabilities.
 ZFS (Zettabyte File System): A high-performance file system known for
its advanced features, including data integrity, volume management, and
efficient snapshots.

File System Implementation


Several on-disk and in-memory structures are used to implement a file system.

1. On-Disk Data Structures

There are various on-disk data structures used to implement a file system. These
structures may vary depending on the operating system.

1. Boot Control Block


The boot control block contains all the information needed to boot an operating
system from that volume. In the UNIX file system it is called the boot block; in NTFS,
it is called the partition boot sector.

2. Volume Control Block

The volume control block contains all the information about the volume, such as the
number of blocks, the size of each block, the partition table, pointers to free blocks,
and pointers to free FCBs. In the UNIX file system it is known as the superblock. In
NTFS, this information is stored in the master file table.

3. Directory Structure (per file system)

A directory structure (one per file system) contains file names and pointers to the
corresponding FCBs. In UNIX, it includes the inode numbers associated with the file names.

4. File Control Block

A file control block contains all the details about a file, such as ownership,
permissions, and file size. In UFS, these details are stored in the inode. In NTFS, this
information is stored in the master file table, which is structured like a relational
database, with a row per file.
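A minimal, hypothetical FCB might look like the following. The field names follow the description above (ownership, permissions, size, plus data-block pointers) and are not tied to any real on-disk layout.

```python
from dataclasses import dataclass, field

@dataclass
class FileControlBlock:
    owner: str                # file owner
    group: str                # owning group
    permissions: int          # access permissions, e.g. 0o644
    size: int                 # file size in bytes
    data_blocks: list = field(default_factory=list)  # pointers to data blocks

fcb = FileControlBlock(owner="alice", group="users", permissions=0o644,
                       size=4096, data_blocks=[12, 13, 14])
print(oct(fcb.permissions))  # 0o644
```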

2. In-Memory Data Structures


So far, we have discussed the data structures that must be present on the hard disk in
order to implement a file system. Here, we discuss the data structures that must be
present in memory for the same purpose.

The in-memory data structures are used for file system management as well as for
performance improvement via caching. This information is loaded at mount time and
discarded at unmount.

1. In-memory Mount Table

The in-memory mount table contains a list of all the devices mounted on the system.
Whenever a device is mounted, an entry is made in the mount table.

2. In-memory Directory structure cache

This is a list of the directories recently accessed by the CPU. Since these directories
are likely to be accessed again in the near future, it is better to keep them
temporarily in a cache.

3. System-wide open file table

This is a list of all the files open in the system at a given time. Whenever a user
opens a file for reading or writing, an entry is made in this open file table.

4. Per process Open file table

This is the list of files opened by each process. Since the system-wide table already
holds an entry for every open file, the per-process table contains only pointers to
the appropriate entries in the system-wide table.
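The relationship between the two tables can be sketched as follows. This is a deliberate simplification (real kernels also track file offsets, reference counts, locks, and more), and all names are illustrative.

```python
# One system-wide table shared by all processes; each entry describes an open file.
system_wide_table = []

def sys_open(per_process_table, name):
    """Hypothetical open(): add a system-wide entry, then a per-process pointer."""
    entry = {"name": name, "ref_count": 1}
    system_wide_table.append(entry)
    index = len(system_wide_table) - 1
    # The per-process table stores only a pointer (here, an index) into the
    # system-wide table; the position in the per-process table acts like a
    # file descriptor.
    per_process_table.append(index)
    return len(per_process_table) - 1

proc_table = []               # per-process open file table for one process
fd = sys_open(proc_table, "a.txt")
print(system_wide_table[proc_table[fd]]["name"])  # a.txt
```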
Directory Implementation

Directory implementation in an operating system can be done using a singly
linked list or a hash table. The efficiency, reliability, and performance of a
file system are greatly affected by the choice of directory-allocation and
directory-management algorithms. There are numerous ways in which
directories can be implemented, so we need to choose an algorithm that
enhances the performance of the system.

Directory Implementation using Singly Linked List

The implementation of directories using a singly linked list is easy to program
but time-consuming to execute. Here a directory is a linear list of file names,
each with pointers to the file's data blocks.



 To create a new file, the entire list must be searched to make sure that a
file with the same name does not already exist.
 The new entry can then be added at the end of the list or at the
beginning of the list.
 To delete a file, we first search the list for the entry with the name of the
file to be deleted. After finding it, we delete the entry and release the
space allocated to the file.
 To reuse the directory entry, we can either mark it as unused or append
it to a list of free entries.
 Removing an entry from a linked list is itself cheap, although the search
that precedes it is still linear.
Disadvantage
The main disadvantage of using a linked list is that finding a file requires a
linear search. Directory information is used frequently, so a linked-list
implementation results in slow access to a file. For this reason, the operating
system maintains a cache of the most recently used directory information.
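A minimal sketch of such a linked-list directory, showing the linear-scan behaviour described above (the class and field names are illustrative only):

```python
class Entry:
    """One directory entry: a file name plus pointers to its data blocks."""
    def __init__(self, name, data_blocks):
        self.name = name
        self.data_blocks = data_blocks
        self.next = None

class LinkedListDirectory:
    def __init__(self):
        self.head = None

    def create(self, name, data_blocks):
        node = self.head
        while node:                       # full scan: reject duplicate names
            if node.name == name:
                raise FileExistsError(name)
            node = node.next
        new = Entry(name, data_blocks)
        new.next = self.head              # insert at the beginning of the list
        self.head = new

    def delete(self, name):
        prev, node = None, self.head
        while node:                       # linear search for the name
            if node.name == name:
                if prev:                  # unlink: cheap once found
                    prev.next = node.next
                else:
                    self.head = node.next
                return
            prev, node = node, node.next
        raise FileNotFoundError(name)

d = LinkedListDirectory()
d.create("a.txt", [3, 4])
d.create("b.txt", [5])
d.delete("a.txt")
print(d.head.name)  # b.txt
```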

Directory Implementation using Hash Table

An alternative data structure for directory implementation is a hash table. It
overcomes the major drawback of the linked-list implementation. In this
method, a hash table is used in combination with a linked list: the linked list
stores the directory entries, while the hash table speeds up the search for
them.
The hash function is applied to the file name to produce a key, and the key
points to the corresponding entry in the directory. This efficiently decreases
the directory search time, because the entire list no longer has to be
searched on every operation: the hash table entry indicated by the key is
checked and, when the file is found, it is fetched.

Disadvantage:
The major drawback of a hash table is that it generally has a fixed size, and
its performance depends on how well that size was chosen. Even so, this
method is usually faster than a linear search through an entire directory
using a linked list.
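The idea can be sketched with a fixed-size table of chained buckets; the table size and all names below are illustrative, not from any real file system.

```python
TABLE_SIZE = 8   # fixed size: the main drawback noted above

class HashDirectory:
    def __init__(self):
        # Each bucket is a small list of entries (chaining resolves collisions).
        self.buckets = [[] for _ in range(TABLE_SIZE)]

    def _bucket(self, name):
        # Hashing the file name selects the bucket, so lookup only scans
        # one short chain instead of the whole directory.
        return self.buckets[hash(name) % TABLE_SIZE]

    def create(self, name, data_blocks):
        bucket = self._bucket(name)
        if any(entry_name == name for entry_name, _ in bucket):
            raise FileExistsError(name)
        bucket.append((name, data_blocks))

    def lookup(self, name):
        for entry_name, blocks in self._bucket(name):
            if entry_name == name:
                return blocks
        raise FileNotFoundError(name)

d = HashDirectory()
d.create("a.txt", [3, 4])
print(d.lookup("a.txt"))  # [3, 4]
```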

Free Space Management

Free space management is a crucial function of operating systems: it
involves managing the available space on the hard disk and other secondary
storage devices so that they are used efficiently and effectively.
The system keeps track of free disk blocks so that it can allocate space to
files when they are created, and so that it can reuse the space released
when files are deleted. To do this, the system maintains a free-space list
that records the disk blocks not allocated to any file or directory.

The free space list can be implemented mainly as:


1. Bitmap or Bit vector

A bitmap or bit vector is a series of bits in which each bit corresponds to a
disk block. Each bit can take one of two values: 0 indicates that the block is
free and 1 indicates that the block is allocated. For example, a 16-block disk
on which blocks 0-3, 7-12, and 15 are allocated is represented by the 16-bit
bitmap 1111000111111001.

Advantages:
 Simple to understand.
 Finding the first free block is efficient. It requires scanning the words (groups
of bits, e.g. 8 bits each) in the bitmap for a word that is not all 1s (an all-1s
word contains only allocated blocks). The first free block is then found by
locating the first 0 bit in that word.
Disadvantages:
 To find a free block, the operating system may need to iterate over all the
blocks, which is time consuming.
 The efficiency of this method reduces as the disk size increases.
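Using the 16-bit example above (0 = free, 1 = allocated), finding the first free block can be sketched as:

```python
bitmap = "1111000111111001"   # the 16-block example from the text

def first_free_block(bits):
    """Return the index of the first free (0) block, or -1 if the disk is full."""
    for i, b in enumerate(bits):
        if b == "0":
            return i
    return -1

print(first_free_block(bitmap))  # 4
```

A real implementation scans machine words rather than individual characters, but the idea is the same.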

2. Linked List

In this approach, the free disk blocks are linked together: each free block
contains a pointer to the next free block. The block number of the first free
block is stored at a separate location on disk and is also cached in memory.
For example, if the head of the free-space list points to block 5, then block 5
points to block 6 (the next free block), and so on; the last free block contains
a null pointer indicating the end of the list. A drawback of this method is the
I/O required to traverse the free-space list.
Advantages:
 The total available space is used efficiently using this method.
 Dynamic allocation is easy: space can be added to the list as required.
Disadvantages:
 As the linked list grows, the burden of maintaining its pointers also
grows.
 Traversing the list block by block is not efficient.

3. Grouping

This approach stores the address of the free blocks in the first free block.
The first free block stores the address of some, say n free blocks. Out of
these n blocks, the first n-1 blocks are actually free and the last block
contains the address of next free n blocks. An advantage of this approach is
that the addresses of a group of free disk blocks can be found easily.
Advantage:
 A large number of free block addresses can be found quickly using this
method.
Disadvantage:
 The block holding the addresses must be updated whenever one of the
blocks it records is allocated.

4. Counting

This approach stores the address of the first free disk block and a number n
of free contiguous disk blocks that follow the first block. Every entry in the list
would contain:
 Address of first free disk block.
 A number n.
Advantages:
 A group of contiguous free blocks can be allocated easily and quickly
using this method.
 The list formed by this method is considerably smaller than a plain list of
individual free blocks.
Disadvantage:
 Each entry must record a count as well as an address, so an individual
entry requires more space than a simple block pointer.
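Using the same free blocks as the 16-bit bitmap example earlier (blocks 4-6 and 13-14 free), the counting representation collapses the free-space list to just two entries:

```python
# Each entry is (address of first free block, number of contiguous free blocks).
free_runs = [(4, 3), (13, 2)]   # blocks 4-6 free, blocks 13-14 free

total_free = sum(n for _, n in free_runs)
print(total_free)  # 5
```

Two entries describe five free blocks, which is why this list tends to be much shorter than one entry per block.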

Recovery

Consistency Checking
 Storing certain data structures (e.g. directories and inodes) in memory and caching disk
operations can speed up performance, but what happens in the event of a system crash? All volatile
memory structures are lost, and the information stored on the hard drive may be left in an inconsistent
state.
 A consistency checker (fsck in UNIX, chkdsk or scandisk in Windows) is often run at boot
time or mount time, particularly if a file system was not closed down properly. Some of the problems that
these tools look for include:
 Disk blocks allocated to files and also listed on the free list.
 Disk blocks neither allocated to files nor on the free list.
 Disk blocks allocated to more than one file.
 The number of disk blocks allocated to a file inconsistent with the file's stated size.
 Properly allocated files / inodes which do not appear in any directory entry.
 Link counts for an inode not matching the number of references to that inode in the
directory structure.
 Two or more identical file names in the same directory.
 Illegally linked directories, e.g. cyclical relationships where those are not allowed, or
files/directories that are not accessible from the root of the directory tree.
 Consistency checkers will often collect questionable disk blocks into new files with names
such as chk00001.dat. These files may contain valuable information that would otherwise
be lost, but in most cases they can safely be deleted, returning those disk blocks to the
free list.
 UNIX caches directory information for reads, but any changes that affect space allocation or
other metadata are written synchronously, before any of the corresponding data blocks are written.
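Two of the checks listed above, blocks both allocated and on the free list, and blocks in neither place, can be sketched with sets; the block numbers here are made up.

```python
# Hypothetical state reconstructed by a checker from on-disk structures.
allocated = {0, 1, 2, 5}                  # blocks referenced by some file
free = {3, 5, 7}                          # blocks on the free list
all_blocks = set(range(8))                # an 8-block toy disk

both = allocated & free                   # allocated AND free: inconsistent
neither = all_blocks - allocated - free   # leaked: in neither place

print(sorted(both))     # [5]
print(sorted(neither))  # [4, 6]
```

A real checker works block group by block group from bitmaps and inode tables, but the set logic is the same.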

Backup and Restore


 In order to recover lost data in the event of a disk crash, it is important to conduct backups
regularly.
 Files should be copied to some removable medium, such as magnetic tapes, CDs, DVDs, or
external removable hard drives.
 A full backup copies every file on a filesystem.
 Incremental backups copy only files which have changed since some previous time.
 A combination of full and incremental backups can offer a compromise between full recoverability,
the number and size of backup tapes needed, and the number of tapes that need to be used to
do a full restore. For example, one strategy might be:
o At the beginning of the month do a full backup.
o At the end of the first and again at the end of the second week, backup all files which
have changed since the beginning of the month.
o At the end of the third week, backup all files that have changed since the end of the
second week.
o Every day of the month not listed above, do an incremental backup of all files that have
changed since the most recent of the weekly backups described above.
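A sketch of which tapes a full restore would need on a given day under this example schedule; the day numbers and tape labels are illustrative. Note that under the schedule above, the day-7 and day-14 weekly tapes cover changes since day 1, while the day-21 tape only covers changes since day 14.

```python
def backups_needed(day):
    """Tapes to replay, in order, to restore the system as of the given day."""
    tapes = ["full(day 1)"]
    if day >= 21:
        # The day-21 weekly only covers changes since day 14, so both
        # weekly tapes are needed.
        tapes += ["weekly(day 14)", "weekly(day 21)"]
        last = 21
    elif day >= 14:
        tapes.append("weekly(day 14)")    # cumulative since day 1
        last = 14
    elif day >= 7:
        tapes.append("weekly(day 7)")     # cumulative since day 1
        last = 7
    else:
        last = 1
    # Daily incrementals since the most recent weekly backup.
    tapes += [f"daily(day {d})" for d in range(last + 1, day + 1)]
    return tapes

print(backups_needed(16))
# ['full(day 1)', 'weekly(day 14)', 'daily(day 15)', 'daily(day 16)']
```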
 Backup tapes are often reused, particularly for daily backups, but there are limits to how many
times the same tape can be used.
 Every so often a full backup should be made that is kept "forever" and not overwritten.
 Backup tapes should be tested, to ensure that they are readable!
 For optimal security, backup tapes should be kept off-premises, so that a fire or burglary cannot
destroy both the system and the backups. There are companies ( e.g. Iron Mountain ) that
specialize in the secure off-site storage of critical backup information.
 Keep your backup tapes secure - The easiest way for a thief to steal all your data is to
simply pocket your backup tapes!
 Storing important files on more than one computer can be an alternate though less reliable form
of backup.
 Note that incremental backups can also help users to get back a previous version of a file that
they have since changed in some way.
 Beware that backups can help forensic investigators recover e-mails and other files that users
thought they had deleted!
