Chapter 5: File System 5. File Concepts
A file is a container for a collection of information. The file manager provides a protection
mechanism that lets users and administrators control how processes executing on behalf of
different users can access the information in a file. File protection is a fundamental property of
files because it allows different people to store their information on a shared computer.
Files represent programs and data. Data files may be numeric, alphabetic, alphanumeric or
binary. Files may be free form, such as text files. In general, a file is a sequence of bits, bytes,
lines or records.
Information stored in files must be persistent, that is, it must not be affected by process
creation and termination. A file should only disappear when its owner explicitly removes it.
Files are managed by the operating system. How they are structured, named, accessed, used,
protected and implemented are major topics in operating system design. As a whole, the part of
the operating system that deals with files is known as the file system.
FILE NAMING:
• Files are an abstraction mechanism. Files provide a way to store information on the disk
and read it back later.
• When a process creates a file, it gives the file a name. When the process terminates, the
file continues to exist and can be accessed by other processes using its name.
• Rules for file naming vary from system to system, but all current operating systems allow
strings of one to eight letters as legal file names. Digits and special characters are often
permitted as well.
• Some file systems like the one present in UNIX distinguish between upper and lower
case letters, whereas others like the one in MS-DOS do not.
• Thus a UNIX system can have all of the following as three distinct files: maria, Maria and
MARIA. In MS-DOS, all these names refer to the same file.
• Many operating systems support two-part file names, with the two parts separated by a
period, e.g. prog.c. The part following the period is called the file extension and usually
indicates something about the file. In MS-DOS, for example, file names are 1 to 8
characters, plus an optional extension of 1 to 3 characters.
• In UNIX, a file may even have two or more extensions. For example, in prog.c.Z the .Z is
commonly used to indicate that the file (prog.c) has been compressed using the Ziv-Lempel
compression algorithm.
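The naming rules above can be illustrated with a minimal Python sketch. The helper names split_name and same_file_name are hypothetical, introduced only for this example; the case folding is a simplified stand-in for how an MS-DOS-like system compares names, not its actual implementation.

```python
import os.path

def split_name(file_name):
    """Split a file name into (base, extension) at the last period."""
    base, ext = os.path.splitext(file_name)
    return base, ext[1:]          # drop the leading "."

def same_file_name(a, b, case_sensitive):
    """Compare two names as a case-sensitive (UNIX-like) or
    case-insensitive (MS-DOS-like) file system would."""
    if case_sensitive:
        return a == b
    return a.upper() == b.upper()

names = ["maria", "Maria", "MARIA"]
unix_distinct = len(set(names))                  # names compared exactly
dos_distinct = len({n.upper() for n in names})   # names compared after case folding

print(split_name("prog.c"))       # ('prog', 'c')
print(split_name("prog.c.Z"))     # ('prog.c', 'Z'): only the last extension is split off
print(unix_distinct, dos_distinct)
```

Splitting only at the last period matches the idea that the final extension describes the outermost form of the file, here the compressed version of prog.c.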
FILE STRUCTURE:
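Files can be structured in any of several ways. Three common possibilities are a byte sequence,
a record sequence and a tree.
In byte sequence,
• The file is an unstructured sequence of bytes. Any meaning is imposed by the programs that
use it.
In record sequence,
• The file is a sequence of fixed-length records, and read and write operations are performed
on these records. Ex. mainframe O.S.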
In tree structure,
• A file consists of a tree of records, not necessarily all the same length.
• The tree is sorted based on the key field, to allow rapid searching for a particular key.
• When records are added, the operating system, not the user, decides where to place them.
• This type of file system is widely used on the large mainframe computers still used in
some commercial data processing.
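A small Python sketch of the keyed-file idea follows. It is only an illustration: a sorted list searched with binary search stands in for the tree, and the class name KeyedFile and its methods are hypothetical, not part of any real file system.

```python
import bisect

class KeyedFile:
    """Toy keyed file: variable-length records kept sorted by key."""

    def __init__(self):
        self._keys = []       # sorted key fields
        self._records = []    # record stored at the same index as its key

    def add(self, key, record):
        # The "system" chooses the position; the caller never supplies one.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._records[i] = record          # same key: replace the record
        else:
            self._keys.insert(i, key)
            self._records.insert(i, record)

    def get(self, key):
        # Binary search on the sorted keys gives rapid lookup by key.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._records[i]
        return None

    def keys(self):
        return list(self._keys)

zoo = KeyedFile()
zoo.add("pony", "small horse")
zoo.add("ant", "insect")
zoo.add("cat", "a domestic feline of some length")
print(zoo.keys())
print(zoo.get("cat"))
```

The caller asks for a record by key and never by position, which is the property the bullets above describe.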
FILE TYPES:
UNIX has regular files, directories, character special files and block special files.
• Regular files are generally either ASCII files or binary files. The advantage of ASCII files is
that they can be displayed and printed as is, and they can be edited with any text editor.
• Binary files have some internal structure known only to the programs that use them. To the
operating system such a file is just a sequence of bytes, but the O.S. executes a file only if it
has the proper format.
• Two examples of binary files, an executable file and an archive, are shown below.
• Character special files are related to input/output and used to model serial I/O devices
such as terminals, printers, and networks.
FILE ATTRIBUTES:
• Attributes are extra items associated with a file, such as date, time and size, in addition to
the file's name and its data.
• The list of attributes varies considerably from system to system. The table below shows
some of the attributes.
Common attributes include:
Time, date, and user identification – data for protection, security, and usage monitoring
Information about files is kept in the directory structure, which is maintained on the disk.
File Operations
Any file system provides not only a means to store data organized as files, but a collection of
functions that can be performed on files. Typical operations include the following:
Create: A new file is defined and positioned within the structure of files.
Delete: A file is removed from the file structure and destroyed.
Open: An existing file is declared to be "opened" by a process, allowing the process to perform
functions on the file.
Close: The file is closed with respect to a process, so that the process no longer may perform
functions on the file, until the process opens the file again.
Read: A process reads all or a portion of the data in a file.
Write: A process updates a file, either by adding new data that expands the size of the file or by
changing the values of existing data items in the file.
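The operations listed above map closely onto calls available in most languages. The following Python sketch walks through them in order on a scratch file; the file name notes.txt and the function name demo_file_operations are arbitrary choices for the example.

```python
import os
import tempfile

def demo_file_operations():
    workdir = tempfile.mkdtemp()              # scratch directory for the demo
    path = os.path.join(workdir, "notes.txt")

    f = open(path, "w")                       # Create (and Open for writing)
    f.write("first line\n")                   # Write
    f.close()                                 # Close

    f = open(path, "a")                       # Open again, positioned at the end
    f.write("second line\n")                  # Write that expands the file
    f.close()

    f = open(path, "r")                       # Open for reading
    contents = f.read()                       # Read the whole file
    f.close()

    os.remove(path)                           # Delete
    still_exists = os.path.exists(path)
    os.rmdir(workdir)
    return contents, still_exists

contents, still_exists = demo_file_operations()
print(repr(contents), still_exists)
```

Note that once the file is closed, the process must open it again before any further read or write, as the description of Close states.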
File Types – Name, Extension
A common technique for implementing file types is to include the type as part of the file name.
The name is split into two parts: a name and an extension. The following table gives the file type
with its usual extension and function.
• With a memory-mapped file, a process gives a file name and a virtual address; this causes the
operating system to map the file into the process's address space at that virtual address.
• File mapping works best in a system that supports segmentation. In such a system, each
file can be mapped onto its own segment so that byte k in the file is also byte k in the
segment.
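File mapping can be tried out with Python's standard mmap module, as in the sketch below. One difference from the description above should be kept in mind: Python does not let the caller choose the virtual address, so the operating system picks it. The function name demo_mapped_file is hypothetical.

```python
import mmap
import os
import tempfile

def demo_mapped_file():
    fd, path = tempfile.mkstemp()             # scratch file opened for read/write
    try:
        os.write(fd, b"hello, file system")
        with mmap.mmap(fd, 0) as mapped:      # length 0 maps the whole file
            k = 7
            byte_k = mapped[k]                # byte k of the mapping is byte k of the file
            mapped[0:5] = b"HELLO"            # an ordinary memory store changes the file
            mapped.flush()
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 18)             # read back through the normal file interface
    finally:
        os.close(fd)
        os.remove(path)
    return byte_k, on_disk

byte_k, on_disk = demo_mapped_file()
print(chr(byte_k), on_disk)
```

After mapping, the file is read and modified with plain indexing instead of read and write calls.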
File access mechanism refers to the manner in which the records of a file may be accessed.
There are several ways to access files −
Sequential access
Direct/Random access
Indexed sequential access
Sequential access
Sequential access is access in which the records are read in some sequence, i.e., the
information in the file is processed in order, one record after the other. This access
method is the most primitive one. Skipping some bytes or reading out of order is not
allowed.
• Sequential file access was convenient when the storage medium was magnetic tape rather than disk.
Direct/Random access
Random access file organization allows records to be accessed directly.
Each record has its own address in the file, with the help of which it can be directly
accessed for reading or writing.
The records need not be in any sequence within the file and they need not be in adjacent
locations on the storage medium. It is possible to read the bytes or records of a file out of
order, or to access records by key rather than by position. Files whose bytes or records
can be read in any order are called random access files.
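With fixed-length records, the address of record i is simply i times the record size, so a seek followed by a read reaches any record directly. The Python sketch below assumes a hypothetical 16-byte record layout (a 4-byte id and a 12-byte name) and uses an in-memory io.BytesIO object as a stand-in for a disk file.

```python
import io
import struct

# Hypothetical record layout: 4-byte little-endian id + 12-byte name (16 bytes total).
RECORD = struct.Struct("<I12s")

def write_record(f, index, rec_id, name):
    f.seek(index * RECORD.size)               # jump straight to the record's address
    f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(f, index):
    f.seek(index * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\0").decode("ascii")   # strip the padding bytes

f = io.BytesIO()                              # stands in for a file opened in binary mode
for i, name in enumerate(["ant", "bee", "cat", "dog"]):
    write_record(f, i, 100 + i, name)

print(read_record(f, 2))                      # read out of order: record 2 first
print(read_record(f, 0))                      # then record 0
```

A sequential-only medium would have to pass over records 0 and 1 to reach record 2; here the seek skips them.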
Files are allocated disk space by the operating system. Operating systems use the following
three main ways to allocate disk space to files.
Contiguous Allocation
Linked Allocation
Indexed Allocation
Sector 0 of the disk is called the MBR (Master Boot Record) and is used to boot the
computer. The end of the MBR contains the partition table. This table gives the starting
and ending addresses of each partition.
The MBR program locates the active partition, reads in its first block, called the boot block,
and executes it.
The program in the boot block loads the operating system contained in that partition.
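The first step of this boot sequence, finding the active partition in the partition table, can be sketched in Python. The byte offsets used here are not given in the text above; they are an assumption based on the conventional PC MBR layout (four 16-byte entries starting at byte 446, followed by the signature bytes 0x55 0xAA). The sector is built in memory by the hypothetical helper make_test_sector rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # assumed: conventional PC MBR layout
# status, CHS start, type, CHS end, first sector (LBA), sector count
ENTRY = struct.Struct("<B3sB3sII")

def parse_partition_table(sector):
    """Return the non-empty partition entries found at the end of an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        status, _chs0, ptype, _chs1, first, count = ENTRY.unpack_from(
            sector, TABLE_OFFSET + i * ENTRY.size)
        if ptype == 0:                        # type 0 marks an unused entry
            continue
        partitions.append({
            "index": i,
            "active": status == 0x80,         # the partition the MBR program boots
            "first_sector": first,
            "last_sector": first + count - 1,
        })
    return partitions

def make_test_sector():
    """Build a synthetic MBR with two partitions, the first one active."""
    sector = bytearray(SECTOR_SIZE)
    ENTRY.pack_into(sector, TABLE_OFFSET, 0x80, b"\0\0\0", 0x83, b"\0\0\0", 2048, 1000)
    ENTRY.pack_into(sector, TABLE_OFFSET + 16, 0x00, b"\0\0\0", 0x83, b"\0\0\0", 4096, 500)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)

parts = parse_partition_table(make_test_sector())
active = [p for p in parts if p["active"]]
print(parts)
print(active[0]["first_sector"])              # where the boot block would be read from
```

A real MBR program would then read the first block of the active partition and jump to it; that step is outside what a user-level sketch can show.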
The layout of a disk partition varies from file system to file system. Often the file system
will contain some of the items shown in the figure below.
The first item is the superblock. It contains all the key parameters about the file system,
such as the file system type, the number of blocks in the file system, and other key
administrative information, and is read into memory when the computer is booted.
• The free space management block records which blocks in the file system are free.
• The i-nodes hold all the information about each file.
• Root directory contains the top of the file system tree.
• The remainder of the disk typically contains all the other directories and files.
DIRECTORIES:
• The simplest form of directory system has one directory containing all the files.
Sometimes it is called the root directory. Ex. CDC 6600, early P.C.
• The diagram below illustrates a single-directory system.
• The advantage is simplicity and the ability to locate files quickly.
Disadvantage: Different users may accidentally use the same names for their files.
To overcome this disadvantage of the single directory system, each user is given a
private directory.
• This design could be used, for example, on a multiuser computer or on a simple network
of personal computers that share a common file server over a local area network.
• When a user tries to open a file, the system must know which user it is in order to know
which directory to search. As a consequence, some kind of login procedure is needed.
• Here users can only access files in their own directories. However, a slight extension is to
allow users to access other users’ files by providing some indication of whose file is to be
opened.
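The two-level scheme can be modelled as a table of private directories, one per user. The Python sketch below is an illustration only; the class TwoLevelDirectory, its methods and the user names are hypothetical, and the owner argument models the "slight extension" of naming whose file is meant.

```python
class TwoLevelDirectory:
    """Toy model: one private directory per user."""

    def __init__(self):
        self._users = {}                      # user name -> {file name -> contents}

    def create(self, user, name, data):
        self._users.setdefault(user, {})[name] = data

    def open(self, current_user, name, owner=None):
        # By default only the logged-in user's own directory is searched.
        directory = self._users.get(owner or current_user, {})
        if name not in directory:
            raise FileNotFoundError(name)
        return directory[name]

fs = TwoLevelDirectory()
fs.create("alice", "mailbox", "alice's mail")
fs.create("bob", "mailbox", "bob's mail")     # same name, no clash

print(fs.open("alice", "mailbox"))
print(fs.open("alice", "mailbox", owner="bob"))
```

Because the lookup starts from the current user, the system has to know who that user is, which is why a login procedure is needed.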
• In order to manage many files, a hierarchical structure, i.e. a tree of directories, is
preferred and is shown in the diagram below.
• Here each user can have as many directories as are needed.
This scheme acts as a powerful structuring tool for users to organize their work.
PATH NAMES
Two different methods are commonly used for specifying the names of files
in a directory tree.
• In the first method, each file is given an absolute path name consisting of the path from
the root directory to the file. Ex. /usr/ast/mailbox
• Absolute path names always start at the root directory.
• The second way is to use a relative path name.
• A user can designate one directory as the current working directory, and any path name
that does not begin at the root directory is taken relative to the working directory.
• But if a file has to be accessed regardless of the working directory, then its absolute path
name must be specified.
• Most operating systems that support a hierarchical directory system have two special
entries in every directory, “.” and “..”. Dot refers to the current directory; dotdot refers to
its parent.
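Path name resolution can be sketched with Python's standard posixpath module, which follows UNIX path conventions on any platform. The function name resolve and the path /usr/lib/dictionary are hypothetical choices for the example. The sketch works on the text of the path only and does not consult a real file system.

```python
import posixpath

def resolve(path, working_directory):
    """Turn a path name into an absolute one, given the working directory."""
    if posixpath.isabs(path):                 # absolute names start at the root
        return posixpath.normpath(path)
    # Relative names are interpreted from the working directory;
    # normpath then removes "." and folds ".." into the parent.
    return posixpath.normpath(posixpath.join(working_directory, path))

print(resolve("mailbox", "/usr/ast"))
print(resolve("/usr/ast/mailbox", "/tmp"))
print(resolve("../lib/dictionary", "/usr/ast"))
print(resolve("./mailbox", "/usr/ast"))
```

The absolute name gives the same result from any working directory, while the relative names depend on it.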
DIRECTORY OPERATION:
The allowed system calls for managing directories are,
1. Create 2. Delete 3. Opendir 4. Closedir 5. Readdir 6. Rename 7. Link 8. Unlink
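Most of these calls have close counterparts in Python's os module, which the sketch below uses on a scratch directory. The names docs, a.txt, b.txt and c.txt are arbitrary, and the sketch assumes the underlying file system supports hard links, which os.link requires.

```python
import os
import tempfile

def demo_directory_operations():
    root = tempfile.mkdtemp()
    docs = os.path.join(root, "docs")
    a, b, c = (os.path.join(docs, n) for n in ("a.txt", "b.txt", "c.txt"))

    os.mkdir(docs)                            # Create
    with open(a, "w") as f:
        f.write("data")
    os.rename(a, b)                           # Rename
    os.link(b, c)                             # Link: a second name for the same file

    with os.scandir(docs) as entries:         # Opendir; iteration is Readdir;
        listing = sorted(e.name for e in entries)   # leaving the block is Closedir

    os.unlink(b)                              # Unlink removes one name only
    with open(c) as f:
        survivor = f.read()                   # the file is still reachable through c.txt

    os.unlink(c)
    os.rmdir(docs)                            # Delete: only an empty directory can go
    os.rmdir(root)
    return listing, survivor

listing, survivor = demo_directory_operations()
print(listing, survivor)
```

The file's data disappears only when its last name is unlinked, which is why the contents survive the first unlink.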
1. Recover from disaster – accidents like disk crash, fire, flood or some natural
catastrophe
1. Physical dump
2. Logical dump
• A physical dump starts at block 0 of the disk, writes all the disk blocks onto the output
tape in order, and stops when it has copied the last one.
• A logical dump starts at one or more specified directories and recursively dumps all files
and directories found there that have changed since some given base date.
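The selection step of a logical dump can be sketched in Python as a recursive walk that keeps only files modified after the base date. This is a simplification: it lists the files to be dumped instead of writing them to tape, and it ignores the directories themselves. The function name logical_dump is hypothetical.

```python
import os

def logical_dump(start_dirs, base_time):
    """Return paths of files under start_dirs that were modified after base_time.

    base_time is in seconds since the epoch, the form os.path.getmtime() returns.
    """
    selected = []
    for start in start_dirs:
        for dirpath, dirnames, filenames in os.walk(start):
            dirnames.sort()                   # visit subdirectories in a stable order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if os.path.getmtime(path) > base_time:
                    selected.append(path)     # a real dump would copy the file here
    return selected
```

A call such as logical_dump(["/home"], base_time) would report every file under /home changed since the base date, where /home is only an example path. A physical dump, by contrast, needs no such walk because it copies every block in order, changed or not.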