
Chapter 5: File System 5. File Concepts


Chapter 5: file system



5. File Concepts
A file is a collection of similar records. The file is treated as a single entity by users and
applications and may be referred to by name. Files have unique file names and may be created and
deleted. Access control restrictions usually apply at the file level.

A file is a container for a collection of information. The file manager provides a protection
mechanism that lets users control how processes executing on behalf of different users can
access the information in a file. File protection is a fundamental property of files because it
allows different people to store their information on a shared computer.

Files represent programs and data. Data files may be numeric, alphabetic, alphanumeric, or
binary. Files may be free form, such as text files. In general, a file is a sequence of bits, bytes,
lines, or records.

Information stored in files must be persistent, that is, it must not be affected by process
creation and termination. A file should disappear only when its owner explicitly removes it.
Files are managed by the operating system. How they are structured, named, accessed, used,
protected, and implemented are major topics in operating system design. As a whole, the part of
the operating system that deals with files is known as the file system.

FILE NAMING:

• Files are an abstraction mechanism. Files provide a way to store information on the disk
and read it back later.

• When a process creates a file, it gives the file a name. When the process terminates, the
file continues to exist and can be accessed by other processes using its name.

• Rules for file naming vary from system to system, but all current operating systems allow
strings of one to eight letters as legal file names. Digits and special characters are often
permitted as well.

• Some file systems like the one present in UNIX distinguish between upper and lower
case letters, whereas others like the one in MS-DOS do not.

• Thus a UNIX system can have all of the following as three distinct files: maria, Maria, and
MARIA. In MS-DOS, all these names refer to the same file.

• Many operating systems support two-part file names, with the two parts separated by a
period, for example prog.c. The part following the period is called the file extension and
usually indicates something about the file. In MS-DOS, for example, file names are 1 to 8
characters, plus an optional extension of 1 to 3 characters.

• In UNIX, a file may even have two or more extensions, for example prog.c.Z, where .Z is
commonly used to indicate that the file (prog.c) has been compressed using the Ziv-Lempel
compression algorithm.
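
The naming rules above can be tried out in a few lines of Python. This is only a sketch: the helper names is_dos_name and same_file_name are made up for illustration, and the character set accepted by the 8.3 check (letters, digits, underscore) is a simplifying assumption rather than the full MS-DOS rule.

```python
import os.path
import re

# Simplified MS-DOS 8.3 rule from the text: a base name of 1 to 8 characters
# plus an optional extension of 1 to 3 characters. The allowed character set
# (letters, digits, underscore) is an assumption made for this sketch.
_DOS_NAME = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def is_dos_name(name: str) -> bool:
    """Return True if name fits the simplified 8.3 pattern."""
    return _DOS_NAME.fullmatch(name) is not None


def same_file_name(a: str, b: str, case_sensitive: bool) -> bool:
    """Compare two names as a case-sensitive or case-insensitive system would."""
    return a == b if case_sensitive else a.upper() == b.upper()


# splitext separates only the last extension of a multi-extension name.
base, ext = os.path.splitext("prog.c.Z")
print(base, ext)
print(same_file_name("maria", "MARIA", case_sensitive=True))   # UNIX-style comparison
print(same_file_name("maria", "MARIA", case_sensitive=False))  # MS-DOS-style comparison
```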

FILE STRUCTURE:

Files can be structured in any of several ways. Three common possibilities are shown below.

In byte sequence,

• The operating system sees the file as an unstructured sequence of bytes.

• UNIX and Windows use this scheme.

• The user can put anything in the file and give it any name.



Figure: Three kinds of files. (a) Byte sequence. (b) Record sequence. (c) Tree.


In record sequence,

• A file is a sequence of fixed-length records, each with some internal structure.

• Read and write operations are performed on these fixed-length records. Example: mainframe
operating systems.
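
As a rough illustration of a record sequence, the sketch below packs fixed-length records with Python's struct module and reads one back by its record number. The 16-byte layout (a 4-byte key and a 12-byte name) is a hypothetical choice for this example, not a format taken from any particular system.

```python
import struct

# Hypothetical record layout for this sketch: a 4-byte unsigned key followed
# by a 12-byte name field, 16 bytes per record in total.
RECORD = struct.Struct("<I12s")


def pack_records(rows):
    """Concatenate (key, name) pairs into one byte string of fixed-length records."""
    return b"".join(RECORD.pack(key, name.encode("ascii")) for key, name in rows)


def read_record(data: bytes, n: int):
    """Return record number n (0-based) as a (key, name) pair."""
    key, raw = RECORD.unpack_from(data, n * RECORD.size)
    return key, raw.rstrip(b"\x00").decode("ascii")


data = pack_records([(7, "maria"), (9, "prog.c")])
print(len(data), read_record(data, 1))
```

Because every record has the same length, the position of record n is simply n times the record size.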

In tree structure,

• A file consists of a tree of records, not necessarily all the same length.

• Each record contains a key field in a fixed position in the record.

• The tree is sorted on the key field, to allow rapid searching for a particular key.

• The operating system, not the user, decides where to place new records.

• This type of file system is widely used on the large mainframe computers still used in
some commercial data processing.
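
The keyed lookup can be sketched as follows. To keep the example short, a sorted list with binary search stands in for the tree; the class name KeyedFile and its methods are hypothetical. The point it illustrates is that the caller supplies only the key, and the structure decides where the record goes.

```python
import bisect


class KeyedFile:
    """Records kept sorted by key; a sorted list stands in for the tree."""

    def __init__(self):
        self._keys = []      # sorted key fields
        self._records = []   # record payloads, parallel to _keys

    def insert(self, key, record):
        # The file, not the caller, picks the position that keeps keys sorted.
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._records.insert(i, record)

    def find(self, key):
        """Return the record stored under key, or None if it is absent."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._records[i]
        return None


kf = KeyedFile()
for key, record in [("pony", "small horse"), ("ant", "insect"), ("hen", "bird")]:
    kf.insert(key, record)
print(kf.find("hen"))
```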

FILE TYPES:

Many operating systems support several types of files.

UNIX has regular files, directories, character special files, and block special files.

Windows has regular files and directories.

• Regular files are the ones that contain user information.

• Regular files are generally either ASCII files or binary files. Advantage of ASCII files is
that they can be displayed and printed as is, and they can be edited with any text editor.

• Binary files have some internal structure known only to the programs that use them. To the
system such a file is just a sequence of bytes, but the operating system will execute it only
if it has the proper format.

• Two examples of binary files, an executable file and an archive, are shown below.

Figure: (a) An executable file. (b) An archive.


• Directories are system files for maintaining the structure of the file system.

• Character special files are related to input/output and used to model serial I/O devices
such as terminals, printers, and networks.

• Block special files are used to model disks.
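
On a UNIX-like system, the type of a file can be read from its mode bits. The sketch below uses Python's os.stat and the predicates in the stat module; the function name file_type is made up for this example, and the demonstration only creates a regular file and a directory, since special files normally live under /dev and may not be accessible.

```python
import os
import stat
import tempfile


def file_type(path: str) -> str:
    """Classify path using the UNIX file types named in the text."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    return "other"


with tempfile.TemporaryDirectory() as d:
    p = os.path.join(d, "notes.txt")
    open(p, "w").close()
    kinds = (file_type(d), file_type(p))
print(kinds)
```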

FILE ATTRIBUTES:

• Attributes are extra items associated with a file beyond its name and its data, such as
date, time, and size.

• The list of attributes varies considerably from system to system. The common attributes are:

Name – only information kept in human-readable form.

Identifier – unique tag (number) that identifies the file within the file system

Type – needed for systems that support different types

Location – pointer to file location on device

Size – current file size

Protection – controls who can do reading, writing, executing

Time, date, and user identification – data for protection, security, and usage monitoring

Information about files is kept in the directory structure, which is maintained on the disk.
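
Most of these attributes can be inspected from a program. The following sketch reads them with Python's os.stat; the function name describe and the dictionary keys are illustrative choices, and the exact meaning of some fields (for example st_ino and st_uid) depends on the platform.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect common attributes of path from os.stat."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "identifier": st.st_ino,                  # i-node number on UNIX-like systems
        "size": st.st_size,                       # current size in bytes
        "protection": stat.filemode(st.st_mode),  # permission string such as -rw-r--r--
        "owner": st.st_uid,                       # numeric user ID
        "modified": time.ctime(st.st_mtime),      # time of last modification
    }


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "report.txt")
    with open(path, "w") as f:
        f.write("hello")
    attrs = describe(path)
print(attrs)
```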

File Operations

Any file system provides not only a means to store data organized as files, but also a collection
of functions that can be performed on files. Typical operations include the following:
Create: A new file is defined and positioned within the structure of files.
Delete: A file is removed from the file structure and destroyed.
Open: An existing file is declared to be "opened" by a process, allowing the process to perform
functions on the file.
Close: The file is closed with respect to a process, so that the process no longer may perform
functions on the file, until the process opens the file again.
Read: A process reads all or a portion of the data in a file.
Write: A process updates a file, either by adding new data that expands the size of the file or by
changing the values of existing data items in the file.
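
The operations listed above map closely onto the low-level calls in Python's os module, which wrap the corresponding system calls. The sketch below walks through them on a temporary file; the file name demo.dat is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "demo.dat")

    # Create and Open: O_CREAT defines the new file; the descriptor refers to it afterwards.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.write(fd, b"hello world")      # Write: new data expands the file
    os.lseek(fd, 0, os.SEEK_SET)      # move back to the start of the file
    os.write(fd, b"HELLO")            # Write: change existing bytes in place
    os.lseek(fd, 0, os.SEEK_SET)
    content = os.read(fd, 100)        # Read: up to 100 bytes
    os.close(fd)                      # Close: the descriptor is no longer usable

    os.remove(path)                   # Delete
    still_there = os.path.exists(path)

print(content, still_there)
```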
File Types – Name, Extension
A common technique for implementing file types is to include the type as part of the file name.
The name is split into two parts: a name and an extension. The following table gives the file
type with its usual extension and function.

5.1 MEMORY MAPPED FILES:

• Two system calls, map and unmap, are used.

• The map call is given a file name and a virtual address, which causes the operating system to
map the file into the address space at that virtual address.

• File mapping works best in a system that supports segmentation. In such a system, each
file can be mapped onto its own segment so that byte k in the file is also byte k in the
segment.

• An example of file mapping is shown below.



Figure: (a) A segmented process before mapping files into its address space. (b) The process
after mapping an existing file abc into one segment and creating a new segment for file xyz.

• Here a process can copy the source segment into the destination segment using an
ordinary copy loop. No read or write system calls are needed. It can then execute the unmap
system call to remove the files from the address space and exit.

Advantage: eliminates the need for explicit I/O, thus making programming easier.

Disadvantages:
• It is hard for the system to know the exact length of the output file. It can tell which is
the highest page written, but it has no way of knowing how many bytes in that page were written.
• If a file is mapped in by one process and opened for conventional reading by another, the
system has to take great care to make sure the two processes do not see inconsistent
versions of the file.
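
The copy described above can be approximated on a paged system with Python's mmap module. This is a sketch under two assumptions: the source file is not empty (an empty file cannot be mapped), and the output length is set in advance with truncate, which sidesteps the file-length problem mentioned in the first disadvantage. The helper name mmap_copy is hypothetical; abc and xyz echo the file names in the figure.

```python
import mmap
import os
import tempfile


def mmap_copy(src_path: str, dst_path: str) -> None:
    """Copy src to dst through two memory mappings instead of read/write calls."""
    size = os.path.getsize(src_path)
    with open(src_path, "rb") as src, open(dst_path, "wb+") as dst:
        dst.truncate(size)  # the output length must be fixed before mapping
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as s, \
             mmap.mmap(dst.fileno(), 0, access=mmap.ACCESS_WRITE) as t:
            t[:] = s[:]     # ordinary memory copy between the two mappings
            t.flush()       # push the modified pages back to the file


with tempfile.TemporaryDirectory() as d:
    src_path = os.path.join(d, "abc")
    dst_path = os.path.join(d, "xyz")
    with open(src_path, "wb") as f:
        f.write(b"file mapping " * 1000)
    mmap_copy(src_path, dst_path)
    with open(src_path, "rb") as a, open(dst_path, "rb") as b:
        copied_ok = a.read() == b.read()
print(copied_ok)
```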

5.2 File Access Mechanisms

File access mechanism refers to the manner in which the records of a file may be accessed.
There are several ways to access files −

• Sequential access
• Direct/Random access
• Indexed sequential access

Sequential access
In sequential access, the records are accessed in some sequence, i.e., the
information in the file is processed in order, one record after the other. This access
method is the most primitive one. Skipping some bytes or reading out of order is not
allowed.

• Sequential access was convenient when the storage medium was magnetic tape rather than disk.

Example: Compilers usually access files in this fashion.

Direct/Random access
• Random access file organization provides direct access to the records.

• Each record has its own address in the file, by means of which it can be directly
accessed for reading or writing.

• The records need not be in any sequence within the file, and they need not be in adjacent
locations on the storage medium. It is possible to read the bytes or records of a file out of
order, or to access records by key rather than by position. Files whose bytes or records
can be read in any order are called random access files.
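
The difference between the two access methods is easy to see in code. In this sketch an in-memory io.BytesIO object stands in for a file on disk, and the 8-byte record size is a hypothetical value chosen for the example.

```python
import io

RECORD_SIZE = 8  # hypothetical fixed record length in bytes


def read_sequential(f):
    """Yield every record in order, one after the other."""
    f.seek(0)
    while True:
        rec = f.read(RECORD_SIZE)
        if len(rec) < RECORD_SIZE:
            break
        yield rec


def read_direct(f, n: int) -> bytes:
    """Jump straight to record n; its address is n * RECORD_SIZE."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE)


f = io.BytesIO(b"".join(b"rec%05d" % i for i in range(4)))
print(list(read_sequential(f)))
print(read_direct(f, 2))
```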

Indexed sequential access

• This mechanism is built on top of sequential access.

• An index is created for each file, containing pointers to its various blocks.
• The index is searched sequentially, and the pointer found is used to access the file directly.
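
A minimal sketch of indexed sequential access follows. It assumes a file of fixed-length records sorted by key, with the index holding the first key of each block; the record layout (a 4-digit key plus 4 bytes of data), the block size, and the function names are all hypothetical.

```python
import io

RECORD_SIZE = 8          # hypothetical: 4-digit key plus 4 bytes of data
RECORDS_PER_BLOCK = 3    # hypothetical block size


def build_index(f):
    """Return [(first_key_in_block, byte_offset)] for a file sorted by key."""
    index, offset = [], 0
    block_bytes = RECORD_SIZE * RECORDS_PER_BLOCK
    f.seek(0)
    while True:
        block = f.read(block_bytes)
        if not block:
            break
        index.append((int(block[:4]), offset))
        offset += len(block)
    return index


def lookup(f, index, key: int):
    """Scan the index in order, then read only the block that can hold key."""
    target = None
    for first_key, offset in index:      # sequential search of the index
        if first_key > key:
            break
        target = offset
    if target is None:
        return None
    f.seek(target)                       # direct access to the chosen block
    block = f.read(RECORD_SIZE * RECORDS_PER_BLOCK)
    for i in range(0, len(block), RECORD_SIZE):
        rec = block[i:i + RECORD_SIZE]
        if int(rec[:4]) == key:
            return rec[4:]
    return None


f = io.BytesIO(b"".join(b"%04dv%03d" % (k, k) for k in range(10, 80, 10)))
index = build_index(f)
print(index, lookup(f, index, 50))
```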

5.2.1 Space Allocation

The operating system allocates disk space to files. Operating systems use the following three
main ways to allocate disk space to files.

• Contiguous Allocation
• Linked Allocation
• Indexed Allocation

Contiguous Allocation

• Each file occupies a contiguous run of blocks on disk.

• Disk addresses are assigned in linear order.
• Easy to implement.
• External fragmentation is a major issue with this type of allocation technique.
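
External fragmentation can be demonstrated with a small first-fit allocator over a block bitmap. The allocator below is an illustrative sketch, not the policy of any specific file system; the ten-block disk and the function name allocate_contiguous are assumptions made for the example.

```python
def allocate_contiguous(bitmap, n_blocks):
    """First-fit search for n_blocks adjacent free blocks; returns the start or None.

    bitmap[i] is True when disk block i is in use. The list is updated in place.
    """
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_start, run_len = i + 1, 0
            continue
        run_len += 1
        if run_len == n_blocks:
            for j in range(run_start, run_start + n_blocks):
                bitmap[j] = True
            return run_start
    return None


# Ten-block disk with blocks 2, 5, and 8 in use: seven blocks are free,
# but no free run is longer than two blocks.
disk = [False] * 10
for b in (2, 5, 8):
    disk[b] = True

# A three-block file cannot be placed even though seven blocks are free.
print(allocate_contiguous(disk.copy(), 3))
```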

Linked Allocation

• Each file is kept as a linked list of disk blocks.



• The directory contains a link (pointer) to the first block of a file.

• No external fragmentation.
• Works well for sequential-access files.
• Inefficient for direct-access files.
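
The cost of direct access under linked allocation shows up when the chain has to be followed link by link. In this sketch a dictionary models the link stored in each disk block; the block numbers, the file name abc, and the END marker are made-up values.

```python
END = -1  # marker for "no next block" (an assumption of this sketch)

# next_block[b] models the link stored in disk block b.
next_block = {4: 7, 7: 2, 2: 10, 10: END}
directory = {"abc": 4}  # file name -> first block


def nth_block(name: str, n: int) -> int:
    """Return the disk block holding logical block n of the file, following links."""
    block = directory[name]
    for _ in range(n):            # n link traversals are needed to reach block n
        block = next_block[block]
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block


print([nth_block("abc", i) for i in range(4)])
```

Reading the blocks in order costs one link per block, whereas jumping to logical block n costs n traversals, which is why this scheme suits sequential access better than direct access.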

Indexed Allocation

• Provides solutions to the problems of contiguous and linked allocation.

• An index block is created that holds all the pointers to a file's blocks.
• Each file has its own index block, which stores the addresses of the disk space occupied by
the file.
• The directory contains the addresses of the index blocks of files.
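
With an index block, finding any logical block takes a single lookup. The block numbers and the file name jeep below are invented for illustration.

```python
# Hypothetical layout: the directory maps a file name to its index block,
# and the index block lists the disk blocks of the file in logical order.
index_blocks = {19: [9, 16, 1, 10, 25]}
directory = {"jeep": 19}


def nth_block(name: str, n: int) -> int:
    """One lookup in the index block gives the disk block for logical block n."""
    return index_blocks[directory[name]][n]


print(nth_block("jeep", 3))
```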

5.3 FILE SYSTEM IMPLEMENTATION

5.3.1 File System Layout:

• Sector 0 of the disk is called the MBR (Master Boot Record) and is used to boot the
computer. The end of the MBR contains the partition table. This table gives the starting
and ending addresses of each partition.
• The MBR program locates the active partition, reads in its first block, called the boot block,
and executes it.
• The program in the boot block loads the operating system contained in that partition.
• The layout of a disk partition varies from file system to file system. Often the file system
will contain some of the following items.

• The first one is the superblock. It contains all the key parameters about the file system,
such as the file system type, the number of blocks in the file system, and other key
administrative information, and is read into memory when the computer is booted.
• The free space management block records which blocks in the file system are free.
• The i-nodes, one per file, tell all about the file.
• The root directory contains the top of the file system tree.
• The remainder of the disk typically contains all the other directories and files.
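
The partition table at the end of the MBR can be decoded with a short script. The sketch below assumes the classic PC layout: four 16-byte entries starting at byte 446 and the signature bytes 0x55 0xAA at offset 510. Each entry stores the first sector and the sector count, so the ending address is derived from them. The sector parsed here is synthetic, built in memory rather than read from a real disk, and parse_mbr is a hypothetical helper name.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446  # the partition table sits just before the 2-byte signature
# Entry fields: status, CHS start, type, CHS end, first sector (LBA), sector count.
ENTRY = struct.Struct("<B3sB3sII")


def parse_mbr(sector: bytes):
    """Return (active, type, first_sector, last_sector) for each used table entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    parts = []
    for i in range(4):
        status, _, ptype, _, start, count = ENTRY.unpack_from(sector, TABLE_OFFSET + 16 * i)
        if ptype != 0 and count > 0:
            parts.append((status == 0x80, ptype, start, start + count - 1))
    return parts


# Build a synthetic sector with one active partition of 204800 sectors at sector 2048.
sector = bytearray(SECTOR_SIZE)
ENTRY.pack_into(sector, TABLE_OFFSET, 0x80, b"\x00" * 3, 0x83, b"\x00" * 3, 2048, 204800)
sector[510:512] = b"\x55\xaa"
partitions = parse_mbr(bytes(sector))
print(partitions)
```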

DIRECTORIES:

To keep track of files, file systems normally have directories or folders.

Single-Level Directory Systems:

• The simplest form of directory system is having one directory containing all the files.
Sometimes it is called the root directory. Examples: the CDC 6600 and early personal computers.
• The diagram below illustrates a single-level directory system.
• Advantages are simplicity and the ability to locate files quickly.

Disadvantage: Different users may accidentally use the same names for their files.

Two-level Directory Systems

To overcome the disadvantage of a single-level directory system, each user is given a
private directory.

A two-level directory system is illustrated below.

• This design could be used, for example, on a multiuser computer or on a simple network
of personal computers that share a common file server over a local area network.

• When a user tries to open a file, the system must know which user it is in order to know which
directory to search. As a consequence, some kind of login procedure is needed.

• Here users can only access files in their own directories. However, a slight extension is to
allow users to access other users’ files by providing some indication of whose file is to be
opened.

Advantage: eliminates name conflict problem in case of multiple users.

Disadvantage: Not efficient in presence of large number of files.

Hierarchical Directory Systems:

• To manage many files, a hierarchical structure, i.e. a tree of directories, is preferred, as
shown in the diagram below.
• Here each user can have as many directories as are needed.

• Users can create an arbitrary number of subdirectories.

• This scheme gives users a powerful structuring tool for organizing their work.

PATH NAMES

Two different methods are commonly used for specifying file names
in a hierarchical directory system.

• In the first method, each file is given an absolute path name consisting of the path from
the root directory to the file. Example: /usr/ast/mailbox
• Absolute path names always start at the root directory.
• The second method is the relative path name.
• A user can designate one directory as the current working directory, and any path name that
does not begin at the root directory is taken relative to the working directory.

• But if a file has to be accessed regardless of the working directory, then its absolute path
name must be specified.

• Most operating systems that support a hierarchical directory system have two special
entries in every directory, “.” and “..”. Dot refers to the current directory; dotdot refers to
its parent.
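
Path resolution can be sketched with Python's posixpath module. The function name resolve is hypothetical, and the resolution here is purely textual: it does not consult the file system, so symbolic links are ignored.

```python
import posixpath


def resolve(cwd: str, path: str) -> str:
    """Turn path into an absolute path, interpreting relative names against cwd."""
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)
    return posixpath.normpath(path)   # collapses "." and ".." components


print(resolve("/usr/ast", "mailbox"))
print(resolve("/usr/ast", "../lib/./dict"))
print(resolve("/usr/ast", "/etc/passwd"))
```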
DIRECTORY OPERATIONS:
The allowed system calls for managing directories are:
1. Create 2. Delete 3. Opendir 4. Closedir 5. Readdir 6. Rename 7. Link 8. Unlink
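
Most of these calls have direct counterparts in Python's os module. The sketch below exercises them in a temporary directory; Link is left out because hard links are not supported on every file system, and the names work, a.txt, and b.txt are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    d = os.path.join(root, "work")
    os.mkdir(d)                                   # Create
    open(os.path.join(d, "a.txt"), "w").close()   # put one file in the directory
    os.rename(os.path.join(d, "a.txt"),
              os.path.join(d, "b.txt"))           # Rename
    with os.scandir(d) as it:                     # Opendir, Readdir, Closedir
        names = sorted(entry.name for entry in it)
    os.unlink(os.path.join(d, "b.txt"))           # Unlink
    os.rmdir(d)                                   # Delete (the directory must be empty)
    removed = not os.path.exists(d)

print(names, removed)
```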

5.4 BACKUP STRATEGIES:

• Backing up of files can be done on modern tapes.


• Modern tapes hold tens or sometimes even hundreds of gigabytes

Backups are done to

1. Recover from disaster – accidents like disk crash, fire, flood or some natural
catastrophe

2. Recover from stupidity – accidentally removing files.

A few issues in backing up data are:

• Should the entire file system be backed up or only part of it?

• Should files that were backed up before be backed up again? Backing up files that have not
changed since the last backup is wasteful, which leads to the idea of incremental dumps.
• Should the data be compressed before it is backed up, to save space?
• How should a backup of an active file system be made?
• Will backing up lead to nontechnical problems, such as security problems related to the data?

Strategies used for backup are:

1. Physical dump

2. Logical dump

• A physical dump starts at block 0 of the disk, writes all the disk blocks onto the output
tape in order, and stops when it has copied the last one.
• A logical dump starts at one or more specified directories and recursively dumps all files
and directories found there that have changed since some given base date.
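
The selection step of a logical, incremental dump can be sketched with os.walk. This simplified version only lists the files whose modification time is later than the base date; it does not write them to tape, and it does not record the directories leading to them. The function name logical_dump and the timestamps are assumptions made for the example.

```python
import os
import tempfile


def logical_dump(roots, base_time: float):
    """Return paths under roots modified after base_time (seconds since the epoch)."""
    selected = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if os.stat(path).st_mtime > base_time:
                    selected.append(path)
    return selected


with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, "sub"))
    old = os.path.join(d, "old.txt")
    new = os.path.join(d, "sub", "new.txt")
    for p in (old, new):
        open(p, "w").close()
    os.utime(old, (1000, 1000))   # last modified before the base date
    os.utime(new, (3000, 3000))   # last modified after the base date
    dumped = [os.path.basename(p) for p in logical_dump([d], base_time=2000)]
print(dumped)
```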
