Wa0024

UNIT 5

File System :
File Concepts
Access Methods
Directory Structures
Protection
Consistency Semantics
File System Structures
Allocation Methods
Free Space Management.

System Security :
Security Problems
Program Threats
System and Network Threats
User Authentication.
File Concepts
A file is a named collection of related information that is recorded on
secondary storage such as magnetic disks, magnetic tapes and optical disks.
In general, a file is a sequence of bits, bytes, lines or records whose meaning is
defined by the file's creator and user.

File Structure
A file's structure must follow a format that the
operating system can understand.

❑ A file has a certain defined structure according to its type.

❑ A text file is a sequence of characters organized into lines.

❑ A source file is a sequence of procedures and functions.

❑ An object file is a sequence of bytes organized into blocks that are
understandable by the system’s linker.

❑ An executable file is a series of code sections that the loader can bring into
memory and execute.
File Attributes
A file has a name given by its creator. The name does not
change unless a user with the necessary permission renames the file.
Files are named so that humans can refer to them easily.

The properties of a file vary from system to system; however, some of
them are common:

❖ Name – the symbolic name by which a human recognizes the file.
❖ Identifier – a unique tag, usually a number, that identifies the file within the file system;
it is not meant to be human readable.
❖ Type – information needed by systems that support different types of files.
❖ Size – the current size of the file in bytes, kilobytes, or words.
❖ Protection – access-control information that determines who can read, write,
execute, change, or delete the file.
❖ Time, Date, and User identification – kept for creation, last modification,
last access and so on.

The information about the file is kept in a directory structure which also
resides on the secondary storage.
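
As a rough illustration, the attributes listed above can be inspected from Python through os.stat(). This is only a sketch: the scratch file is created just for the demonstration, and the mapping from the attribute names in these notes to the st_* fields is an assumption about one common (POSIX-style) system, not something every file system provides in the same way.

```python
import os
import stat
import tempfile

# Create a small scratch file so there is something to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"12345")
    path = f.name

info = os.stat(path)

# Assumed mapping from the attribute list above to os.stat() fields.
attributes = {
    "name": os.path.basename(path),               # human-readable name
    "identifier": info.st_ino,                    # numeric tag (inode number on POSIX)
    "size": info.st_size,                         # current size in bytes
    "protection": stat.filemode(info.st_mode),    # e.g. "-rw-------"
    "last_modified": info.st_mtime,               # seconds since the epoch
    "owner": info.st_uid,                         # numeric user id
}

os.remove(path)  # clean up the scratch file
```

Note that the name comes from the path (the directory), while the other values come from the per-file record returned by os.stat().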
File Operations
A file is an abstract data type. The operating system performs the
following file operations using various system calls.

Create Files – There must be enough free space on the file system to hold the file,
and an entry for the new file must be made in a directory.

Read Files – The system call specifies the name of the file and where in memory the next
block of the file should be placed. The system keeps a read pointer to the location in
the file where the next read will take place, and updates it after each read.

Write Files – The system call specifies the file and the data to be written. A process
usually reads and writes through the same current-file-position pointer, which
saves space and reduces complexity.

Repositioning within a file – The current-file-position pointer is repositioned
to a given value. This need not involve any actual I/O and is also known as a file seek operation.

Deleting a file – The directory is searched for the file name. If the file is found,
the space it occupies is released and its directory entry is removed.

Truncating a file – Sometimes the user does not want to delete a file, but only to
remove information from it. Truncation changes the file-length attribute
while the other attributes remain unchanged.

Other file operations include appending to a file, renaming a file, and
creating a duplicate copy of a file.
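
The operations above map fairly directly onto low-level calls in Python's os module. The following is a minimal sketch, assuming a temporary scratch file (the path and contents are made up for the demonstration):

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Create and write: the OS makes a directory entry and allocates space.
fd = os.open(path, os.O_CREAT | os.O_RDWR)
os.write(fd, b"hello, file system")

# Reposition (seek): moves the current-file-position pointer, no data is transferred.
os.lseek(fd, 7, os.SEEK_SET)

# Read: starts at the pointer and advances it by the number of bytes read.
chunk = os.read(fd, 4)
position = os.lseek(fd, 0, os.SEEK_CUR)   # where the pointer is now

# Truncate: the length changes, the file itself and its name remain.
os.ftruncate(fd, 5)
size_after_truncate = os.fstat(fd).st_size

os.close(fd)

# Delete: the directory entry is removed and the space is released.
os.remove(path)
still_exists = os.path.exists(path)
```

Reading and writing share one position pointer per open file here, which is the arrangement described under Write Files.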

Open-File Table

✔ Without further support, most file operations would require searching a directory every time.

✔ To avoid these repeated searches, the OS provides an open() system call and keeps a
small table, called the open-file table, containing information about all open files.

✔ When a file operation is requested, the file is specified via an
index into this table, so no directory search is needed.

✔ When the file is no longer actively used, the process closes it and the OS removes its
entry from the open-file table.
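
The idea can be sketched with a toy model. Everything here is hypothetical (the class name, the dictionary standing in for a directory, the searches counter); it only shows that the directory is consulted once, at open() time, and that later operations go through the returned index.

```python
class OpenFileTable:
    """Toy open-file table: the directory is searched only in open()."""

    def __init__(self, directory):
        self.directory = directory   # file name -> contents (stand-in for a directory)
        self.entries = {}            # index -> {"data": ..., "pos": ...}
        self.next_index = 0
        self.searches = 0            # number of directory searches performed

    def open(self, name):
        self.searches += 1           # the one and only directory search
        data = self.directory[name]  # KeyError if the file does not exist
        index = self.next_index
        self.next_index += 1
        self.entries[index] = {"data": data, "pos": 0}
        return index

    def read(self, index, count):
        entry = self.entries[index]  # table lookup, no directory search
        chunk = entry["data"][entry["pos"]:entry["pos"] + count]
        entry["pos"] += len(chunk)   # advance the file-position pointer
        return chunk

    def close(self, index):
        del self.entries[index]      # the OS removes the table entry


table = OpenFileTable({"notes.txt": b"abcdefgh"})
handle = table.open("notes.txt")
first = table.read(handle, 3)
second = table.read(handle, 3)
table.close(handle)
```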
File Type : A common technique for implementing file types is to include the type as part of
the file name. The name is split into two parts, a name and an extension, usually separated by a
period character.

File type        Usual extension        Function

Executable       exe, com, bin          Ready-to-run machine-language program
Object           obj, o                 Compiled, machine language, not linked
Source Code      c, java, pas, asm, a   Source code in various languages
Batch            bat, sh                Commands to the command interpreter
Text             txt, doc               Textual data, documents
Word Processor   wp, tex, rtf, doc      Various word processor formats
Archive          arc, zip, tar          Related files grouped into one file, sometimes compressed
Multimedia       mpeg, mov, rm          Binary file containing audio/video information
Markup           xml, html, tex         Textual data and documents
Library          lib, a, so, dll        Libraries of routines for programmers
Print or View    gif, pdf, jpg          ASCII or binary file in a format for printing or viewing
File Access Methods
The information stored in a file must be accessed and read into memory.
There are several ways to access a file. Some systems provide only one
access method; other systems provide many, and you must choose
the right one for the application.

Sequential Access Method


In this method, the information in the file is processed in order, one
record after another. For example, compilers and editors usually access files in
this manner.
The read-next operation reads the next portion of the file and advances the file
pointer, which tracks the I/O location. Similarly, write-next appends to
the end of the file and advances the pointer to the new end of the file.
Direct Access Method
The other method of file access is direct access, also called relative access. For
direct access, the file is viewed as a numbered sequence of blocks or records.
This method is based on the disk model of a file, since disks allow random
access to any file block.
You can read block 34, then read block 45, and then write block 78; there is no
restriction on the order of access to the file.
The direct access method is used in database management, where large amounts
of information must be reachable immediately. When a query arrives, the database
consults an index to find the number of the block that holds the answer.
That block is then read directly and the
information is retrieved.
Rather than read-next or write-next, the direct access method passes the
block number as a parameter to the read and write operations.
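
A minimal sketch of direct access, assuming fixed-size blocks: the block number is turned into a byte offset and the file pointer is moved there before the transfer. The names BLOCK_SIZE, read_block and write_block are made up for this example, and an in-memory io.BytesIO object stands in for a file on disk.

```python
import io

BLOCK_SIZE = 4  # assumed block size in bytes, kept tiny for the demo


def read_block(f, n):
    """Read block number n (counting from 0) without reading earlier blocks."""
    f.seek(n * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def write_block(f, n, data):
    """Overwrite block number n with exactly one block of data."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    f.seek(n * BLOCK_SIZE)
    f.write(data)


f = io.BytesIO(b"AAAABBBBCCCCDDDD")   # four blocks: 0, 1, 2, 3
third = read_block(f, 2)              # jump straight to block 2
write_block(f, 0, b"ZZZZ")            # then go back and rewrite block 0
first = read_block(f, 0)
```

The blocks are visited in the order 2, 0, 0, which a purely sequential interface would not allow without rewinding.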

Figure: Direct access method using an index.
FILE DIRECTORIES
A file directory is a collection of files. The directory contains information
about the files, including attributes, location and ownership. Much of this
information, especially the part concerned with storage, is managed by the
operating system. The directory is itself a file, accessible by various file
management routines.
The directory can be viewed as a symbol table that translates file names
into their directory entries.

The information contained in a device directory is:


1. Name
2. Type
3. Address
4. Current length
5. Maximum length
6. Date last accessed
7. Date last updated
8. Owner id
9. Protection information
The operations performed on a directory are:
1. Search for a file
2. Create a file
3. Delete a file
4. List a directory
5. Rename a file
6. Traverse the file system

Advantages of maintaining directories are:

Efficiency: A file can be located more quickly.
Naming: It is convenient for users, as two users can use the same name
for different files, or different names for the same file.
Grouping: Files can be grouped logically by their properties, e.g. all Java
programs, all games etc.
SINGLE-LEVEL DIRECTORY

Here a single directory is maintained for all users.

Naming problem: Two users cannot use the same name for different files, because every
name in the single directory must be unique.
Grouping problem: Users cannot group files according to their needs.

TWO-LEVEL DIRECTORY

Here a separate directory is maintained for each user. Because there are two levels,
every file has a path name that is used to locate it. Different users can now have
files with the same name. Searching is efficient, since only one user's directory has to be searched.

Path names can be of two types: absolute path names and relative path
names.
An absolute path name begins at the root and follows a path down to the
specified file, giving the directory names on the path.

A relative path name defines a path from the current directory.


TREE-STRUCTURED DIRECTORY
The directory is maintained in the form of a tree. Searching is efficient and
files can be grouped. A file can be named by an absolute or a relative
path name.
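
Path-name resolution in a tree-structured directory can be sketched as follows. Nested dictionaries stand in for directories and strings stand in for files; the directory and file names, and the resolve function itself, are made up for illustration.

```python
# Nested dicts are directories, strings are file contents (all names hypothetical).
root = {
    "home": {
        "alice": {"notes.txt": "alice's notes"},
        "bob": {"notes.txt": "bob's notes"},
    },
}


def resolve(path, cwd=()):
    """Return the entry named by path; cwd is a tuple of directory names."""
    if path.startswith("/"):
        parts = []                 # absolute path: start at the root
    else:
        parts = list(cwd)          # relative path: start at the current directory
    parts += [p for p in path.split("/") if p]

    node = root
    for name in parts:
        node = node[name]          # KeyError if a component does not exist
    return node


absolute = resolve("/home/alice/notes.txt")
relative = resolve("notes.txt", cwd=("home", "bob"))
```

Both users own a file called notes.txt, yet the two path names lead to different files, which is the naming benefit described above.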
Consistency Semantics
Consistency semantics are an important criterion for evaluating any file system that supports file sharing. They
characterize the system by specifying the semantics of multiple users accessing a shared
file simultaneously, in particular when changes made by one user become visible to the others.
The semantics are typically implemented as code within the file system.
• Logical file system

It manages metadata, i.e. all details
about a file except its actual contents.
It maintains file structure via file control blocks.
A file control block (FCB) holds information about a file: owner, size,
permissions, and location of the file contents.

Advantages :

Duplication of code is minimized.
Each file system can have its own logical file system.

Disadvantages :

If many files are accessed at the same time, performance is low.


File Allocation Methods
The allocation methods define how files are
stored in disk blocks. There are three main
disk space or file allocation methods:

Contiguous Allocation
Linked Allocation
Indexed Allocation

The main goals of these methods are:

Efficient disk space utilization.
Fast access to the file blocks.
Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk.
For example, if a file requires n blocks and is given block b as the starting
location, then the blocks assigned to the file are b, b+1,
b+2, ..., b+n-1. This means that given the starting block address and the
length of the file (in blocks), we can determine the blocks
occupied by the file.

The directory entry for a file with contiguous allocation contains

❖ Address of starting block
❖ Length of the allocated portion.

Advantages:
1. Both sequential and direct access are supported.
2. Access is extremely fast, since the number of seeks is minimal because of
the contiguous allocation of file blocks.

Disadvantages:
1. This method suffers from both internal and external fragmentation, which
makes it inefficient in terms of disk space utilization.
2. Increasing the file size is difficult because it depends on the availability of
contiguous free space on the disk at that instant.
Example:

The file ‘mail’ in the following figure starts at block 19 with
length = 6 blocks. Therefore, it occupies blocks 19, 20, 21, 22, 23 and 24.
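
The example can be checked with a few lines of code. This is a sketch only; the function names are made up, and logical blocks within the file are assumed to be counted from 0.

```python
def contiguous_blocks(start, length):
    """All disk blocks of a file stored at start, start+1, ..., start+length-1."""
    return list(range(start, start + length))


def physical_block(start, length, i):
    """Disk block that holds logical block i (counting from 0) of the file."""
    if not 0 <= i < length:
        raise IndexError("logical block lies outside the file")
    return start + i               # one addition, no pointers to follow


mail_blocks = contiguous_blocks(19, 6)
fourth = physical_block(19, 6, 3)  # logical block 3 of 'mail'
```

Because any block is found with a single addition, direct access is as cheap as sequential access under this scheme.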
Linked List Allocation
• In this scheme, each file is a linked list of disk blocks which need not
be contiguous.
• The disk blocks can be scattered anywhere on the disk.
• The directory entry contains a pointer to the starting and the ending file
block.
• Each block contains a pointer to the next block occupied by the file.

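Advantage :

o No external fragmentation.
o Works well for sequential-access files.
o There is no need to declare the file size when the file is created.

Disadvantage :

o Reliability can be a problem: if a pointer is lost or damaged, the rest of the file cannot be reached.
o Each block gives up some space to store the pointer.
o Inefficient for direct access, because reaching block i means following i pointers from the start of the file.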
Example :

The file ‘jeep’ in the following image shows how the blocks can be scattered
across the disk. The last block (25) contains -1, indicating a null pointer: it
does not point to any other block.
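
A small sketch of how a linked file is traversed. Only the final block (25, holding -1) is taken from the example above; the other block numbers in the chain are made up, and the dictionary next_block is a stand-in for the pointer stored inside each disk block.

```python
END = -1  # null pointer stored in the last block of a file

# next_block[b] is the pointer stored in disk block b.
# Only block 25 -> -1 comes from the example; the rest of the chain is hypothetical.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}


def file_blocks(start):
    """Follow the chain from the starting block and list every block of the file."""
    blocks = []
    b = start
    while b != END:
        blocks.append(b)
        b = next_block[b]
    return blocks


def nth_block(start, i):
    """Disk block holding logical block i (from 0): requires following i pointers."""
    b = start
    for _ in range(i):
        b = next_block[b]
        if b == END:
            raise IndexError("logical block lies outside the file")
    return b


chain = file_blocks(9)
fourth = nth_block(9, 3)
```

The loop in nth_block is the reason direct access is slow here: each pointer followed would cost a disk read on a real system.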
Indexed Allocation
* In this scheme, a special block known as the Index block contains the
pointers to all the blocks occupied by a file.
* Each file has its own index block.
* The ith entry in the index block contains the disk address of the ith file
block.
* The directory entry contains the address of the index block as shown in
the example.

Advantages :

❑ Solves the main problems of contiguous and linked allocation.
❑ Supports direct access.
❑ No external fragmentation.

Disadvantages :

❑ Wasted space: overhead of the index blocks.
❑ Reliability problem.
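
A minimal sketch of indexed allocation. All block numbers and the file name are hypothetical; a dictionary stands in for the disk, and a Python list stands in for the contents of the index block.

```python
# Hypothetical disk: block 19 is the index block of one file.
disk = {19: [9, 16, 1, 10, 25]}    # entry i = disk address of the file's i-th block

# The directory entry holds only the address of the index block.
directory = {"jeep": 19}


def physical_block(name, i):
    """Disk block holding logical block i (from 0) of the named file."""
    index_block = disk[directory[name]]   # one read of the index block
    return index_block[i]                 # then a direct lookup, no chain to follow


fourth = physical_block("jeep", 3)
```

Compared with linked allocation, any block is reachable after reading just the index block, at the price of one extra block per file.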
Free space management
The system keeps track of the free disk blocks so that it can
allocate space to files when they are created.
Free space management is also needed to reuse the space released when
files are deleted.
The system maintains a free space list, which keeps
track of the disk blocks that are not allocated to any
file or directory.
The free space list can be implemented mainly as:
1. Bitmap or Bit vector
2. Linked List
3. Grouping
4. Counting
Bitmap or Bit vector
❑ A Bitmap or Bit Vector is a series of bits where each bit
corresponds to a disk block.
❑ Each bit takes one of two values: 0 indicates that the block is
allocated and 1 indicates that it is free.

The disk blocks in Figure 1 (where green blocks
are allocated) can be represented by a 16-bit bitmap, with blocks numbered from 1, left to right:
0000111000000110.
Advantages :

• Simple to understand.
• Finding the first free block is efficient. It requires scanning the words (here,
groups of 8 bits) of the bitmap for a non-zero word. (A 0-valued word has all
bits 0.)
• The first free block is then found by scanning for the first 1 bit in the non-zero word.

The block number can be calculated as:

(number of bits per word) * (number of 0-valued words) + offset of the first
1 bit in the non-zero word.

For the Figure, we scan the bitmap sequentially for the first non-zero word.
The first group of 8 bits (00001110) is a non-zero word, since not all of its bits
are 0. Within this word we look for the first 1 bit.

This is the 5th bit of the word, so, counting bit positions from 1, offset = 5.
No 0-valued words come before it.

Therefore, the first free block number = 8*0+5 = 5.
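
The calculation can be sketched in code. The bitmap is kept as a string of '0' and '1' characters for readability (a real implementation would use machine words), 1 means free as in these notes, and blocks and bit offsets are counted from 1 to match the worked example. The function name is made up.

```python
WORD_BITS = 8  # bits per word, as in the worked example


def first_free_block(bitmap):
    """Return the number of the first free block (1 = free), or None if the disk is full."""
    words = [bitmap[i:i + WORD_BITS] for i in range(0, len(bitmap), WORD_BITS)]
    for zero_words, word in enumerate(words):
        if "1" in word:                       # first non-zero word
            offset = word.index("1") + 1      # position of the first 1 bit, from 1
            return WORD_BITS * zero_words + offset
    return None                               # every word was 0-valued


block = first_free_block("0000111000000110")
```

For the bitmap from the example, zero_words is 0 and offset is 5, giving 8*0+5 = 5.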


Linked List
❑ In this approach, the free disk blocks are linked together, i.e. each free block
contains a pointer to the next free block.
❑ The block number of the first free block is stored at a separate
location on disk and is also cached in memory.

In Figure-2, the free space list head points
to Block 5, which points to Block 6, the next
free block, and so on. The last free block
contains a null pointer indicating the
end of the free list.

A drawback of this method is the I/O
required to traverse the free space list.
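
A sketch of a linked free list. The head (Block 5) and its successor (Block 6) follow the description above; the remaining block numbers are assumptions, and the class and method names are made up. A dictionary stands in for the pointer stored inside each free block.

```python
END = -1  # null pointer marking the end of the free list


class FreeList:
    def __init__(self, head, next_free):
        self.head = head                    # first free block (cached in memory)
        self.next_free = dict(next_free)    # pointer stored in each free block

    def allocate(self):
        """Take the block at the head of the list."""
        if self.head == END:
            raise RuntimeError("no free blocks left")
        block = self.head
        self.head = self.next_free.pop(block)
        return block

    def release(self, block):
        """Return a block by linking it in at the head of the list."""
        self.next_free[block] = self.head
        self.head = block

    def count(self):
        """Counting free blocks means walking the whole list (one read per block on disk)."""
        n, b = 0, self.head
        while b != END:
            n += 1
            b = self.next_free[b]
        return n


# 5 -> 6 is from the text; 7, 14 and 15 are hypothetical.
free_list = FreeList(5, {5: 6, 6: 7, 7: 14, 14: 15, 15: END})
block = free_list.allocate()
remaining = free_list.count()
```

Allocation and release only touch the head, so they are cheap; it is operations like count that require the full traversal mentioned as a drawback.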
Grouping
The first free block stores the addresses of n free blocks.
Of these n blocks, the first n-1 are actually free and the last
block contains the addresses of the next n free blocks.

An advantage of this approach is that the addresses of a large group of free disk
blocks can be found quickly.

Counting
This approach stores the address of the first free disk block and the number
n of contiguous free disk blocks that follow it.
Every entry in the list contains:
1. Address of first free disk block
2. A number n

For example, in Figure-1, the first entry of the free space list would be:
([Address of Block 5], 2), because 2 contiguous free blocks follow block 5.
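
As a sketch, the counting entries can be derived from the bitmap used earlier (0000111000000110, 1 = free, blocks numbered from 1). The function name is made up, and plain block numbers stand in for block addresses.

```python
def counting_entries(bitmap):
    """Return (first free block, n) pairs, where n free blocks follow the first one."""
    entries = []
    i = 0
    while i < len(bitmap):
        if bitmap[i] == "1":
            j = i
            # Extend the run while the next block is also free.
            while j + 1 < len(bitmap) and bitmap[j + 1] == "1":
                j += 1
            entries.append((i + 1, j - i))   # block numbers start at 1
            i = j + 1
        else:
            i += 1
    return entries


entries = counting_entries("0000111000000110")
```

The first entry is (5, 2), matching the example; the list stays short whenever free blocks come in long contiguous runs.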
