Title of Project File Management: Shri H. H. J. B Polytechnic, CHANDWAD-423101 (Nashik)
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION
SHRI H. H. J. B POLYTECHNIC,
CHANDWAD-423101 (Nashik)
MICRO PROJECT
Academic year 2023-24
TITLE OF PROJECT
File management
Teacher Evaluation Sheet
Name of Student: LUNKAD ARPITA RAHUL
Enrolment No: 23651020313
Name of Program: Computer Technology    Semester: 2K
Course Title: Linux Basics    Code: 312001
Title of the Micro Project: File Management
Sr. No. | Characteristic to be assessed | Poor (Marks 1-3) | Average (Marks 4-5) | Good (Marks 6-8) | Excellent (Marks 9-10)
(A) Process and Product Assessment (Convert above total marks out of 6 marks)
1 | Relevance to the Course | | | |
2 | Literature Survey / Information Collection | | | |
3 | Completion of the Target as per project proposal | | | |
4 | Analysis of data and representation | | | |
5 | Quality of Prototype / Model | | | |
6 | Report Preparation | | | |
(B) Individual Presentation / Viva (Convert above total marks out of 4 marks)
8 | Presentation | | | |
9 | Viva | | | |
Micro-Project Evaluation Sheet:
Student | Process Assessment: Part A, Project Proposal (2 marks) | Process Assessment: Project Methodology (2 marks) | Product Assessment: Part B, Project Report / Working Model (2 marks) | Product Assessment: Individual Presentation / Viva (4 marks) | Total Marks (10)
LUNKAD ARPITA RAHUL | | | | |
Teacher Evaluation Sheet
Name of Student: MAHAJAN NIRJARA VIKAS
Enrolment No: 23651020314
Name of Program: Computer Technology    Semester: 2K
Course Title: Linux Basics    Code: 312001
Title of the Micro Project: File Management
Course Outcomes Achieved:-
CO1 - Install Linux operating system.
CO2 - Execute general purpose commands of the Linux operating system.
CO3 - Manage files and directories in Linux operating system.
CO4 - Use vi editor in Linux operating system.
CO5 - Write programs using shell script
MAHAJAN NIRJARA VIKAS
Comments / Suggestions about team work / leadership / inter – personal communication (if any)
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION
CERTIFICATE
This is to certify 1) LUNKAD ARPITA RAHUL
Place: CHANDWAD
Date: / /2024
MAHARASHTRA STATE BOARD OF TECHNICAL EDUCATION
CERTIFICATE
This is to certify 1) Nirjara Vikas Mahajan .
Place: CHANDWAD
Date: / /2024
INDEX
Part A
1.0 Brief Introduction
2.0 Aim of Micro Project
3.0 Action Plan
4.0 Resources Required
Part B
1.0 Brief Description
2.0 Aim of Micro Project
3.0 Course Outcome Integrated
4.0 Actual Procedure Followed
5.0 Actual Resource Used
6.0 Outputs of the Micro-projects
7.0 Skill Developed
8.0 Applications of this Microproject
PART A-Plan
3.0 Proposed Methodology
The proposed methodology for file management in Linux basics includes understanding the file
system structure, navigating directories, mastering file manipulation commands, managing permissions,
utilizing archiving and compression tools, implementing backup strategies, automating tasks, monitoring disk
usage, and continuous learning and documentation.
4.0 Action Plan
File management in Linux involves organizing, accessing, and manipulating files and directories
using command-line tools, navigating the file system structure, managing permissions and ownership, and
using text editors to edit files.
File management in Linux basics aims to organize files efficiently, optimize resource usage,
enhance security, enable backup and recovery, facilitate collaboration, automate tasks, ensure
compliance, and promote learning and skill development.
The actual methodology for file management in Linux basics involves understanding the file system
structure, navigating directories, creating/deleting/copying/moving files, viewing/editing files, setting
permissions/ownership, archiving/compressing files, implementing backup strategies, automating tasks with
shell scripting, monitoring disk usage, and continuous learning through documentation and online resources.
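The basic operations named above (creating, copying, moving and deleting files, and setting permissions) can be sketched in a few lines. This is only an illustrative sketch: it uses Python's standard library instead of the shell, the directory and file names are made up, and the comments show the roughly equivalent Linux command.

```python
import shutil
import stat
import tempfile
from pathlib import Path


def demo_file_management(base: Path) -> dict:
    """Create, copy, move, restrict and delete files under `base`."""
    docs = base / "docs"
    docs.mkdir()                                   # like: mkdir docs
    note = docs / "note.txt"
    note.write_text("hello linux\n")               # like: echo "hello linux" > docs/note.txt

    copy = Path(shutil.copy(note, docs / "note_copy.txt"))              # like: cp
    moved = Path(shutil.move(str(copy), str(base / "note_moved.txt")))  # like: mv

    note.chmod(0o640)                              # like: chmod 640 docs/note.txt
    mode = stat.S_IMODE(note.stat().st_mode)       # permission bits only

    moved.unlink()                                 # like: rm note_moved.txt
    return {
        "entries": sorted(p.name for p in base.rglob("*")),
        "mode": oct(mode),
    }


# Work inside a throwaway directory so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    result = demo_file_management(Path(tmp))
    print(result)
```

Running everything inside a temporary directory keeps the experiment safe to repeat.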
6.0 Files:
Concepts:
• A file is a named collection of related information that is recorded on secondary storage such
as magnetic disks, magnetic tapes and optical disks.
• In general, a file is a sequence of bits, bytes, lines or records whose meaning is defined by
the file's creator and user.
Attributes of a File
Following are some of the attributes of a file:
• Name. It is the only information which is kept in human-readable form.
• Identifier. The file is identified by a unique tag (number) within the file system.
• Type. It is needed for systems that support different types of files.
• Location. A pointer to the file's location on the device.
• Size. The current size of the file.
• Protection. This controls who may read, write, or execute the file.
• Time, date, and user identification. This is the data used for protection,
security, and usage monitoring.
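On Linux most of these attributes can be inspected from a program. The sketch below is a minimal illustration that assumes Python's `os.stat` is an acceptable stand-in for commands such as `ls -l` or `stat`; the helper name `describe` and the temporary file are made up for the example.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect the attributes listed above for one file."""
    info = os.stat(path)
    if stat.S_ISREG(info.st_mode):
        kind = "regular file"
    elif stat.S_ISDIR(info.st_mode):
        kind = "directory"
    else:
        kind = "other"
    return {
        "name": os.path.basename(path),              # human-readable name
        "identifier": info.st_ino,                   # inode number within the file system
        "type": kind,
        "size": info.st_size,                        # current size in bytes
        "protection": stat.filemode(info.st_mode),   # e.g. -rw-r--r--
        "owner_uid": info.st_uid,                    # user identification
        "modified": time.ctime(info.st_mtime),       # time and date of last change
    }


# Describe a small temporary file so the example leaves nothing behind.
with tempfile.NamedTemporaryFile(suffix=".txt") as tmp:
    tmp.write(b"hello")
    tmp.flush()
    attrs = describe(tmp.name)
print(attrs)
```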
File Operations
The operating system must do the following to perform the basic file operations listed below.
• Creating a file: Two steps are necessary to create a file. First, space in the file system must
be found for the file. Second, an entry for the new file must be made in the directory.
• Writing a file: To write a file, we make a system call specifying both the name of the file and
the information to be written to the file. Given the name of the file, the system searches the
directory to find the file's location. The system must keep a write pointer to the location in the
file where the next write is to take place. The write pointer must be updated whenever a write
occurs.
• Reading a file: To read from a file, we use a system call that specifies the name of the file and
where (in memory) the next block of the file should be put. Again, the directory is searched for
the associated entry, and the system needs to keep a read pointer to the location in the file
where the next read is to take place. Once the read has taken place, the read pointer is updated.
• Repositioning within a file: The directory is searched for the appropriate entry, and the
current-file-position pointer is repositioned to a given value. Repositioning within a file need
not involve any actual I/O. This file operation is also known as a file seek.
• Deleting a file: To delete a file, we search the directory for the named
file. Having found the associated directory entry, we release all file
space, so that it can be reused by other files, and erase the directory entry.
• Protection: Access-control information determines who can do reading, writing, executing, and
so on.
• Truncating a file: The user may want to erase the contents of a file but keep its attributes.
Rather than forcing the user to delete the file and then recreate it, this function allows all
attributes to remain unchanged, except for file length, but lets the file be reset to length zero
and its file space released.
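The operations above map closely onto the system calls that Linux exposes. The following is a minimal sketch using Python's thin wrappers in the `os` module; the file name `demo.dat` is made up. Note that on Linux an open file has a single current-file-position pointer that serves both reads and writes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")

    # Create: a directory entry is made and the file is opened for reading and writing.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

    # Write: the file-position pointer advances past the bytes just written.
    os.write(fd, b"0123456789")
    pos_after_write = os.lseek(fd, 0, os.SEEK_CUR)

    # Reposition (seek): only the pointer moves, no data is transferred.
    os.lseek(fd, 4, os.SEEK_SET)

    # Read: starts at the current position and advances it.
    chunk = os.read(fd, 3)

    # Truncate: the length becomes zero while the file itself is kept.
    os.ftruncate(fd, 0)
    size_after_truncate = os.fstat(fd).st_size
    os.close(fd)

    # Delete: the directory entry is erased and the space can be reused.
    os.unlink(path)
    still_exists = os.path.exists(path)

print(pos_after_write, chunk, size_after_truncate, still_exists)
```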
In brief
File Types
File System Structure
A file structure should follow a required format that the operating system can
understand.
• A file has a certain defined structure according to its type.
• A text file is a sequence of characters organized into lines.
• A source file is a sequence of procedures and functions.
• An object file is a sequence of bytes organized into blocks that are
understandable by the machine.
• When an operating system defines different file structures, it also contains the code
to support these file structures. Unix and MS-DOS support a minimal number of file
structures.
Files can be structured in several ways; three common structures are described here,
one by one.
File Structure 1
• Here, as you can see from figure 1, the file is an unstructured sequence of
bytes.
• Therefore, the OS doesn't care about what is in the file, as all it sees are bytes.
File Structure 2
• Figure 2 shows the second structure of a file,
where a file is a sequence of fixed-length records, each with some internal
structure.
• Central to the idea of a file being a sequence of records is the idea that a read
operation returns a record and a write operation just appends a record.
File Structure 3
• In the last structure of a file, which you can see in figure 3, a file basically
consists of a tree of records, not necessarily all the same length, each containing
a key field in a fixed position in the record. The tree is sorted on the key field, to
allow rapid searching for a specific key.
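The second structure (a sequence of fixed-length records) can be illustrated with a small sketch. The record layout below, a 4-byte number followed by a 12-byte name, is a made-up assumption for the example, and an in-memory buffer stands in for a real file.

```python
import io
import struct

# Hypothetical record layout: 4-byte id + 12-byte name = 16 bytes per record.
RECORD = struct.Struct("<I12s")


def append_record(f, rec_id: int, name: str) -> None:
    """Write operation: append one fixed-length record at the end of the file."""
    f.seek(0, io.SEEK_END)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))  # short names are zero-padded


def read_records(f):
    """Read operation: return one whole record at a time, in file order."""
    f.seek(0)
    while (raw := f.read(RECORD.size)):
        rec_id, name = RECORD.unpack(raw)
        yield rec_id, name.rstrip(b"\x00").decode("ascii")


f = io.BytesIO()                 # stands in for a file on disk
append_record(f, 1, "passwd")
append_record(f, 2, "hosts")
records = list(read_records(f))
print(records)
```

Because every record has the same size, the program never has to search for record boundaries.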
2. Direct Access
• Sometimes it is not necessary to process every record in a file.
• It is also not always necessary to process the records in the order in which they are stored. In all
such cases, direct access is used.
• The disk is a direct access device, which gives us the ability to access any file block at random.
• The file is viewed as a collection of physical blocks and the records of those blocks.
• Example: Databases are often of this type since they allow query processing that involves immediate
access to large amounts of information. All reservation systems fall into this category.
In brief:
• This method is useful for disks.
• There are no restrictions on which blocks are read or written; it can be done in any order.
• The user now says "read n" rather than "read next".
• "n" is a number relative to the beginning of the file, not relative to an absolute physical
disk location.
Advantages:
• A direct access file helps in online transaction processing (OLTP) systems such as an online
railway reservation system.
• In a direct access file, sorting of the records is not required.
• It accesses the desired records immediately.
• It updates several files quickly.
• It has better control over record allocation.
Disadvantages:
• A direct access file does not provide a backup facility.
• It is expensive.
• It has less storage space as compared to a sequential file.
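The "read n" idea described above can be sketched as follows. The block size of 8 bytes and the helper names are assumptions chosen only to keep the example small; real file systems use much larger blocks.

```python
import io

BLOCK_SIZE = 8  # hypothetical, tiny block size for the demonstration


def read_block(f, n: int) -> bytes:
    """'read n': fetch block n, counted from the beginning of the file."""
    f.seek(n * BLOCK_SIZE)          # jump straight to the block, no scanning
    return f.read(BLOCK_SIZE)


def write_block(f, n: int, data: bytes) -> None:
    """'write n': overwrite block n in place."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    f.seek(n * BLOCK_SIZE)
    f.write(data)


f = io.BytesIO(b"AAAAAAAA" + b"BBBBBBBB" + b"CCCCCCCC" + b"DDDDDDDD")
third = read_block(f, 2)            # blocks may be read in any order
first = read_block(f, 0)
write_block(f, 1, b"XXXXXXXX")
second = read_block(f, 1)
print(third, first, second)
```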
3. Indexed Sequential Access
• The indexed sequential access method is a modification of the direct access method.
• Basically, it is a combination of both sequential access and direct access.
• The main idea of this method is to first access the file directly and then access it sequentially.
• In this access method, it is necessary to maintain an index.
• The index is nothing but a pointer to a block.
• The index is accessed directly in order to reach a record in the file.
• The information obtained from this access is then used to access the file. Sometimes
the indexes are very big.
• To manage this, hierarchies of indexes are built, in which one direct access of an index
leads to the information needed for another index access.
• It is built on top of sequential access.
• It uses an index to control the pointer while accessing files.
Advantages:
• In an indexed sequential access file, both sequential and random file access are possible.
• It accesses the records very fast if the index table is properly organized.
• Records can be inserted in the middle of the file.
• It provides quick access for sequential and direct processing.
• It reduces the degree of sequential search.
Disadvantages:
• An indexed sequential access file requires unique keys and periodic reorganization.
• An indexed sequential access file takes a longer time to search the index for data access or
retrieval.
• It requires more storage space.
• It is expensive because it requires special software.
• It is less efficient in the use of storage space as compared to other file organizations.
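The two-step idea (go directly to a block through the index, then search that block sequentially) can be sketched as below. The keys, values and block grouping are invented for the illustration, and a simple sorted list of first keys stands in for the index.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted by key and grouped into blocks.
blocks = [
    [(3, "ant"), (7, "bee"), (9, "cat")],
    [(12, "dog"), (15, "eel"), (18, "fox")],
    [(21, "gnu"), (25, "hen"), (30, "ibis")],
]

# Index: the first key stored in each block, in block order.
index = [block[0][0] for block in blocks]


def lookup(key: int):
    """Return the value stored under `key`, or None if it is absent."""
    # Step 1 (direct): the index selects the only block that can hold the key.
    pos = bisect_right(index, key) - 1
    if pos < 0:
        return None
    # Step 2 (sequential): scan the records of that one block in order.
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None


print(lookup(15), lookup(3), lookup(10))
```

Only one block is scanned per lookup, which is why a well-organized index keeps access fast.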
Swapping:
• Swapping is a mechanism in which a process can be temporarily swapped (moved) out of main memory
to secondary storage (disk), making that memory available to other processes.
• At some later time, the system swaps the process back from secondary storage into main
memory.
• Though performance is usually affected by swapping, it helps in running multiple
large processes in parallel, and that is the reason swapping is also known as a technique for memory
compaction.
• Swap space is space on the hard disk that serves as a substitute for physical memory.
• It is used as virtual memory, which contains the process memory image.
• Whenever the computer runs short of physical memory, it uses its virtual memory and stores
information from memory on disk.
1. Contiguous Allocation
In this scheme, each file occupies a contiguous set of blocks on the disk. For example, if a file
requires n blocks and is given block b as the starting location, then the blocks assigned to the file will
be: b, b+1, b+2, ..., b+n-1.
• This means that given the starting block address and the length of the file (in terms of blocks
required), we can determine the blocks occupied by the file.
• The directory entry for a file with contiguous allocation contains
1. Address of starting block
2. Length of the allocated portion.
The file ‘mail’ in the following figure starts from block 19 with length = 6 blocks. Therefore, it
occupies blocks 19, 20, 21, 22, 23, 24.
• Each file occupies a contiguous address space on disk.
• Assigned disk addresses are in linear order.
• Easy to implement.
• External fragmentation is a major issue with this type of allocation technique.
Advantages:
• Both sequential and direct access are supported. For direct access, the address of
the kth block of a file which starts at block b can easily be obtained as (b+k).
• This is extremely fast since the number of seeks is minimal because of the contiguous allocation
of file blocks.
Disadvantages:
• This method suffers from both internal and external fragmentation. This makes it inefficient
in terms of memory utilization.
• Increasing the file size is difficult because it depends on the availability of contiguous
memory at a particular instant.
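The arithmetic behind contiguous allocation is simple enough to check in code. The sketch below reuses the ‘mail’ example (start block 19, length 6); the free-block map and the first-fit search are assumptions added only to show how external fragmentation can block an allocation.

```python
def contiguous_blocks(start: int, length: int) -> list[int]:
    """Blocks b, b+1, ..., b+n-1 occupied by a file."""
    return list(range(start, start + length))


def kth_block(start: int, length: int, k: int) -> int:
    """Direct access: the k-th block (counting from 0) lives at b + k."""
    if not 0 <= k < length:
        raise IndexError("block number outside the file")
    return start + k


def first_fit(free: list[bool], n: int) -> int:
    """Return the start of the first run of n free blocks, or -1 if none exists."""
    run = 0
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == n:
            return i - n + 1
    return -1


mail = contiguous_blocks(19, 6)
third_block_of_mail = kth_block(19, 6, 2)

# Hypothetical free map: six blocks are free in total, but never four in a row.
free_map = [True, False, True, True, False, True, True, True]
fits_three = first_fit(free_map, 3)
fits_four = first_fit(free_map, 4)
print(mail, third_block_of_mail, fits_three, fits_four)
```

The request for four blocks fails even though six blocks are free, which is the external fragmentation problem in miniature.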
2. Linked Allocation
• In this scheme, each file is a linked list of disk blocks which need not be
contiguous.
• The disk blocks can be scattered anywhere on the disk.
• The directory entry contains a pointer to the starting and the ending file block.
• Each block contains a pointer to the next block occupied by the file.
The file ‘jeep’ in the following image shows how the blocks are randomly distributed. The last block (25)
contains -1, indicating a null pointer, and does not point to any other block.
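Following the chain of pointers can be sketched as below. Only the final block (25) holding -1 comes from the text; the other block numbers and the block contents are made up for the illustration.

```python
# Hypothetical disk: block number -> (data, pointer to next block).
# A pointer of -1 marks the last block of the file.
disk = {
    9: ("J", 16),
    16: ("E", 1),
    1: ("E", 10),
    10: ("P", 25),
    25: ("!", -1),
}

# The directory entry stores the first and the last block of the file.
directory = {"jeep": {"start": 9, "end": 25}}


def read_file(name: str):
    """Walk the linked list of blocks and collect the visited blocks and data."""
    block = directory[name]["start"]
    visited, data = [], []
    while block != -1:
        payload, next_block = disk[block]
        visited.append(block)
        data.append(payload)
        block = next_block
    return visited, "".join(data)


visited, contents = read_file("jeep")
print(visited, contents)
```

Reaching the last block requires visiting every block before it, which is why linked allocation does not support efficient direct access.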
3. Indexed Allocation
• Provides solutions to the problems of contiguous and linked allocation.
• An index block is created that holds all the pointers to a file's blocks.
• Each file has its own index block, which stores the addresses of the disk space occupied by
the file.
• The directory contains the addresses of the index blocks of files.
Advantages:
• This supports direct access to the blocks occupied by the file and therefore provides fast
access to the file blocks.
• It overcomes the problem of external fragmentation.
Disadvantages:
• The pointer overhead for indexed allocation is greater than for linked allocation.
• For very small files, say files that span only 2-3 blocks, indexed allocation would keep
one entire block (the index block) for the pointers, which is inefficient in terms of memory
utilization. However, in linked allocation we lose the space of only 1 pointer per block.
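A minimal sketch of the lookup path under indexed allocation follows. The file name, block numbers and block contents are all invented for the example.

```python
# Directory: file name -> number of the file's index block (hypothetical values).
directory = {"report": 19}

# The index block lists the addresses of the file's data blocks, in file order.
index_blocks = {19: [9, 16, 1, 10, 25]}

# Data blocks scattered over the disk.
data_blocks = {9: "fi", 16: "le", 1: " d", 10: "at", 25: "a!"}


def read_kth_block(name: str, k: int) -> str:
    """Direct access: one lookup in the index block gives the k-th data block."""
    index = index_blocks[directory[name]]
    return data_blocks[index[k]]


def read_whole_file(name: str) -> str:
    """Sequential access: visit the data blocks in index order."""
    index = index_blocks[directory[name]]
    return "".join(data_blocks[b] for b in index)


fourth = read_kth_block("report", 3)
whole = read_whole_file("report")
print(repr(fourth), repr(whole))
```

Unlike the linked scheme, the k-th block is found without touching the blocks before it.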
2. Single-level directory –
The single-level directory is the simplest directory structure.
In it, all files are contained in the same directory, which makes it easy to support and understand.
A single-level directory has a significant limitation, however, when the number of files
increases or when the system has more than one user.
Since all the files are in the same directory, they must have unique names. If two users
call their data set test, then the unique-name rule is violated.
Advantages:
Since it is a single directory, its implementation is very easy.
If the files are small in size, searching will be faster.
Operations like file creation, searching, deletion, and updating are very easy in such a
directory structure.
Disadvantages:
There is a chance of name collision because two files cannot have the same name.
Searching becomes time-consuming if the directory is large.
Files of the same type cannot be grouped together.
3. Two-level directory –
As a single-level directory often leads to confusion of file names among different users,
the solution to this problem is to create a separate directory for each user.
In the two-level directory structure, each user has their own user file directory (UFD).
The UFDs have similar structures, but each lists only the files of a single user. The system's master
file directory (MFD) is searched whenever a user logs in.
The MFD is indexed by username or account number, and each entry points to the UFD for that user.
Advantages:
We can give a full path like /User-name/directory-name/.
Different users can have the same directory name as well as the same file name.
Searching for files becomes easier due to path names and user grouping.
Disadvantages:
A user is not allowed to share files with other users.
It is still not very scalable; two files of the same type cannot be grouped together for the same
user.
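The MFD-to-UFD lookup can be modelled with nested dictionaries. This is only an illustrative sketch: the user names, file names and contents are made up, and a path is assumed to have exactly the form /user/file.

```python
# Master file directory (MFD): user name -> that user's file directory (UFD).
master_file_directory = {
    "user1": {"test": "data set of user1"},
    "user2": {"test": "data set of user2"},   # same file name, no collision
}


def resolve(path: str) -> str:
    """Resolve a two-level path of the form /user/file."""
    user, filename = path.strip("/").split("/")
    ufd = master_file_directory[user]          # first level: find the user's UFD
    return ufd[filename]                       # second level: find the file


a = resolve("/user1/test")
b = resolve("/user2/test")
print(a, "|", b)
```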
4. Tree-structured directory –
Once we have seen a two-level directory as a tree of height 2, the natural
generalization is to extend the directory structure to a tree of arbitrary height.
This generalization allows users to create their own subdirectories and to organize their
files accordingly.
A tree structure is the most common directory structure. The tree has a root directory, and
every file in the system has a unique path.
Advantages:
Very general, since a full path name can be given.
Very scalable; the probability of name collision is low.
Searching becomes very easy; we can use both absolute and relative paths.
Disadvantages:
Not every file fits into the hierarchical model; files may need to be saved into multiple
directories.
We cannot share files.
It is inefficient, because accessing a file may require traversing multiple directories.
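Absolute and relative path names in a tree-structured directory can be explored without touching the disk. The sketch below uses Python's `pathlib.PurePosixPath`; the directory names are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical current working directory and a path relative to it.
cwd = PurePosixPath("/home/student/project")
relative = PurePosixPath("docs/report.txt")

# An absolute path names the file starting from the root directory "/".
absolute = cwd / relative

# Each component after the root is one level of the directory tree.
levels = absolute.parts[1:]

# The same file can be named relative to any directory above it.
from_home = absolute.relative_to("/home/student")

print(absolute, levels, from_home)
```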
Disk Organization:
A disk is a storage device whose physical structure looks like this:
Basically, a hard disk can be divided logically into the following five areas:
MBR (Master Boot Record)
DBR (DOS Boot Record)
FAT (File Allocation Tables)
Root Directory
Data Area
1. The Master Boot Record (or MBR)
• At the beginning of the hard drive is the MBR. When your computer starts using your hard
drive, this is where it looks first.
• The MBR itself has a specific organization. The size of the MBR is 512 bytes.
• The boot loader is the first 446 bytes of the MBR. This section contains the executable
boot code.
• The partition table has 4 slots of 16 bytes each, each containing the description of a partition
(primary or extended) on the disk.
Here is how a partition is described:
• State of the partition (inactive or bootable) - (1 byte)
• Head at the beginning of the partition - (1 byte)
• Cylinder and sector at the beginning of the partition - (2 bytes)
• Type of partition (file system, e.g. FAT32, ext2 etc.) - (1 byte)
• Head at the end of the partition - (1 byte)
• Cylinder and sector at the end of the partition - (2 bytes)
• Number of sectors between the MBR and the first sector of the partition - (4 bytes)
• Number of sectors in the partition - (4 bytes)
The Magic Number is two bytes used to determine whether the hard disk has a boot loader or not. If it does,
the magic number should be equal in value to hexadecimal 55AA.
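The layout described above (446 bytes of boot code, four 16-byte partition entries, and the two-byte magic number) adds up to exactly 512 bytes and can be decoded with a short program. The sketch below builds a synthetic MBR in memory with made-up partition values instead of reading a real disk; the function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446
# One partition entry: state (1), start head/cylinder/sector (3), type (1),
# end head/cylinder/sector (3), first sector (4), number of sectors (4) = 16 bytes.
ENTRY = struct.Struct("<B3sB3sII")


def build_sample_mbr() -> bytes:
    """Build a synthetic MBR in memory (made-up values, not a real disk)."""
    # One bootable (0x80) partition of type 0x83, starting at sector 2048.
    entry = ENTRY.pack(0x80, b"\x00\x00\x00", 0x83, b"\x00\x00\x00", 2048, 204800)
    table = entry + bytes(16 * 3)                  # three unused slots
    return bytes(BOOT_CODE_SIZE) + table + b"\x55\xaa"


def parse_mbr(sector: bytes) -> list:
    """Check the magic number and decode the four partition slots."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for slot in range(4):
        offset = BOOT_CODE_SIZE + slot * ENTRY.size
        state, _start, ptype, _end, first, count = ENTRY.unpack_from(sector, offset)
        if ptype != 0:                             # type 0 marks an unused slot
            partitions.append({
                "slot": slot,
                "bootable": state == 0x80,
                "type": hex(ptype),
                "first_sector": first,
                "sectors": count,
            })
    return partitions


mbr = build_sample_mbr()
parts = parse_mbr(mbr)
print(parts)
```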
• Previously the root directory used to be fixed in size and located at a fixed position on disk
but now it is free to grow as necessary as it is now treated as a file.
RAID 0: This configuration has striping, but no redundancy of data. It offers the best performance, but no fault
tolerance.
RAID 1: Also known as disk mirroring, this configuration consists of at least two drives that duplicate the storage of data.
There is no striping. Read performance is improved since either disk can be read at the same time. Write performance is
the same as for single disk storage.
RAID 2: This configuration uses striping across disks, with some disks storing error checking and correcting (ECC)
information. It has no advantage over RAID 3 and is no longer used.
RAID 3: This technique uses striping and dedicates one drive to storing parity information. The embedded ECC
information is used to detect errors. Data recovery is accomplished by calculating the exclusive OR (XOR) of the
information recorded on the other drives. Since an I/O operation addresses all the drives at the same time, RAID 3 cannot
overlap I/O. For this reason, RAID 3 is best for single-user systems with long-record applications.
RAID 4: This level uses large stripes, which means you can read records from any single drive. This allows you to use
overlapped I/O for read operations. Since all write operations have to update the parity drive, no I/O overlapping is
possible for writes. RAID 4 offers no advantage over RAID 5.
RAID 5: This level is based on block-level striping with parity. The parity information is striped across each drive,
allowing the array to function even if one drive were to fail. The array's architecture allows read and write operations to
span multiple drives. This results in performance that is usually better than that of a single drive, but not as high as that
of a RAID 0 array. RAID 5 requires at least three disks, but it is often recommended to use at least five disks for
performance reasons.
RAID 6: This technique is similar to RAID 5, but includes a second parity scheme that is distributed across
the drives in the array. The use of additional parity allows the array to continue to function even if two disks
fail simultaneously. However, this extra protection comes at a cost. RAID 6 arrays have a higher cost per
gigabyte (GB) and often have slower write performance than RAID 5 arrays.
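The XOR-based recovery used by the parity levels above can be demonstrated on a few bytes. This is a simplified sketch: three made-up data blocks stand in for three drives, and one parity block stands in for the parity information.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical data blocks, one per drive.
d0, d1, d2 = b"LINU", b"X FI", b"LES!"

# The parity block is the XOR of all data blocks.
parity = xor_blocks(d0, d1, d2)

# If the drive holding d1 fails, XOR-ing the survivors with the parity rebuilds it.
rebuilt_d1 = xor_blocks(d0, d2, parity)
print(rebuilt_d1)
```

The same calculation rebuilds whichever single block is lost, because XOR-ing a value twice cancels it out.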
The output of a micro-project on file management in Linux basics includes creating a well-organized
directory structure, performing file operations like copying and moving files, setting permissions and
ownership, editing text files, archiving and compressing files, implementing basic backup strategies,
documenting the process, and presenting or demonstrating the completed tasks.
10.0 CONCLUSION:
Mastering file management in Linux basics is crucial for efficient operation within a Linux
environment, leading to enhanced productivity and security across various fields such as system
administration, software development, data analysis, and cybersecurity.
THANK YOU !
27
28
29
30
21
.
22
23