Study Module 2

The document provides an overview of Module 2 on data storage management. It covers 3 study sessions: [1] data storage devices such as floppy disks, hard disks, removable hard disks, magnetic tape, optical storage, USB flash drives, SD cards, solid state drives, and cloud storage; [2] file organization techniques; and [3] traditional file system processing. Study Session 1 describes different data storage devices, how data is accessed from disks, and disk performance measures like latency, throughput, and seek time.

Uploaded by

canal abdul

Module 2:

Data Storage Management

Study Session 1: Data Storage Devices

Study Session 2: File Organization

Study Session 3: Traditional File System Processing

Study Session 1:

Learning Outcome:

At the end of this session, you should be able to:

• Identify and describe data storage devices

Data Storage Devices

Diskette

A diskette, or floppy disk, is a data storage medium composed of a thin, flexible
disk of magnetic material encased in a square or rectangular plastic shell. Floppy
disks are read and written by a floppy disk drive (FDD), an abbreviation that should
not be confused with "fixed disk drive," another term for a hard disk drive. Floppy
disks exist in the following sizes: 8-inch (200 mm), 5¼-inch (133 mm), and the most
common, 3½-inch (90 mm). Though floppy disks are still used in some data processing
environments, they have largely been superseded by flash and optical storage
devices. Some users also consider email a convenient way of exchanging small to
medium-sized digital files.

Floppy optical ("floptical") drives combine magnetic and optical technologies to
store about 21 MB of data on media similar to 3½-inch floppy disks. The name is a
portmanteau of the words 'floppy' and 'optical.' The device was introduced in 1989
by Insite Peripherals of San Jose but did not become popular because of its limited
storage capacity. Similar technology was used in the Laser Servo-120 (LS-120) drive,
introduced in 1996 with 120 MB capacity.

Figure 1: Floppy Disk

Hard disks

In a data processing environment, a hard disk drive (HDD), commonly referred to as
a hard drive, hard disk, or fixed disk drive, serves as permanent storage for large
amounts of data. Originally, "hard" was temporary slang, substituted for "rigid,"
before these drives had an established and universally agreed-upon name. Strictly
speaking, the hard disk drive and the hard disk are not the same thing, but they are
packaged as a unit, so either term is often used to refer to the whole unit. A hard
disk is a set of stacked "disks," each of which, like a phonograph record, has data
recorded electromagnetically in concentric circles or "tracks." A "head" (something
like a phonograph arm but in a relatively fixed position) writes or reads the
information on the tracks. Two heads, one on each side of a disk, read or write the
data as the disk spins. Each read or write operation requires that the data be
located, an operation called a "seek." (Data already in a disk cache, however, is
located more quickly.) Modern computers come with a hard disk that holds billions
to trillions of bytes (gigabytes to terabytes) of storage.

Figure 2: Hard Disk

Removable hard disks

This is a variation of the hard disk in which the disks are enclosed in plastic or
metal cartridges that are as easily removable as floppy disks. It combines the best
features of hard and floppy disks. Removable hard disks provide large, economical,
fast, and portable storage for data processing.

Magnetic tape

Magnetic tape has historically been a more convenient means of large-scale data
storage than disk where media portability or removability is required for backup.
Magnetic tape uses the same read/write techniques as disks. Data is stored on
flexible mylar tape coated with magnetic oxide, in 9, 18, or 36 parallel tracks.
Data on tape is accessed sequentially. Tapes provide slow, very cheap,
large-capacity backup for data. Rapid advances in disk storage technology, which
have improved disk storage density and reduced prices, coupled with arguably
declining innovation in tape storage technology, have reduced the market share of
tape storage devices.

Figure 3: Tape

Optical Storage

Optical storage devices such as CDs and DVDs store data that is written and read
with a laser, typically for archival or backup purposes. Optical media have been
used as an alternative to hard drives in computers and to tape backup in mass
storage because they are more durable than tape and less vulnerable to
environmental conditions, lasting up to seven times as long as traditional storage
media. However, optical media are slower than typical hard drives and offer lower
storage capacities. Optical disk capacity ranges up to 6 gigabytes (6,000,000,000
bytes), far more than the 1.44 megabytes (MB), roughly 1,440,000 bytes, offered by
a floppy disk. A newer technology, the digital versatile disc (DVD), has about
4.7 gigabytes of storage capacity on a single-sided, single-layer disc, compared
with about 650 megabytes (0.65 gigabyte) for a CD-ROM disc. Either way, optical
discs can hold large amounts of data.

Figure 4: Optical Storage


Accessing Data from Disk

Bits of data (0s and 1s) are stored on circular magnetic platters called disks,
which rotate rapidly and continuously. A disk head reads and writes bits of data as
they pass under the head. Often, several platters are organized into a disk pack
(or disk drive). A disk contains concentric tracks, and tracks are divided into
sectors. A sector is the smallest addressable unit on a disk.

Figure 5: Disk Organization

When a program reads a byte from the disk, the operating system locates the surface,
track, and sector containing that byte, and reads the entire sector into a particular
area in main memory called a buffer. The bottleneck of disk access is moving the
read/write arm, so it makes sense to store a file in tracks that are below/above each
other on different surfaces rather than in several tracks on the same surface.

Cylinder

A cylinder is the set of tracks at a given radius of a disk pack, one on each
surface. Because these tracks lie directly above and below one another, all the
information on a cylinder can be accessed without moving the read/write arm.
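
As a rough illustration of how a byte position maps to a surface, track, and
sector, the sketch below assumes a simple hypothetical geometry (512-byte sectors,
63 sectors per track, 4 surfaces) and a file laid out cylinder by cylinder. The
function name and the geometry values are assumptions made for this example, not
part of the module, and real drives hide their physical layout behind logical
block addresses.

```python
from typing import NamedTuple


class DiskAddress(NamedTuple):
    cylinder: int
    surface: int
    sector: int


def locate_byte(offset: int, bytes_per_sector: int = 512,
                sectors_per_track: int = 63, surfaces: int = 4) -> DiskAddress:
    """Map a byte offset to (cylinder, surface, sector) for an assumed geometry."""
    sector_index = offset // bytes_per_sector            # which sector overall
    track_index, sector = divmod(sector_index, sectors_per_track)
    # Tracks are filled surface by surface within one cylinder before the
    # arm has to move to the next cylinder.
    cylinder, surface = divmod(track_index, surfaces)
    return DiskAddress(cylinder, surface, sector)


print(locate_byte(0))               # first sector of the first track
print(locate_byte(512 * 63))        # next track: same cylinder, next surface
print(locate_byte(512 * 63 * 4))    # all surfaces used: arm moves one cylinder
```

Under this layout the arm moves only after every surface of the current cylinder
has been used, which is why storing a file cylinder by cylinder keeps arm
movement to a minimum.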

Disk Performance

In measuring the performance of a storage device, we may consider the following
three parameters.
(a) Rotational delay/Latency

This is the time it takes for the proper sector to rotate under the read/write head.
More generally, latency refers to the period of time that one component in a system
spends idle ("spinning its wheels") while waiting for another component. Latency,
therefore, is wasted time. It makes sense to distinguish read latency from write
latency and, in the case of sequential access storage, minimum, maximum, and
average latency. Consider a hard disk that rotates at about 5000 rpm, i.e., one
revolution every 12 ms. The average latency can be calculated as follows:

Min latency = 0

Max latency = time for one disk revolution = 12 ms

Average latency (r) = (min + max) / 2

= max / 2

= time for ½ disk revolution = 6 ms

Typical average latencies are 6 to 8 ms.
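
The calculation above can be sketched in a few lines of code. The function name
is an assumption made for this example; the only input is the rotation speed in
revolutions per minute.

```python
def rotational_latency_ms(rpm: float) -> dict:
    """Return minimum, maximum and average rotational latency in milliseconds."""
    revolution_ms = 60_000 / rpm      # time for one full revolution
    return {
        "min": 0.0,                   # sector is already under the head
        "max": revolution_ms,         # sector has just passed the head
        "avg": revolution_ms / 2,     # (min + max) / 2
    }


print(rotational_latency_ms(5000))
# {'min': 0.0, 'max': 12.0, 'avg': 6.0}
```

For the 5000 rpm disk used in the example, one revolution takes 12 ms, so the
average latency is 6 ms.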

(b) Throughput

This is the rate at which information can be read from or written to the storage. It is
expressed in megabytes per second (MB/s). Media accessed sequentially, as opposed
to randomly, typically yield maximum throughput.

(c) Seek time

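This is the time it takes to move the read/write arm to the track (cylinder) that
holds the requested data. It is one component of access time, which is the amount
of time between when the CPU requests data and when the first byte is sent to the
CPU. Seek times between 10 and 20 milliseconds are common.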
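
The three parameters can be combined into a rough estimate of how long one read
takes. The sketch below is a simplified model, assuming that seek time means the
time to move the arm to the right track and that the total is seek time plus
average rotational latency plus transfer time. The function name and the sample
figures (15 ms seek, 100 MB/s throughput) are illustrative assumptions.

```python
def estimated_access_time_ms(seek_ms: float, rpm: float,
                             size_mb: float, throughput_mb_s: float) -> float:
    """Estimate the time to read size_mb of data, in milliseconds."""
    avg_latency_ms = (60_000 / rpm) / 2               # half a revolution
    transfer_ms = size_mb * 1000 / throughput_mb_s    # size divided by rate
    return seek_ms + avg_latency_ms + transfer_ms


# 15 ms seek + 6 ms latency + 10 ms transfer
print(estimated_access_time_ms(seek_ms=15, rpm=5000,
                               size_mb=1, throughput_mb_s=100))
# 31.0
```

In this example more than two thirds of the time is spent positioning the head
rather than transferring data, which is why sequential access to neighbouring
sectors is so much faster than random access.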

A USB Flash Drive


This is a NAND-type flash memory data storage device integrated with a USB
(universal serial bus) connector. USB flash drives are typically removable and
rewritable, much smaller than a floppy disk (1-4 inches or 25-102 mm long), and
weigh less than 56 g. Their storage capacities typically range from 64 MB to 32 GB
or more, and data retention is typically rated at about 10 years. USB flash drives
offer potential advantages over other portable storage devices, particularly the
floppy disk. They are more compact, faster, hold more data, are more reliable
because they have no moving parts, and have a more durable design.

Figure 6: A Flash Drive

Secure Digital Card (SD Card)

A common type of memory card, SD cards are used in multiple electronic devices,
including digital cameras and mobile phones. Although there are different sizes,
classes, and capacities available, they all use a rectangular design with one side
"chipped off" to prevent the card from being inserted into the camera or other device
the wrong way.

Solid State Drive (SSD)

A solid-state drive uses flash memory to store data and is sometimes used in devices
such as netbooks, laptops, and desktop computers instead of a traditional hard disk
drive. The advantages of an SSD over an HDD include a faster read/write speed,
noiseless operation, greater reliability, and lower power consumption. The biggest
downside is cost, with an SSD offering lower capacity than an equivalently priced
HDD.

Cloud Storage

With users increasingly operating multiple devices in multiple places, many are
adopting online and cloud computing solutions. Cloud computing basically involves
accessing services over a network via a collection of remote servers. Although the
idea of a "cloud of computers" may sound rather abstract to those unfamiliar with
this metaphorical concept, in practice it can provide powerful storage solutions for
devices that are connected to the internet.

Punch Card

Punch cards (or punched cards) were a common method of data storage in early
computers. They consisted of a paper card with holes punched or perforated by hand
or machine. The cards were fed into the computer to store and access information.
This form of data storage largely disappeared as new and better technologies were
developed.
Study Session 2:

File Organization

Learning Outcomes:

At the end of this session, you should be able to:


• Explain the meaning of a computer file.
• Identify and describe file organization techniques.
• Describe the type of data files used in a Data Processing Environment

What is a computer file?


A file is a collection of data or information that has a name, called the filename.
Most of the information stored in a computer system is stored as files. A file is
often stored with a user-given name and a system-supplied extension. The name of
the file should reflect the content of the data stored in it, for example, payroll
or result. The extension should reflect either the type of file (i.e., program file,
image file, audio file, etc.) or the software used to create the file (e.g., MS Word
document, MS Excel worksheet, BASIC program). For example:
Table 5: Computer Files

S/No   Type of File / Software       Extension
1      Microsoft Word document       .doc
2      Audio / video file            .wav, .avi, .mp3
3      BASIC source program          .bas
4      Text                          .txt
5      Microsoft Excel worksheet     .xls

a. Program File
These are files that store sets of instructions written in a programming language. A
source program file, for example, contains the instructions written by a programmer
in a high-level language such as BASIC or FORTRAN. In contrast, the object file is
the translated form of the source file in machine code after compilation. Files
that contain machine code are called executable files (or binary files).
b. ASCII File
ASCII stands for American Standard Code for Information Interchange. ASCII files
are text-based files. The characters are represented in ASCII code (without
formatting such as underline, italics, boldface, or graphics). Files stored in this
format are used to transfer documents between incompatible computer platforms,
such as IBM and Macintosh.
c. Image File
Documents containing digitized graphics or images are stored in this format.
d. Audio and Video File
This is a file that is used to store digitized sound or digitized video images and
animation.
e. Data File
These are document files that contain data, not programs. Their contents are
created and used with application software.

File Organization
Files are created, arranged, and maintained in data processing systems so that
records can be retrieved quickly. Computer systems store files permanently on
secondary storage devices. Records or files can be arranged in several ways on the
storage media, and the arrangement determines how individual records can be
accessed or retrieved. Four common ways of file organization and access are:

A. Serial File Organization


In this method of file organization, records are not arranged in any specific order.
If magnetic tapes are used to store data, it would be necessary to wind the tape
forward and backward to access a given record, since access can only be made in the
sequence in which the records were physically stored on the tape, i.e., serially.
Moreover, if records are stored on disk, a full index will be required to access any
given record. This method of file organization is, therefore, inefficient.
B. Sequential File Organization
In this type of file organization, data records are typically stored in ascending
order of a key field. Data must be retrieved in the same physical sequence in
which it is stored. Apart from serial organization, it is the only file
organization method that can be used on magnetic tape, because magnetic tape is a
sequential storage device: records and files are stored on tape in sequential
order and are read in the same order. Note that records may also be stored
sequentially on disk if desired. For files on tape whose records are stored in key
sequence, serial and sequential access mean the same thing. This may not be the
case for disk files, because records accessed serially may not be in any key
sequence. Sequential file organization is no longer a popular way to store or
access records in a file.
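
A minimal sketch of sequential access follows. It assumes the file is a list of
records held in ascending key order; the record layout and names are hypothetical.
The only way to reach a record is to read every record stored before it, although
the key ordering lets the search stop as soon as a larger key is seen.

```python
def sequential_search(records, key):
    """Scan records in stored (ascending key) order.

    Returns (record or None, number of records read).
    """
    reads = 0
    for record in records:
        reads += 1
        if record["key"] == key:
            return record, reads
        if record["key"] > key:      # passed the place where the key would be
            break
    return None, reads


# Hypothetical student records, already sorted by key.
students = [
    {"key": 1001, "name": "Ada"},
    {"key": 1002, "name": "Bola"},
    {"key": 1005, "name": "Chidi"},
    {"key": 1009, "name": "Dayo"},
]

print(sequential_search(students, 1005))   # found after reading 3 records
print(sequential_search(students, 1003))   # absent: stops at key 1005
```

The number of reads grows with the position of the record in the file, which is
the main reason this organization is slow for looking up individual records.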
C. Indexed Sequential Organization
This technique of file organization uses both the sequential and direct access
methods. It is widely applied to the storage of records on magnetic disk. It allows
the file to be processed sequentially, because the records are stored in ascending
order of the key field. It also enables records on direct access storage devices to
be accessed directly using the indexed sequential access method (ISAM). This access
method relies on an index of key fields to locate individual records. An index to a
file is similar to a library catalogue, which is used to find a book's physical
position on a library shelf. The method requires that data be stored on a magnetic
or optical disk. For example, a university could index specific ranges of students'
matriculation numbers from 0000 to 1000, 1001 to 2000, and so on. For the computer
to find the record with the key field 8888, it would go first to the index, which
would give the location of the range in which the key field appears (for example,
8001 to 9000). The computer would then search sequentially (from 8001) to find
record 8888.
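
The matriculation-number example can be sketched as follows. This is a simplified
in-memory model, not a real ISAM implementation: the file is a sorted list of
(key, data) pairs, and the index holds the highest key of each block together
with the position of the block's first record. The function names, block size,
and record contents are assumptions made for illustration.

```python
from bisect import bisect_left


def build_index(records, block_size):
    """One index entry per block: (highest key in block, position of first record)."""
    index = []
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        index.append((block[-1][0], start))
    return index


def isam_lookup(records, index, key):
    """Use the index to pick a block, then search that block sequentially."""
    high_keys = [high for high, _ in index]
    i = bisect_left(high_keys, key)          # first block whose highest key >= key
    if i == len(index):
        return None                          # key is beyond the last block
    start = index[i][1]
    end = index[i + 1][1] if i + 1 < len(index) else len(records)
    for pos in range(start, end):            # sequential scan inside one block
        if records[pos][0] == key:
            return records[pos]
        if records[pos][0] > key:
            break
    return None


# Hypothetical file: matriculation numbers 1 to 10000, 1000 records per block.
records = [(k, f"student-{k}") for k in range(1, 10001)]
index = build_index(records, block_size=1000)

print(index[8])                              # entry for the range 8001 to 9000
print(isam_lookup(records, index, 8888))
```

Only the block covering 8001 to 9000 is scanned, so at most 1000 records are
read instead of up to 10000 with a purely sequential search.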
D. Direct or Random File Organization
This file organization is used with magnetic disk technology. Most computer
applications today use this approach for storing records in computer files. In this
approach, individual records need not be kept in any particular sequence of key
fields. Thus, users can access records in any sequence they desire, without regard
to the actual physical order on the disk. Every record has an address that makes
it possible to locate it independently of the other records on the storage medium.
To allow easy access and retrieval of information, an index or table of keys is
maintained, holding each record's relative record number in storage. To retrieve
a record, its key is looked up in the index to obtain the corresponding relative
record number. Once this is found, the address in storage is worked out and the
record is accessed. Records stored with this technique can be accessed much
faster than records stored with sequential file organization. However, they may
be more expensive because optical or magnetic disks are needed for their storage.
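
A minimal sketch of direct access with a key index follows. It assumes
fixed-length records (a 4-byte key and a 20-byte name, a layout chosen only for
this example) and uses an in-memory byte stream in place of a disk file. The
record's address is worked out as relative record number times record size.

```python
import io
import struct

# Assumed record layout: unsigned 4-byte key followed by a 20-byte name field.
RECORD = struct.Struct("<I20s")


def write_records(stream, rows):
    """Write rows in arrival order and return an index of key -> relative record number."""
    index = {}
    for rrn, (key, name) in enumerate(rows):
        stream.write(RECORD.pack(key, name.encode("ascii")))
        index[key] = rrn
    return index


def read_record(stream, index, key):
    """Look the key up in the index, compute the address, and read one record."""
    rrn = index.get(key)
    if rrn is None:
        return None
    stream.seek(rrn * RECORD.size)           # address = record number * record size
    key_read, raw_name = RECORD.unpack(stream.read(RECORD.size))
    return key_read, raw_name.rstrip(b"\x00").decode("ascii")


# Hypothetical records, deliberately not in key order.
stream = io.BytesIO()
index = write_records(stream, [(8888, "Ada"), (17, "Bola"), (4021, "Chidi")])

print(read_record(stream, index, 17))        # read directly, no scan of other records
```

No other record is read on the way to the requested one, which is what makes this
organization faster than sequential access for individual lookups.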
Classification of Storage Devices
Storage devices can be classified generally as sequential access or random access.
For example, a tape drive is a sequential-access device because, to get to record
five (5) on the tape, the drive needs to pass through records 1, 2, 3, and 4. A
disk drive, on the other hand, is a random-access device because it allows the
record at any position on the disk to be accessed without passing through all
intervening positions.
Types of data files
Data files stored in a data processing center can be classified as transaction
files or master files.
(a) Transaction file
This refers to a collection of transaction records. The transaction file is a
temporary holding file that stores records that generally have a limited useful
lifetime. In payroll processing, for example, a transaction file would hold the
name, contact information, hours worked, pay rate, tax, utility bills, etc. of each
staff member for a particular month. At the end of every month or so, the staff's
salaries are computed from the information in the transaction file. After the
transactions are successfully carried out, the information in the transaction file
is used to update the master file. In a data processing system, transaction records
may be retained online for some period and later archived on permanent storage
devices. Transaction files can serve as audit trails and history for the
organization.
(b) Master File
The master file is a collection of relatively permanent records that are updated
periodically. Once a record has been added to a master file, it generally remains
in the system indefinitely. The values of the record's fields will change over its
lifetime, but the record itself is retained. Master files contain descriptive data,
such as name and address, and summary information, such as a student's Cumulative
Grade Point Average in an examination processing system, or total net pay and total
tax deductions in a payroll system. The changes made to a master file could be the
addition of a record, the deletion of a record, or the update of a record. This
occurs in an organization when, for example, a new staff member joins the workforce
or a staff member resigns.
In addition, the master file of a payroll system may be composed of discrete pieces
of information (such as a name, address, or employee number), called data elements.
Data are keyed into the system, updating the data elements periodically. The data
elements in the master file are combined in different ways to make up periodic
reports of interest to management and government agencies or to generate paychecks
sent to the staff at the end of the month. Other examples of master files include
Customer, Product, Result, or Supplier files.
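
The relationship between the two kinds of file can be sketched as a periodic
update run. In the sketch below the master file is a dictionary keyed by employee
number and the transaction file is a list of transaction records. The field names
and the three action types (add, delete, pay) are assumptions chosen to mirror the
payroll example, not a prescribed format.

```python
def apply_transactions(master, transactions):
    """Return an updated copy of the master file after applying all transactions."""
    updated = {emp_no: dict(record) for emp_no, record in master.items()}
    for txn in transactions:
        emp_no, action = txn["emp_no"], txn["action"]
        if action == "add":                  # new staff member joins
            updated[emp_no] = {"name": txn["name"], "total_net_pay": 0.0}
        elif action == "delete":             # staff member leaves
            updated.pop(emp_no, None)
        elif action == "pay":                # update the summary field
            net_pay = txn["hours"] * txn["rate"] - txn["tax"]
            updated[emp_no]["total_net_pay"] += net_pay
    return updated


# Hypothetical master file and one month's transaction file.
master = {
    101: {"name": "Ada", "total_net_pay": 1000.0},
    102: {"name": "Bola", "total_net_pay": 800.0},
}
transactions = [
    {"emp_no": 101, "action": "pay", "hours": 10, "rate": 50.0, "tax": 100.0},
    {"emp_no": 102, "action": "delete"},
    {"emp_no": 103, "action": "add", "name": "Chidi"},
]

new_master = apply_transactions(master, transactions)
print(new_master)
```

The transaction list can be archived after the run as an audit trail, while the
updated master file carries the summary totals forward to the next period.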

Self-Assessment Questions
1. Define the term computer file.
2. Explain three main techniques of file organization.
3. List and explain the two types of data files.

References/Further Readings
• Jeffery L. Whitten, Lonnie D. Bentley, Kevin C. Dittman, Systems
Analysis and Design Methods, McGraw Hill, New York, 2004.
• Introduction to Computers and Information Technology, 2nd edition, Pearson,
2015, ISBN-13: 9781323237120.
Study Session 3:

Traditional File System/Processing

Learning Outcomes:

At the end of this session, you should be able to:

• Explain the concept of a file system
• Identify and explain the problems with the file system

Traditional File Systems /Processing

As an organization grows, its computer systems and applications become more
complex. Consider, for example, a university's computer system that handles
students' information, where student data is kept independently by each unit a
student interacts with, for example, Registration, Hostel, Accounts, Examination
and Records, Students Affairs, Health Centre, etc. Worse still, each department is
allowed to keep students' information independently in its own application. In
time, multiple files containing the same student records will exist in the
different units.

Problem with the Traditional File System

Some problems with the traditional file environment include Data Redundancy,
Program-Data Dependence, and Difficulty of Data-Sharing, among others. These are
discussed as follows:

(a) Data Redundancy: This means the presence of duplicate data in multiple data
files and often in different formats. This is often the result when different
departments are allowed to collect the same piece of information about an object.
For instance, within the university environment, the hostels and student registration
department might collect the same student’s information (Name, Mat No, Level, and
Address). Because it is collected and maintained in so many different places, the
same data items may be repeated in different departments. When data fields are
repeated in different files, storage spaces are wasted, and much time is spent trying
to update the records.

(b) Program-Data Dependence: Program-Data Dependence is the tight relationship
between data stored in files and the specific programs that process the
information in those files. Computer programs become so data-specific that any
change in the data also requires a modification of the programs that process it.
Such changes can be costly in terms of the time and cost of re-programming.

(c) Difficulty of Data-Sharing: It is difficult to share data in a file environment
because it may be challenging to relate the data in one file with that of another
within the one or several departments where files are kept. Besides, there is no
control over access to data, which makes it difficult to retrieve the desired
information.

(d) Access Time in a Traditional File Environment: One of the primary
disadvantages of the traditional file environment is the time it takes to access data.
It can take a lot of time to locate a few files in an extensive paper filing system,
depending on their location. Electronic databases allow almost instantaneous
access to information. Faster data access increases the productivity of managers,
analysts, accountants, and other workers who use data regularly.

(e) Editing and Communication: A traditional file system is cumbersome in that
it does not allow users to edit files or send information to others easily. Paper files
often cannot be edited directly, forcing users to make new copies to update old files.
To distribute data on paper files, users must mail, fax, or scan the data. Databases
allow users to edit information fields directly, and because the information is stored
digitally, it is already in a form that can be easily transmitted.
(f) Order of Data: Data can get out of order in traditional filing systems. If someone
accidentally puts a file in the wrong place, or takes a file out of a cabinet and forgets
to put it back, it can lead to lost data or the creation of additional copies of files.
Electronic filing systems allow users to quickly check whether information already
exists somewhere in the system, which helps avoid redundant files and data loss
problems.
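
Program-data dependence, item (b) above, can be made concrete with a small
sketch. The program below hard-codes the position of each field in a fixed-width
student record. The record layout, field widths, and function name are
hypothetical. When the data format changes (here, the matriculation number grows
from 4 to 6 characters), the unchanged program silently misreads the file.

```python
def parse_student_v1(line: str) -> dict:
    """Parse a record using field positions that are fixed inside the program."""
    # Assumed layout: Mat No in columns 0-3, name in columns 4-23.
    return {"mat_no": line[0:4], "name": line[4:24].strip()}


old_line = "8888" + "Ada Obi".ljust(20)        # record in the original format
new_line = "008888" + "Ada Obi".ljust(20)      # Mat No widened to 6 characters

print(parse_student_v1(old_line))   # fields are read correctly
print(parse_student_v1(new_line))   # fields are misread: the program must be rewritten
```

Every program that reads this file would need the same modification, which is the
re-programming cost described in item (b).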

Self-Assessment Questions

1. What do you understand by Traditional File System?


2. Explain five main problems with the Traditional File System

References/Further Readings
Dominic Giampaolo, Practical File System Design with the Be File System,
Morgan Kaufmann; Kindle edition (August 29, 2013), ISBN-13: 9781558604971.
