Storage System in DBMS
This section gives an overview of the types of storage devices used for storing and accessing
data.
Several storage options are available, and they differ from one another in speed and
accessibility. Storage devices fall into the following types:
o Primary Storage
o Secondary Storage
o Tertiary Storage
Primary Storage
Primary storage is the area that offers the quickest access to stored data. It is also known as
volatile storage because it does not store data permanently: if the system loses power or
crashes, the data is lost. Main memory and cache are the two types of primary storage.
o Main Memory: It holds the data that the system is currently operating on, loaded from the
storage medium. Every instruction of a computer machine works on data in main memory.
This type of memory can store gigabytes of data on a system but is generally too small to
hold an entire database. The main memory loses its whole content if the system shuts
down because of a power failure or for other reasons.
o Cache: It is the fastest storage medium and also one of the costliest. A cache is a
small storage medium that is usually managed by the computer hardware. Designers of
data structures, algorithms, and query processors take cache effects into account.
Secondary Storage
Secondary storage is also called online storage. It is the storage area that allows the user to
save and store data permanently. This type of memory does not lose data because of a power
failure or system crash, which is why it is also called non-volatile storage.
The following secondary storage media are available in almost every type of computer system:
o Flash Memory: Flash memory is used, for example, in USB (Universal Serial Bus) keys,
which plug into the USB slots of a computer system. USB keys help transfer data between
computer systems and come in a range of capacities. Unlike main memory, flash memory
keeps the stored data through a power cut or other failure. Flash memory is also commonly
used in server systems for caching frequently used data, which improves performance, and
it can store larger amounts of data than the main memory.
o Magnetic Disk Storage: This type of storage medium is also known as online storage
media. A magnetic disk is used for storing data for a long time and is capable of storing
an entire database. The computer system is responsible for bringing data from the disk
into main memory before it can be accessed. If the system modifies the data, the
modified data must be written back to the disk. The main strength of a magnetic disk is
that its data survives a system crash or power failure. A failure of the disk itself,
however, can destroy the stored data.
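The read, modify, and write-back cycle described for magnetic disks can be sketched as follows. This is a minimal illustration, not how any particular DBMS implements its buffer manager: the 4096-byte `PAGE_SIZE`, the helper names `read_page` and `write_page`, and the temporary file standing in for the disk are all assumptions made for the example.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size in bytes, chosen only for illustration


def read_page(path, page_no):
    """Copy one page from the disk file into main memory."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_SIZE)
        return bytearray(f.read(PAGE_SIZE))


def write_page(path, page_no, page):
    """Write a modified in-memory page back to the disk file."""
    with open(path, "r+b") as f:
        f.seek(page_no * PAGE_SIZE)
        f.write(page)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device


def demo():
    # A temporary file of two zero-filled pages stands in for the disk.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(bytes(2 * PAGE_SIZE))
        page = read_page(path, 1)   # disk -> main memory
        page[0:5] = b"hello"        # modify the in-memory copy
        write_page(path, 1, page)   # main memory -> disk
        return bytes(read_page(path, 1)[0:5])
    finally:
        os.remove(path)
```

Until `write_page` runs, the change exists only in the volatile in-memory copy and would be lost in a crash.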
Tertiary Storage
Tertiary storage is external to the computer system. It is the slowest type of storage, but it is
capable of storing a large amount of data. It is also known as offline storage and is generally
used for data backup. The following tertiary storage devices are available:
o Optical Storage: Optical storage can hold megabytes or gigabytes of data. A
Compact Disc (CD) can store 700 megabytes of data, or around 80 minutes of audio.
A Digital Video Disc (DVD) can store 4.7 gigabytes (single-layer) or 8.5 gigabytes
(dual-layer) of data on each side of the disc.
o Tape Storage: It is a cheaper storage medium than disks. Tapes are generally used for
archiving or backing up data. Tape provides slow access because it reads data
sequentially from the start, so tape storage is also known as sequential-access storage.
Disk storage, by contrast, is known as direct-access storage because data can be read
directly from any location on the disk.
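The difference between sequential access and direct access can be sketched in a few lines. The example below is only an illustration under stated assumptions: an in-memory `io.BytesIO` buffer stands in for the medium, records have an assumed fixed size of 16 bytes, and the function names are hypothetical.

```python
import io

RECORD_SIZE = 16  # assumed fixed record size in bytes


def make_store(n_records):
    """Build an in-memory store of fixed-size records."""
    buf = io.BytesIO()
    for i in range(n_records):
        buf.write(f"record-{i}".encode().ljust(RECORD_SIZE, b" "))
    return buf


def read_sequential(store, index):
    """Tape-like access: start at the beginning and read up to the target."""
    store.seek(0)
    reads = 0
    record = b""
    for _ in range(index + 1):
        record = store.read(RECORD_SIZE)
        reads += 1
    return record.rstrip(), reads


def read_direct(store, index):
    """Disk-like access: jump straight to the record's offset."""
    store.seek(index * RECORD_SIZE)
    return store.read(RECORD_SIZE).rstrip(), 1
```

Both functions return the record together with the number of reads performed. Reaching record 42 takes 43 reads sequentially but only one read with direct access.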
Storage Hierarchy
Besides the above, various other storage devices reside in the computer system. Storage
media are organized on the basis of data access speed, the cost per unit of data to buy the
medium, and the medium's reliability. We can therefore arrange storage media in a hierarchy
based on cost and speed.
Arranging the storage media described above by speed and cost gives the following order, from
the top of the hierarchy to the bottom: cache, main memory, flash memory, magnetic disk,
optical disk, and magnetic tape.
The higher levels are expensive but fast. Moving down the hierarchy, the cost per bit
decreases and the access time increases. The media from main memory upward are volatile,
while all the media below main memory are non-volatile.
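As a small sketch, the hierarchy can be written down as an ordered table. The exact ordering used here (cache, main memory, flash memory, magnetic disk, optical disk, magnetic tape) is an assumption based on the descriptions in this section, and the helper names are hypothetical.

```python
# (name, category, volatile), ordered from fastest and most expensive per bit
# to slowest and cheapest per bit. The ordering is an assumption of this sketch.
HIERARCHY = [
    ("cache", "primary", True),
    ("main memory", "primary", True),
    ("flash memory", "secondary", False),
    ("magnetic disk", "secondary", False),
    ("optical disk", "tertiary", False),
    ("magnetic tape", "tertiary", False),
]


def level_of(name):
    """Return the position of a medium in the hierarchy (0 is the top)."""
    for level, (medium, _, _) in enumerate(HIERARCHY):
        if medium == name:
            return level
    raise KeyError(name)


def is_faster(a, b):
    """True if medium a sits above medium b in the hierarchy."""
    return level_of(a) < level_of(b)


def volatile_media():
    """List the media that lose their contents on power failure."""
    return [medium for medium, _, volatile in HIERARCHY if volatile]
```

Only relative positions are encoded. No access times or prices are given because they change with every hardware generation.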
RAID 1
RAID 1 uses mirroring. When data is sent to a RAID controller, the controller sends a copy of
the data to every disk in the array. Because each disk holds a complete copy, RAID 1 provides
100% redundancy in case of a disk failure.
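Mirroring can be sketched with a toy array in which every write goes to all disks and a read is served by any surviving disk. The `Raid1` class and its methods are hypothetical names for this illustration. Each disk is modelled as a dictionary from block number to data, which is not how a real controller works.

```python
class Raid1:
    """Toy mirrored array: every disk holds a full copy of every block."""

    def __init__(self, n_disks=2):
        # Each disk is a dict mapping block number -> bytes.
        self.disks = [dict() for _ in range(n_disks)]

    def write(self, block_no, data):
        # The controller sends a copy of the data to every working disk.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def fail(self, disk_no):
        # Simulate a disk failure by discarding the disk's contents.
        self.disks[disk_no] = None

    def read(self, block_no):
        # Any surviving disk can serve the read.
        for disk in self.disks:
            if disk is not None and block_no in disk:
                return disk[block_no]
        raise IOError("block lost: no surviving copy")
```

With two disks, the array survives the loss of either one. The data is lost only when every mirror has failed.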
RAID 2
RAID 2 records an Error Correction Code (ECC) for its data using a Hamming code, with the
data striped across different disks. As in level 0, the data is striped, here at the bit level:
each data bit in a word is recorded on a separate disk, and the ECC codes of the data words are
stored on a separate set of disks. Because of its complex structure and high cost, RAID 2 is
not commercially available.
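To illustrate the kind of error correction involved, here is a sketch of the classic Hamming(7,4) code, which protects 4 data bits with 3 check bits. Treating each of the 7 bit positions as a separate disk is an assumption made for illustration, and the function names are hypothetical. Real RAID 2 arrays used wider words.

```python
def hamming_encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Positions 1, 2, and 4 hold check bits; positions 3, 5, 6, and 7 hold data.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_correct(codeword):
    """Fix at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise 1 to 7.
    """
    b = list(codeword)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s4 = b[3] ^ b[4] ^ b[5] ^ b[6]
    position = s1 + 2 * s2 + 4 * s4
    if position:
        b[position - 1] ^= 1  # flip the faulty bit back
    return [b[2], b[4], b[5], b[6]], position
```

If the bit in position 5 of a stored word is corrupted, `hamming_correct` reports position 5 and returns the original 4 data bits.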
RAID 3
RAID 3 stripes the data across multiple disks. The parity bit generated for each data word is
stored on a separate, dedicated disk. This allows the array to recover from the failure of a
single disk.
RAID 4
In this level, an entire block of data is written onto the data disks, and then the parity is
generated and stored on a separate disk. Note that level 3 uses byte-level striping, whereas
level 4 uses block-level striping. Both level 3 and level 4 require at least three disks.
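The parity used in levels 3 and 4 is a bytewise XOR of the data, which is why one lost disk can be rebuilt from the survivors. The sketch below shows the idea on equal-length byte blocks. The function names `xor_blocks` and `rebuild` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks; used to compute the parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity):
    """Recover the one missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

For three data blocks `d0`, `d1`, `d2` with `parity = xor_blocks([d0, d1, d2])`, losing `d1` is repaired by `rebuild([d0, d2], parity)`. Losing two blocks at once cannot be repaired with a single parity.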
RAID 5
RAID 5 writes whole data blocks onto different disks, but the parity generated for each
stripe of data blocks is distributed among all the disks in the array rather than stored on a
separate dedicated disk.
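One way to picture distributed parity is to rotate the parity block to a different disk on each stripe. The rotation rule below is just one possible scheme chosen for illustration. Actual layouts vary between implementations, and the function names are hypothetical.

```python
def parity_disk(stripe, n_disks):
    """Disk that holds the parity block for a stripe (assumed rotation rule)."""
    return (n_disks - 1 - stripe) % n_disks


def stripe_layout(stripe, n_disks):
    """Return per-disk labels for one stripe: 'P' for parity, 'Dk' for data."""
    p = parity_disk(stripe, n_disks)
    layout = []
    offset = 0
    for disk in range(n_disks):
        if disk == p:
            layout.append("P")
        else:
            # Data blocks are numbered consecutively across stripes.
            layout.append(f"D{stripe * (n_disks - 1) + offset}")
            offset += 1
    return layout
```

With four disks, stripe 0 is laid out as `D0 D1 D2 P` and stripe 1 as `D3 D4 P D5`. Over four stripes, each disk holds the parity exactly once, so no single disk becomes a parity bottleneck.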
RAID 6
RAID 6 is an extension of level 5. In this level, two independent parities are generated and
stored in a distributed fashion across the disks. The second parity provides additional fault
tolerance, so the array can survive two disk failures. This level requires at least four disk
drives.
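The trade-off between capacity and fault tolerance across these levels can be summarized in a short sketch. It assumes equally sized disks. The minimum disk counts for levels 3, 4, and 6 follow the text above, while the minimums of two disks for level 1 and three for level 5 are the usual figures. The function name `summarize` is hypothetical.

```python
# Minimum number of disks per RAID level (see the assumptions in the lead-in).
MIN_DISKS = {1: 2, 3: 3, 4: 3, 5: 3, 6: 4}


def summarize(level, n_disks):
    """Return (disks' worth of usable capacity, disk failures tolerated)."""
    if level not in MIN_DISKS:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    if n_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 1:
        # Every disk is a full copy, so only one disk's worth is usable.
        return 1, n_disks - 1
    if level == 6:
        # Two disks' worth of space goes to the two parities.
        return n_disks - 2, 2
    # Levels 3, 4, and 5 spend one disk's worth of space on parity.
    return n_disks - 1, 1
```

For example, four disks in RAID 5 give three disks' worth of capacity and tolerate one failure, while the same four disks in RAID 6 give two disks' worth and tolerate two failures.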