18CS822 - SAN - Module 1
Module 1
Storage System
By,
MANJULA DEVI P
ISE, AMCEC
Syllabus:
Storage System: Introduction to Information Storage: Information Storage,
Evolution of Storage Architecture, Data Center Infrastructure, Virtualization and
Cloud Computing.
Data Center Environment: Application Database Management System (DBMS),
Host (Compute), Connectivity, Storage, Disk Drive Components, Disk Drive
Performance, Host Access to Data, Direct-Attached Storage, Storage Design Based
on Application
Introduction to Information Storage
Information Storage:
Storage is a repository that enables users to persistently store and retrieve digital data.
Why Information management?
Information is increasingly important in our daily lives. We have become information
dependents.
We live in an on-command, on-demand world, which means we need information when and
where it is required.
We access the Internet every day to perform searches, participate in social networking,
send and receive e-mails, share pictures and videos, and scores of other applications.
Equipped with a growing number of content-generating devices, individuals now create
more information than businesses do.
The importance, dependency, and volume of information for the business world also
continue to grow at astounding rates.
Businesses depend on fast and reliable access to information critical to their success.
Some of the business applications that process information include airline reservations,
telephone billing systems, e-commerce, ATMs, product designs, inventory management,
e-mail archives, Web portals, patient records, credit cards, life sciences, and global
capital markets.
The increasing criticality of information to the businesses has amplified the challenges in
protecting and managing the data.
Organizations maintain one or more data centers to store and manage information. A data
center is a facility that contains information storage and other physical information
technology (IT) resources for computing, networking, and storing information.
Data:
Data is a collection of raw facts from which conclusions might be drawn.
Examples: Handwritten letters, a printed book, a family photograph, printed and duly signed
copies of mortgage papers, a bank’s ledgers, and an airline ticket.
Digital data: Data that is generated using a computer and stored as strings of binary numbers
(0s and 1s) is called digital data.
Data is unstructured if its elements cannot be stored in rows and columns, and is
therefore difficult to query and retrieve by business applications.
Example: e-mail messages, business cards, or even digital format files such as .doc, .txt,
and .pdf.
Storage:
Data created by individuals or businesses must be stored so that it is easily
accessible for further processing.
In a computing environment, devices designed for storing data are termed storage
devices or simply storage.
The type of storage used varies based on the type of data and the rate at which it is
created and used.
Devices such as memory in a cell phone or digital camera, DVDs, CD-ROMs, and
hard disks in personal computers are examples of storage devices.
Cloud Computing:
Cloud computing enables individuals or businesses to use IT resources as a service over
the network.
It provides highly scalable and flexible computing that enables provisioning of resources
on demand.
Users can scale up or scale down the demand of computing resources, including storage
capacity, with minimal management effort or service provider interaction.
Cloud computing empowers self-service requesting through a fully automated request
fulfilment process.
A device driver is special software that permits the operating system to interact with a specific
device, such as a printer, a mouse, or a disk drive.
In the early days, disk drives appeared to the operating system as a number of continuous disk
blocks. The entire disk drive would be allocated to the file system or other data entity used by
the operating system or application.
Disadvantages:
Lack of flexibility.
When a disk drive ran out of space, there was no easy way to extend the file system’s
size.
As the storage capacity of the disk drive increased, allocating the entire disk drive for the
file system often resulted in underutilization of storage capacity.
Solution: evolution of Logical Volume Managers (LVMs)
LVM enabled dynamic extension of file system capacity and efficient storage management
The LVM is software that runs on the compute system and manages logical and physical
storage.
LVM is an intermediate layer between the file system and the physical disk.
LVM can partition a larger-capacity disk into virtual, smaller-capacity volumes (a process
called partitioning) or aggregate several smaller disks to form a larger virtual volume (a
process called concatenation).
Disk partitioning was introduced to improve the flexibility and utilization of disk drives.
In partitioning, a disk drive is divided into logical containers called logical volumes (LVs).
Concatenation is the process of grouping several physical drives and presenting them to the
host as one large logical volume (as shown in figure 7).
The basic LVM components are physical volumes, volume groups, and logical volumes.
Each physical disk connected to the host system is a physical volume (PV).
A volume group is created by grouping together one or more physical volumes. A unique
physical volume identifier (PVID) is assigned to each physical volume when it is initialized
for use by the LVM.
Logical volumes are created within a given volume group. A logical volume can be
thought of as a disk partition, whereas the volume group itself can be thought of as a disk.
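The relationship between physical volumes, a volume group, and logical volumes can be sketched in a few lines of Python. This is a minimal illustration only, not how any real LVM is implemented: the class names, the block-based sizes, and the simple "next free block" allocation are all assumptions made for the example (real volume managers allocate in extents and need not keep a logical volume contiguous).

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalVolume:
    pvid: str         # unique identifier assigned when the PV is initialized
    size_blocks: int  # capacity of the underlying disk, in blocks


@dataclass
class LogicalVolume:
    name: str
    start: int        # first block inside the volume group's address space
    size_blocks: int


@dataclass
class VolumeGroup:
    name: str
    pvs: list = field(default_factory=list)
    next_free: int = 0  # hypothetical allocator state: next unallocated block

    def total_blocks(self) -> int:
        # Concatenation: the group's capacity is the sum of its PVs.
        return sum(pv.size_blocks for pv in self.pvs)

    def locate(self, block: int):
        """Map a block of the concatenated space to (pvid, block on that PV)."""
        if block < 0:
            raise IndexError("negative block number")
        offset = block
        for pv in self.pvs:
            if offset < pv.size_blocks:
                return pv.pvid, offset
            offset -= pv.size_blocks
        raise IndexError("block is beyond the end of the volume group")

    def create_lv(self, name: str, size_blocks: int) -> LogicalVolume:
        """Partitioning: carve a logical volume out of the group's free space."""
        if self.next_free + size_blocks > self.total_blocks():
            raise ValueError("not enough free space in the volume group")
        lv = LogicalVolume(name, self.next_free, size_blocks)
        self.next_free += size_blocks
        return lv


vg = VolumeGroup("vg0", [PhysicalVolume("pv-a", 100), PhysicalVolume("pv-b", 50)])
data = vg.create_lv("lv_data", 120)  # spans both physical volumes
logs = vg.create_lv("lv_logs", 30)
print(vg.total_blocks())             # 150
print(vg.locate(logs.start))         # ('pv-b', 20)
```

In this sketch the host sees `lv_data` as a single 120-block volume even though its blocks live on two disks, which is the flexibility the LVM layer is meant to provide.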
File System:
A file is a collection of related records or data stored as a unit with a name.
A file system is a hierarchical structure of files.
A file system enables easy access to data files residing within a disk drive, a disk partition, or
a logical volume.
It provides users with the functionality to create, modify, delete, and access files.
Access to files on the disks is controlled by the permissions assigned to the file by the owner,
which are also maintained by the file system.
A file system organizes data in a structured hierarchical manner via the use of directories,
which are containers for storing pointers to multiple files.
All file systems maintain a pointer map to the directories, subdirectories, and files that are
part of the file system.
Examples of common file systems are:
FAT32 (File Allocation Table) for Microsoft Windows
NT File System (NTFS) for Microsoft Windows
UNIX File System (UFS) for UNIX
Extended File System (EXT2/3) for Linux
The file system also includes a number of other related records, which are collectively
called the metadata.
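The idea of directories as containers of pointers to files and subdirectories can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the `File`, `Directory`, and `resolve` names are hypothetical, and a real file system stores these structures on disk along with permissions and other metadata.

```python
class File:
    def __init__(self, name, data=b""):
        self.name = name
        self.data = data


class Directory:
    def __init__(self, name):
        self.name = name
        self.entries = {}  # entry name -> File or Directory (the "pointers")

    def add(self, node):
        self.entries[node.name] = node
        return node


def resolve(root, path):
    """Walk the hierarchy from the root directory to the node named by path."""
    node = root
    for part in path.strip("/").split("/"):
        if not part:
            continue  # an empty part means the path was just "/"
        if not isinstance(node, Directory) or part not in node.entries:
            raise FileNotFoundError(path)
        node = node.entries[part]
    return node


root = Directory("")
home = root.add(Directory("home"))
home.add(File("notes.txt", b"hello"))
print(resolve(root, "/home/notes.txt").data)  # b'hello'
```

Looking up a file means following one pointer per path component, which is why the file system must keep its map of directories and files consistent.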
Individual VMs can be restarted, upgraded, or even crashed, without affecting the
other VMs.
VMs can be copied or moved from one physical machine to another without causing
application downtime.
Connectivity:
Connectivity refers to the interconnection between hosts or between a host and
peripheral devices, such as printers or storage devices.
Connectivity and communication between host and storage are enabled using:
Physical components
Interface protocols
Physical Components of Connectivity:
The physical components of connectivity are the hardware elements that connect the
host to storage.
Three physical components of connectivity between the host and storage are
The host interface device
Port
Cable
A host interface device or host adapter connects a host to other hosts and storage devices.
Examples: host bus adapter (HBA) and network interface card (NIC).
An HBA is an application-specific integrated circuit (ASIC) board that performs I/O
interface functions between the host and storage, relieving the CPU from additional
I/O processing workload.
IDE/ATA is a popular interface protocol standard used for connecting storage devices,
such as disk drives and CD-ROM drives.
This protocol supports parallel transmission and therefore is also known as Parallel
ATA (PATA) or simply ATA.
IDE/ATA has a variety of standards and names.
The Ultra DMA/133 version of ATA supports a throughput of 133 MB per second.
In a master-slave configuration, an ATA interface supports two storage devices per
Fibre Channel is a widely used protocol for high-speed communication to the storage
device.
The Fibre Channel interface provides gigabit network speed.
It provides serial data transmission that operates over copper wire and optical fiber.
The latest version of the FC interface (16FC) allows transmission of data up to 16 Gb/s.
Internet Protocol (IP):
IP is a network protocol that has been traditionally used for host-to-host traffic.
With the emergence of new technologies, an IP network has become a viable option for
host-to-storage communication.
IP offers several advantages:
Cost
Maturity
Ability for organizations to leverage their existing IP-based network
iSCSI and FCIP are common examples of protocols that leverage IP for host-to-storage
communication.
Storage:
Storage is a core component in a data center.
A storage device uses magnetic, optical, or solid-state media.
Disks, tapes, and diskettes use magnetic media.
CDs and DVDs use optical media.
Removable flash memory and flash drives use solid-state media.
In the past, tapes were the most popular storage option for backups because of their low cost.
Tapes have various limitations in terms of performance and management, as listed below:
To access data, the actuator arm moves the R/W head over the platter to a particular track
while the platter spins to position the requested sector under the R/W head.
The time taken by the platter to rotate and position the data under the R/W head is called
rotational latency.
This latency depends on the rotation speed of the spindle and is measured in milliseconds.
Average rotational latency = One-half of the time taken for a full rotation.
o Approx. 5.5 ms for a 5400-rpm drive
o Approx. 2.0 ms for a 15000-rpm (250 rps) drive:
Average rotational latency = 0.5/250 s = 2 ms
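The two figures above follow directly from the definition of average rotational latency as half of one full rotation. A short Python check, where the function name is chosen just for this illustration:

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Return half the time of one full rotation, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    rotations_per_second = rpm / 60.0
    full_rotation_ms = 1000.0 / rotations_per_second
    return full_rotation_ms / 2.0


for rpm in (5400, 15000):
    print(f"{rpm} rpm: {average_rotational_latency_ms(rpm):.2f} ms")
# 5400 rpm: 5.56 ms
# 15000 rpm: 2.00 ms
```

A faster spindle shortens the wait for the requested sector to arrive under the R/W head, which is why higher-rpm drives show lower rotational latency.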
3. Data Transfer Rate:
The data transfer rate (also called transfer rate) refers to the average amount of data per unit
time that the drive can deliver to the HBA.
In a read operation, the data first moves from disk platters to R/W heads, and then it moves
to the drive’s internal buffer. Finally, data moves from the buffer through the interface to the
host HBA.
In a write operation, the data moves from the HBA to the internal buffer of the disk drive
through the drive’s interface.
The data then moves from the buffer to the R/W heads. Finally, it moves from the R/W heads
to the platters.