Choosing and Architecting Storage For Your Environment
Lucas Nguyen, Technical Alliance Manager
Mike DiPetrillo, Specialist Systems Engineer
Agenda
Storage Mechanisms
Technology      Market          Transfers      Interface    Performance
Fibre Channel   Data Center                    FC HBA       High (due to dedicated network)
NAS             SMB                            NIC          Medium (depends on integrity of LAN)
iSCSI           SMB                            iSCSI HBA    Medium (depends on integrity of LAN)
DAS             Branch Office   Block access   SCSI HBA
DAS: Direct Attached Storage
NAS: Network Attached Storage
SAN: Storage Area Network
DAS: Tape / RAID; S/W Cluster
NAS: Tape / RAID; NIC failover; S/W Cluster; Filer Cluster; LAN backup; Data Replication
SAN: Tape / RAID; HBA / SP failover; Fabric / ISL redundancy; Data Replication technologies; S/W Cluster within Virtual Machine; LAN backup within Virtual Machine; VMware HA; VMware Consolidated Backup
Choosing Disks
VM performance is ultimately gated by IOPS density and storage space.
IOPS density = number of read IOPS per GB (higher is better)

Tier1: 144 GB, 15k RPM: 180 IOPS / 144 GB = 1.25 IOPS/GB
Tier2: 300 GB, 10k RPM: 150 IOPS / 300 GB = 0.5 IOPS/GB
Tier3: 500 GB, 7k RPM: 90 IOPS / 500 GB = 0.18 IOPS/GB

Relative Performance
Tier1: 1.0
Tier2: 0.4 (40%)
Tier3: 0.14 (14%)
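The tier arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the capacities and read IOPS figures are the ones from the slide, while the `DiskTier` class and its field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskTier:
    """One disk tier; the class and field names are illustrative only."""
    name: str
    capacity_gb: float
    read_iops: float

    @property
    def iops_density(self) -> float:
        # Read IOPS available per GB of capacity (higher is better).
        return self.read_iops / self.capacity_gb


# Figures taken from the slide.
tiers = [
    DiskTier("Tier1 (144 GB, 15k RPM)", 144, 180),
    DiskTier("Tier2 (300 GB, 10k RPM)", 300, 150),
    DiskTier("Tier3 (500 GB, 7k RPM)", 500, 90),
]

# Relative performance is each tier's density divided by Tier1's density.
baseline = tiers[0].iops_density
for tier in tiers:
    relative = tier.iops_density / baseline
    print(f"{tier.name}: {tier.iops_density:.2f} IOPS/GB, relative {relative:.2f}")
```

Swapping in the capacity and IOPS figures of the disks you are actually considering gives the same comparison for your own tiers.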
Volume Aggregation
Stripe the virtual LUN across volumes from multiple RAID 5 groups.
Some storage platforms can only concatenate, but striping is preferred.
Aggregate across volumes in the same ZBR zone.
Do not mix volumes with different disk sizes, rotational speeds, or volume sizes.
It is OK, and preferred, to stripe within the same volume groups.
The end result is one LUN presented to VMware that spans many physical disks.
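Why striping is preferred over concatenation can be seen from how each scheme maps a logical block to a backing volume. The sketch below is a simplified model, not any array's actual layout: the function names, the 128-block stripe size, the four equally sized volumes, and the 1,000,000-block volume size are all assumptions chosen for illustration.

```python
from typing import Tuple


def striped_location(lba: int, num_volumes: int, stripe_blocks: int) -> Tuple[int, int]:
    """Map a logical block to (volume index, block within volume) under striping."""
    stripe_index, offset = divmod(lba, stripe_blocks)
    row, volume = divmod(stripe_index, num_volumes)
    return volume, row * stripe_blocks + offset


def concatenated_location(lba: int, volume_blocks: int) -> Tuple[int, int]:
    """Map a logical block to (volume index, block within volume) under concatenation."""
    return divmod(lba, volume_blocks)


# Hypothetical layout: 4 volumes, 128-block stripes, 1,000,000 blocks per volume.
first_blocks = range(1024)
striped_volumes = {striped_location(b, 4, 128)[0] for b in first_blocks}
concat_volumes = {concatenated_location(b, 1_000_000)[0] for b in first_blocks}

print("Volumes touched by the first 1024 blocks (striped):", sorted(striped_volumes))
print("Volumes touched by the first 1024 blocks (concatenated):", sorted(concat_volumes))
```

In this model a small, hot region of the LUN is spread over every volume when striped, but sits on a single volume when concatenated.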
Service time: time for the disk to complete requests
Response time (or svc_t) = wait time in queue + service time
actv = I/Os active in the device
wsvc_t = average wait queue response time
asvc_t = average run queue response time
SCSI is a connect/disconnect protocol, so the array can make certain optimizations.
Wait queue: I/Os buffering in the HBA/sd queue (bad)
Active queue: I/Os buffered in the storage array
Service queue: I/Os being serviced on the disk (read miss) or in cache (read hit or fast write)

Array writes: written to hardware cache, then destaged to disk with SCSI write buffering disabled
Array reads: the array can reorder reads to minimize storage contention
SCSI tag queuing can optimize reads on active disks
Busy, but not backed up into the HBA wait queue.
Average I/O takes 80-100 ms, which is very slow (>50 ms).
iostat columns: r/s  w/s  kr/s  kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
Chart panels: Utilization, Throughput (IOPS), Av Read Sz (K), Serv Time
0 0 0 0 0 0
88 84 84 92 89 88
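The queue definitions and the 50 ms threshold above can be turned into a quick check on one device sample. This is only a sketch: the `DeviceSample` class, the `diagnose` helper, the 80% busy cutoff, the device name, and the sample values are all hypothetical (the values merely resemble the "busy but not queued" case described above).

```python
from dataclasses import dataclass
from typing import List

SLOW_MS = 50.0        # from the slide: responses above 50 ms are very slow
BUSY_PERCENT = 80.0   # assumed cutoff for "busy"; not from the slide


@dataclass
class DeviceSample:
    """One row of extended disk statistics; field names mirror the column headers."""
    device: str
    wait: float    # average number of I/Os in the wait queue
    actv: float    # average number of I/Os active in the device
    wsvc_t: float  # average wait queue response time, ms
    asvc_t: float  # average run queue response time, ms
    pct_w: float   # %w: percent of time I/Os are waiting
    pct_b: float   # %b: percent of time the device is busy

    @property
    def response_ms(self) -> float:
        # Response time = wait time in queue + service time.
        return self.wsvc_t + self.asvc_t


def diagnose(sample: DeviceSample) -> List[str]:
    """Return short findings for one device sample."""
    findings = []
    if sample.wait > 0 or sample.pct_w > 0:
        findings.append("backed into the wait queue")
    if sample.pct_b >= BUSY_PERCENT:
        findings.append("busy")
    if sample.response_ms > SLOW_MS:
        findings.append("slow")
    return findings


# Hypothetical sample: no wait queue, 88% busy, 90 ms per I/O.
sample = DeviceSample("disk0", wait=0.0, actv=8.0, wsvc_t=0.0, asvc_t=90.0,
                      pct_w=0.0, pct_b=88.0)
print(sample.device, sample.response_ms, diagnose(sample))
```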
VMware
A LUN/VMFS volume is active on one path only (active/passive arrays).
A VMFS volume is much larger than a typical OS LUN.
Use WWPN zoning, and zone the initiator (HBA) to the FA (storage port) in a 1:1 relationship.
This minimizes RSCN disruptions, device login/logout (LI/LO), and host-based failover confusion.
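The 1:1 rule can be expressed as "every zone has exactly two members: one HBA port and one storage port". The sketch below builds such a zone list from explicit pairs. It does not produce any switch vendor's command syntax, and the port names, WWPNs, and zone naming scheme are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical WWPNs; replace with the real port names from your fabric.
wwpns: Dict[str, str] = {
    "esx01_hba0": "10:00:00:00:c9:00:00:01",
    "esx01_hba1": "10:00:00:00:c9:00:00:02",
    "array_fa0": "50:06:04:80:00:00:00:a0",
    "array_fa1": "50:06:04:80:00:00:00:b0",
}

# Each pair is one initiator (HBA) zoned to one storage port (FA).
pairs: List[Tuple[str, str]] = [
    ("esx01_hba0", "array_fa0"),
    ("esx01_hba1", "array_fa1"),
]


def build_zones(pairs: List[Tuple[str, str]], wwpns: Dict[str, str]) -> Dict[str, List[str]]:
    """Return one two-member WWPN zone per (initiator, target) pair."""
    zones = {}
    for initiator, target in pairs:
        zones[f"{initiator}__{target}"] = [wwpns[initiator], wwpns[target]]
    return zones


zones = build_zones(pairs, wwpns)
for name, members in zones.items():
    print(name, members)
```

Because no zone ever holds a second initiator or a second target, a state change on one port is only seen by its single zone partner.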
CASE STUDY
Background
Architecture can have huge performance implications.
Every environment will be different.
Use tests in your environment to find bottlenecks.
Tests Run
IOMeter
70% Random, 70% Read, 64k Block
5 Minute run
10 GB disk
Configuration             IOPS   MB/s   Avg response (ms)   % CPU (total)
Fibre Channel Student Results
  VMFS
  RDM
Fibre Channel Pre-Run
  VMFS                    3294   206    1.21                33.87%
  RDM                     3353   209    1.19                27.26%
iSCSI Pre-Run
  VMFS                    1813   113    2.20                24.00%
  RDM                     1865   116    2.14                19.40%
NAS Pre-Run
  VMDK                    1691   105    2.36                23.00%
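The IOPS and throughput columns of the pre-run results are tied together by the 64k block size of the test profile. The sketch below checks that relationship. It assumes 1 MB = 1024 KB, and the function and dictionary names are illustrative; the IOPS and MB/s figures are the pre-run values reported above.

```python
BLOCK_KB = 64  # block size from the test profile


def throughput_mb_s(iops: float, block_kb: int = BLOCK_KB) -> float:
    """Throughput in MB/s for a given IOPS rate, assuming 1 MB = 1024 KB."""
    return iops * block_kb / 1024


# (IOPS, reported MB/s) from the pre-run results.
pre_run = {
    "Fibre Channel VMFS": (3294, 206),
    "Fibre Channel RDM": (3353, 209),
    "iSCSI VMFS": (1813, 113),
    "iSCSI RDM": (1865, 116),
    "NAS VMDK": (1691, 105),
}

for name, (iops, reported) in pre_run.items():
    computed = throughput_mb_s(iops)
    print(f"{name}: {computed:.1f} MB/s computed, {reported} MB/s reported")
```

Each computed value lands within 1 MB/s of the reported figure, so for a fixed block size the two columns carry the same information.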
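Results

Configuration             IOPS   MB/s   Avg response (ms)   % CPU (total)
Fibre Channel Student Results
  VMFS                    1894   110    1.19                22.73%
  RDM                     1868   113    1.24                21.72%
Fibre Channel Pre-Run
  VMFS                    3294   206    1.21                33.87%
  RDM                     3353   209    1.19                27.26%
iSCSI Pre-Run
  VMFS                    1813   113    2.20                24.00%
  RDM                     1865   116    2.14                19.40%
NAS Pre-Run
  VMDK                    1691   105    2.36                23.00%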
Analysis
iSCSI and NAS give good performance.
Tier your storage.
RDMs do not always give better performance than VMFS.
Fibre Channel IOPS (student, pre-run): (1894, 3294) for VMFS; (1868, 3353) for RDM
Analysis
Located a potential bottleneck: the SP (storage processor) path.
Questions?
http://www.vmware.com/vmtn/vmworld/sessions/
Enter the following to download (case-sensitive):
Some or all of the features in this document may be representative of feature areas under development. Feature commitments must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery.