
Good Performance of Storage systems with IBM i


Jana Jamsek, ATS Europe
08/13/2012
Agenda

 Customer considerations
 Storage systems that connect to IBM i
 IBM i architecture and external storage
 Sizing guidelines for Storage systems with IBM i
 Demo of Disk Magic for IBM i


Typical Customers’ considerations

 Daily workload
– Interactive transaction workload
Request: Short transaction response time

 Batch jobs running during the night
– Typically many jobs that run brief transactions with database operations
– Copying of libraries
– Saving to tape
Request: Duration of batch jobs within certain limits

 Typical request: IBM i on external storage should perform as well as on internal disk

IBM i and Storage systems

[Diagram: storage systems that connect to IBM i – DS8000, DS6000, DS5000, DS4000, XIV, SVC, Storwize V7000 (with SSD options), ProtecTIER, and tape libraries & drives – attached either natively or through VIOS]
Note: The shown connections with VIOS mean VIOS VSCSI

To properly plan and size storage systems for IBM i, it is important to understand the IBM i specific architecture

IBM i specifics: Single-Level Storage


IBM i specifics: Handling Input Output operations


Storage Manager – Handling IO operations

[Diagram: storage management directories and page tables map a disk and sector – on internal disk or on a LUN on external storage – to the virtual addresses of pages]

IO flow – Internal disk drives

[Diagram: IO flows from the processors over the MS bus to main storage, then over the PCI-X IO bus to an IOA with cache (e.g. #5903, #5904) and on to the internal disk drives (e.g. SFF SAS disk drives)]

IO flow – Natively connected external storage

[Diagram: on the POWER side, IO flows from the processors and main storage (IBM i SLIC IOA DD) over the HSL / 12x loop and PCI-X to FC IOAs (e.g. #5735), through the SAN to the storage system's HBAs, processors and cache, and over PCI-e to the device adapters (DA) and the FC DDMs that hold the LUN]
Note: the connections in the external storage show a DS8000

IO flow – Storage system connected with VIOS

[Diagram: the IO request flows from IBM i through VIOS on the Power hardware to the storage system, and the data flows back the same way]

IO flow – virtualization with SVC

[Diagram: the IBM i client on POWER connects over VSCSI to redundant VIOS partitions; the SVC presents vdisks from a storage pool built on mdisks, which reside on the physical disk drives of the background storage system]

SCSI command tag queuing

[Diagram: IBM i LPARs reaching LUNs natively through IOP-less adapters, through virtual FC (NPIV), and through VIOS virtual SCSI]

 Native (IOP-less adapter): max 6 IO operations to a LUN at the same time
 NPIV: max 6 IO operations at the same time
 VSCSI: max 32 IO operations at the same time

Sector conversion

 Integrated disks in IBM i support 520 bytes per sector
 DS8000 supports 520 bytes per sector
– FC attached, or attached with VIOS_NPIV
 Midrange storage supports 512 bytes per sector. Therefore, the 520-bytes-per-sector data layout must be converted to 512 bytes per sector.
 For every page in IBM i (8 * 512-byte sectors) an extra 512-byte sector is allocated. The extra sectors contain the information previously stored in the 8-byte sector headers.
 The usable capacity in midrange storage is therefore multiplied by 8/9, which means it is reduced by about 11% when the logical volumes are reported to IBM i (see the sketch below).


Sizing Guidelines


IBM i performance data

 It is good to use IBM i performance data to apply the sizing guidelines even before modelling with Disk Magic

 Needed reports for sizing external storage:
– System report / Disk utilization & Storage pool utilization
– Resource report / Disk utilization
– Component report / Disk activity

 Desired way of collecting:
– Collect performance data during 24 hours on 3 consecutive days, and during a heavy end-of-month job
– Collection interval: 5 minutes

 Insert the reports into Disk Magic to obtain the data in an Excel spreadsheet


Example: Spreadsheet by Disk Magic

[Chart: IO rate by server (i1_ASP1, i2_ASP1) and KB per IO for both systems, plotted over interval start times spanning roughly three days]

Determine the peaks - continue


 Graphs from Disk Magic spreadsheet
 Reports from 7 consecutive days
 1 Oct is a candidate for the peak

[Chart: I/O rate over interval times, Oct/01/2009 – Oct/07/2009]
Determine the peaks - continue

 Graphs from Disk Magic spreadsheet

[Chart: I/O rate over interval times on Sep/16/2011, with the peak in IO/sec marked]

Determine the peaks - continue

 The read/write ratio is important for determining the peak, because of the RAID penalty on write operations

[Chart: reads/sec and writes/sec over interval start times, with the peak for disk utilization marked]

Component report: present cache hits


Why is the number of disk drives (DDMs) important

 Each LUN spans multiple disk drives (DDMs)
 The same set of DDMs is used by multiple LUNs
 The more physical disk drives (disk arms) are available, and the higher their rotation speed, the faster the IO on each LUN

Example:
[Diagram: LUNs in DS8000]

 The RAID level of the DDMs usually influences performance too


Guidelines for RAID level


 RAID-10 provides better resiliency
 RAID-10 generally provides better performance:
– RAID-5 results in 4 disk operations per write – higher penalty
– RAID-10 results in 2 disk operations per write – lower penalty
 RAID-10 requires more capacity
 In DS8000 use RAID-10 when:
– There are many random writes
– Write cache efficiency is low
– The workload is very heavy
 In midrange storage and Storwize V7000 we recommend using RAID-10

DS8800: Number of ranks - continue
 Quick calculation, based on the detailed calculation shown on the backup chart

Assumed cache hits:
20% read hit
30% write efficiency

[Table: IO/sec per rank]

 Example: 9800 IO/sec with a read/write ratio of 50/50 needs 9800 / 982 ≈ 10 RAID-10 ranks of 15 K rpm disk drives, connected with IOP-less adapters
 The table can be found in the Redbook IBM System Storage DS8000: Host Attachment and Interoperability, SG24-8887-00
 Detailed calculation: (reads/sec – read cache hits %) + 2 * (writes/sec – write cache efficiency) = disk operations/sec (on disk); a sketch follows below
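A minimal Python sketch of the detailed calculation, assuming the per-rank figure of 982 IO/sec from the example above (function names are illustrative):

import math

def disk_ops_per_sec(reads_per_sec, writes_per_sec,
                     read_hit_pct, write_eff_pct, write_penalty=2):
    """Disk operations/sec behind the cache (detailed calculation).

    Reads that hit the cache never reach the disks; each destaged write
    costs `write_penalty` physical operations (2 for RAID-10, 4 for RAID-5).
    """
    read_misses = reads_per_sec * (1 - read_hit_pct / 100)
    destages = writes_per_sec * (1 - write_eff_pct / 100)
    return read_misses + write_penalty * destages

# Quick calculation from the slide: 9800 host IO/sec divided by the
# tabulated 982 IO/sec per RAID-10 rank of 15 K rpm drives.
print(math.ceil(9800 / 982))  # -> 10 ranks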


DS5000/4000/3000: Number of disk drives

 Detailed calculation for maximal IO/sec on disk in RAID-10:
(reads/sec – read cache hits %) + 2 * (writes/sec – write cache efficiency) = disk operations/sec (on disk)

 Quick calculation – IO/sec per DDM:

                          70% Read    50% Read
15 K RPM disk drive
RAID-1 or RAID-10             82          74
RAID-5                        58          45
10 K RPM disk drive
RAID-1 or RAID-10             55          49
RAID-5                        39          30

Example: 7000 IO/sec with a read/write ratio of 70/30 needs 7000 / 82 ≈ 85 15 K RPM disk drives in RAID-10
Storwize V7000: Number of disk drives
 Quick calculation – IO/sec per disk drive:

                          70% Read    50% Read
15 K RPM disk drive
RAID-1 or RAID-10            138         122
RAID-5                        96          75
10 K RPM disk drive
RAID-1 or RAID-10             92          82
RAID-5                        64          50

Example: 7000 IO/sec with a read/write ratio of 70/30 needs 7000 / 138 ≈ 50 15 K RPM disk drives in RAID-10 (see the sketch below)
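A minimal Python sketch of the quick calculation, using the 70% read columns from the two tables above (the dictionary layout is an illustrative assumption):

import math

# IO/sec per 15 K RPM disk drive for a 70% read workload, from the tables.
IOPS_PER_DDM = {
    ("DS5000", "RAID-10"): 82, ("DS5000", "RAID-5"): 58,
    ("V7000", "RAID-10"): 138, ("V7000", "RAID-5"): 96,
}

def drives_needed(host_io_per_sec, system, raid_level):
    """Quick calculation: number of disk drives for a given host IO rate."""
    return math.ceil(host_io_per_sec / IOPS_PER_DDM[(system, raid_level)])

print(drives_needed(7000, "DS5000", "RAID-10"))  # 86 (the slide rounds to ~85)
print(drives_needed(7000, "V7000", "RAID-10"))   # 51 (the slide rounds to ~50)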


Number of DDMs, connected with VIOS and SVC

 The sizing guidelines and calculations for DDMs in Storage systems connected with VIOS or
VIOS_NPIV don’t change

 The sizing guidelines and calculations for DDMs in Storage systems connected with SVC and
VIOS don’t change


Sizing for big blocksizes (transfer sizes)

 Big blocksize: 64 KB and above
 Add about 25% more disk arms for big blocksizes
 The guidelines shown assume a small blocksize (about 12 KB)
 The peak in IO/sec usually occurs with small blocksizes
 The peak in blocksize typically comes with a low IO rate
 So we usually size for the peak in IO/sec and don't add the additional 25%


Why is the number of LUNs important

 SCSI command tag queuing allows 6 concurrent IO operations to a LUN
– Available with IOP-less adapters
– The more LUNs, the more concurrent operations to disk storage
 In VIOS, the queue depth for an IBM i volume is 32
 The more LUNs are defined, the more storage management server tasks are used, and consequently the better the IO performance (see the sketch below)
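A minimal Python sketch of how the LUN count bounds concurrency, using the queue depths stated above (the helper is hypothetical):

# Queue depths per LUN from the slides: 6 for native IOP-less and NPIV
# connections, 32 for VIOS VSCSI.
QUEUE_DEPTH = {"native": 6, "npiv": 6, "vscsi": 32}

def max_concurrent_ios(num_luns, connection):
    """Upper bound on IO operations in flight to disk storage."""
    return num_luns * QUEUE_DEPTH[connection]

print(max_concurrent_ios(16, "native"))  # 96
print(max_concurrent_ios(16, "vscsi"))   # 512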


Number and size of LUNs

 With a given disk capacity: the bigger the number of LUNs, the smaller their size
 There are sizing guidelines for the number, or for the size
 To obtain the number of LUNs you may use WLE (number of disk drives)
 Considerations for a very big number of LUNs:
– Many physical adapters are needed for natively connected storage
– A big number of virtual adapters in VIOS is difficult to manage and troubleshoot

Number and size of LUNs - continue

 DS8000 guideline by best practice:
– 2 * size of LUN = 1 * size of DDM, or
– 4 * size of LUN = 1 * size of DDM
– Presently 70 GB or 140 GB LUNs are mostly used

 Storwize V7000, SVC:
– Presently we recommend about 140 GB LUNs
– This recommendation is based on best practice with other midrange storage systems
– Recommended to create vdisks in striped mode (default)
– Recommended extent size 256 MB (default)

 DS5000 best practice:
– 146 GB physical disks
– 70 GB or 140 GB LUNs
– Make RAID-1 arrays of two physical disks and create one logical drive per RAID-1 array
– Create one LUN per array
– Recommended segment size 128 KB or 64 KB
– If the number of LUNs is limited to 16 (for example, connecting to IBM i on BladeCenter) you may want to make a RAID-10 array of four physical disks and create one LUN per array


Number and size of LUNs - continue

 XIV – best practice for the size of LUNs
 Measurements in Mainz:
– CPW 96000 users, 2 concurrent runs in different LPARs
– 15-module XIV Gen 3

                   Transaction resp.   Disk service   Latency     % Cache hits
                   time (sec)          time (ms)      in XIV      in XIV
42 * 154 GB LUNs          6.6               4.6           4            80
6 * 1 TB LUNs            19.3               8.2           7            60

70 GB LUNs were not tested

Recommendation: about 140 GB LUNs, or 70 GB LUNs


Size of LUNs - Guidelines for different types of connection

 The listed guidelines for a particular storage system apply to all cases (when applicable):
– Native connection
  Sizing for physical FC adapters applies to natively connected storage
– Connection with VIOS VSCSI
– Connection with VIOS_NPIV
– Connection via SVC
  The size of LUNs applies to SVC vdisks


Why is Multipath important

 All paths to external storage are used for IO operations
– IOs are done round-robin across all paths
– The exception is DS5000, where one path through the preferred controller is active and all other paths are passive


Sizing FC adapters in IBM i – by IO/sec


                              #5774 or #5749      #5735, #5273, #5276
IO/sec at 70% utilization     10500 per port      12250 per port
GB per port in adapter        2800                3266

Assumed: Access Density = 1.5. For 2 paths: multiply the capacity by 2.

 Example: for one port in #5735 we recommend 3266 / 70 ≈ 46 * 70 GB LUNs
 Example: for 2 ports in multipath we recommend 2 * 46 = 92 -> 64 * 70 GB LUNs in multipath
– Therefore use 64 LUNs
A sketch of this calculation follows below.
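A minimal Python sketch of the per-port LUN calculation above (the table values assume Access Density = 1.5; the function is illustrative):

import math

# GB per adapter port from the table above.
GB_PER_PORT = {"#5774/#5749": 2800, "#5735": 3266}

def luns_per_port(adapter, lun_size_gb=70):
    """Recommended LUN count for one port, as in the slide's example."""
    return math.floor(GB_PER_PORT[adapter] / lun_size_gb)

print(luns_per_port("#5735"))  # 46, matching 3266 / 70 on the slide
# For 2 ports in multipath the slide doubles this (2 * 46 = 92) and then
# settles on 64 LUNs.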

Throughput of IOP-less adapters

                                              #5774 or #5749        #5735 (also #5273, #5276)
Max sequential throughput                     310 MB/sec per port   400 MB/sec
Avg of max sequential throughput for
4 KB, 256 KB reads, writes                    216 MB/sec per port   382 MB/sec per 2 ports
Avg of min sequential throughput for
4 KB, 256 KB reads, writes                    132 MB/sec per port   208 MB/sec per 2 ports
Max transaction workload throughput           250 MB/sec per port
Transaction workload throughput at
70% utilization                               175 MB/sec per port


Adapters on HSL and 12x loops

The following table shows the recommended number of adapters per half loop:

[Table shown as an image in the original presentation]


Sharing or dedicating ranks

 Sharing ranks among multiple IBM i LPARs:
– Enables better usage of the resources in external storage
– On the other hand, the performance of an LPAR might be influenced by the workloads in other LPARs
 Dedicating ranks to each LPAR:
– Enables stable performance (no influence from other systems)
– Resources are not as well utilized as with shared ranks
 Best practice:
– Dedicate ranks to big and/or important systems
– Share ranks among medium and small LPARs


Why is cache important


 Read operations:
 A directory lookup is performed to determine whether the requested track is in cache. If the requested track is not found in the cache, the corresponding disk track is staged to cache. Address translation is set up to map the cache image to the host adapter; the data is then transferred from cache to the host adapter and on to the host connection.
 The more data is read from cache, the faster the read response time


Why is cache important


 Write operations:
 Every write operation is done to cache
 A directory lookup is performed to determine whether the requested track is in cache. If the requested track is not in cache, segments in the write cache are allocated for the track image. The data is then transferred from the host adapter memory to the two redundant write cache instances.
 Data is destaged to disk at a certain frequency by the cache algorithm
 Frequent destages cause a certain delay in write operations to cache
 Write cache overflow: data must be destaged in order to allocate a track in the cache to be written to
 Write cache efficiency: a number representing how many times a track is written to before being destaged to disk. 0% means a destage is assumed for every single write operation; a value of 50% means a destage occurs after the track has been written to twice (see the sketch below).
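A minimal Python sketch of one consistent reading of this definition (illustrative, and consistent with the sizing formulas elsewhere in this deck):

def destages_per_sec(writes_per_sec, write_eff_pct):
    """Destage rate implied by the write cache efficiency.

    0% efficiency means every write is destaged; 50% means a track is
    written twice per destage, halving the destage rate.
    """
    return writes_per_sec * (1 - write_eff_pct / 100)

print(destages_per_sec(1000, 0))   # 1000.0
print(destages_per_sec(1000, 50))  # 500.0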


Guidelines for cache size in external storage

 Modelling with Disk Magic

 Rough guidelines for DS8800


– 10 to 20 TB capacity: 64 GB cache
– 20 to 50 TB: 128 GB cache
– > 50 TB: 256 GB cache


Number of HA cards in DS8800

 Rules of thumb for HAs in DS8800:
 About 4 to 8 IBM i ports per HA card in DS8800
 For high performance: the number of HA cards should be equal to or greater than the number of device adapters in DS8800
 At least one HA card per IO enclosure in DS8800


Sizing for VIOS

 Use IBM i Workload Estimator

 Rule of thumb:
– 0.25 CPUs per 10,000 I/Os of the virtual SCSI client
– For the lowest I/O latency, preferably use a dedicated processor for VIOS
 1 GB of main memory for VIOS
 Two or more FC adapters assigned to VIOS for multipathing


Sizing IASP

 Very rough guideline: about 80% of IO will be done to the IASP
 System report – Resource utilization; IO to database


Sizing for SSD


 Easy Tier Storage Tier Advisor Tool (STAT)
 IBM i PEX collection and iDoctor PEX analyzer queries
 Consider the skew level of the IBM i workload

[Chart: skew level – percentage of workload (percent of small IO, percent of MB) versus percentage of active data, with examples from iDoctor and STAT]


Sizing for Metro Mirror or Global Mirror links

 Obtain the write rate in MB/sec
– Use queries on the IBM i collection data
– Or writes/sec * blocksize
 Based on the highest write rate, calculate the needed bandwidth as follows (a sketch follows below):

• Assume 10 bits per byte for network overhead

• Assume a maximum 80% utilization of the network

• Apply a 10% uplift factor to the result to account for peaks within the 5-minute intervals

• If the compression ratio of the devices for the remote links is known you may apply it; if it is not known you may assume 2:1 compression
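A minimal Python sketch of this bandwidth calculation (the example write rate is hypothetical):

def link_bandwidth_mbits(write_mb_per_sec, compression_ratio=2.0):
    """Needed remote-link bandwidth in megabits/sec.

    Follows the steps above: 10 bits per byte for network overhead,
    size the link for 80% maximum utilization, add a 10% uplift for
    peaks within the 5-minute intervals, then apply compression
    (assume 2:1 if the real ratio is unknown).
    """
    bits = write_mb_per_sec * 10   # network overhead: 10 bits per byte
    bits = bits / 0.80             # keep the network below 80% utilization
    bits = bits * 1.10             # 10% uplift for in-interval peaks
    return bits / compression_ratio

# Hypothetical example: a 40 MB/sec write peak needs about 303 Mbit/sec.
print(round(link_bandwidth_mbits(40), 1))  # 302.5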


Disk Magic Demo


Backup charts


Sizing the disk drives in external storage

 DS8800 recommended maximal disk utilization: 60%
– 15 K RPM SAS disk drives
– 10 K RPM SAS disk drives
– SSD
 DS5000 recommended maximal disk utilization: 45%
– 15 K RPM disk drives
– 10 K RPM disk drives
– SSD
 XIV
– Data modules
 Storwize V7000 recommended maximal disk utilization: 45%
– 15 K RPM SAS disk drives
– 10 K RPM SAS disk drives
– SSD


DS8800: Number of ranks


 Detailed calculation for maximal IO/sec on a RAID-5 rank:
(reads/sec – read cache hits %) + 4 * (writes/sec – write cache efficiency) = disk operations/sec (on rank)
 One 6+P 15 K rpm rank can handle a maximum of 2047 disk accesses/sec; at the recommended 60% utilization: 1228 disk ops/sec
 Divide the current disk accesses/sec by 1228
 Example: 261 reads/sec, 1704 writes/sec, 45% read cache hits, 24% write efficiency: (261 – 117) + 4 * (1704 – 409) = 5324; 5324 / 1228 = 4 to 5 ranks
 Recommended: 4 ranks
 The calculation is based on performance measurements in Storage development and the recommended % disk utilization
(A sketch of this calculation follows below.)
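A minimal Python sketch reproducing the example above (illustrative only):

import math

def raid5_rank_ops(reads, writes, read_hit_pct, write_eff_pct):
    """Disk operations/sec on a RAID-5 rank (write penalty of 4)."""
    return (reads * (1 - read_hit_pct / 100)
            + 4 * writes * (1 - write_eff_pct / 100))

# Slide example: 261 reads/sec, 1704 writes/sec, 45% read cache hits,
# 24% write cache efficiency.
ops = raid5_rank_ops(261, 1704, 45, 24)  # about 5324 disk ops/sec
ranks = ops / 1228                       # 1228 ops/sec per rank at 60% util
print(round(ops), math.ceil(ranks))      # 5324 5 (the slide recommends 4)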


DS8800: Number of ranks - continue

 Estimate the % read cache hits and % write cache efficiency from the present cache hits on internal disk
 Rough estimation by best practice:
– If the % cache hits is below 50%, estimate the same percentage on external storage
– If the % cache hits is above 50%, estimate half of this % on external storage
 If the cache hits are not known or you are in doubt, use the Disk Magic default estimation: 20% read cache hit, 30% write cache efficiency
