SRDF A

This module focuses on designing SRDF/A solutions.

We will look at the factors that impact SRDF/A design and then use BCSD and SymmMerge to design
SRDF/A solutions.

Copyright 2015 EMC Corporation. All rights reserved. Module: Designing SRDF/A Solutions 1
This lesson covers SRDF/A architecture and resiliency features with an emphasis on the
differences between Enginuity 5876 and HYPERMAX OS 5977+.

Let us review the legacy SRDF/A architecture, which applies to VMAX family arrays running
Enginuity 5876 and lower. SRDF/A’s architecture delivers replication over extended distances
with no impact on host response time. SRDF/A uses Delta Sets to maintain a group of
writes over a short period of time. Delta Sets are discrete buckets of data that reside in different
sections of the VMAX cache. Starting at 1, each Delta Set is assigned a numerical value that is
one more than the preceding one.

There are four types of Delta Sets to manage the data flow process in the legacy SRDF/A
architecture. The Capture Delta Set in the source VMAX (numbered N in this example), captures
(in cache) all incoming writes to the source volumes in the SRDF/A group. The Transmit Delta
Set in the source VMAX (numbered N-1 in this example), contains data from the immediately
preceding Delta Set. This data is being transferred to the remote VMAX. The Receive Delta Set in
the target system is in the process of receiving data from the transmit Delta Set N-1. The target
VMAX contains an older Delta Set, numbered N-2, called the Apply Delta Set. Data from the Apply
Delta set is being assigned to the appropriate cache slots ready for de-staging to disk. The data in
the Apply Delta set is guaranteed to be consistent and restartable should there be a failure of the
source VMAX.

In the legacy SRDF/A mode, the VMAX performs a cycle switch once data in the N-1 set is
completely received, data in the N-2 set is completely applied, and the minimum cycle time
has elapsed. The default minimum cycle time is 15 seconds from Enginuity 5875 onward. Prior to
that it was 30 seconds.

During the cycle switch, a new delta set (N+1) becomes the capture set, N is promoted to the
transmit/receive set and N-1 becomes the apply Delta Set.
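
The cycle switch conditions and the renumbering can be sketched in a few lines of Python. This is only a toy model of the numbering described above, not array code: the class name, method names, and the starting value N = 3 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LegacySession:
    """Toy model of legacy SRDF/A delta set numbering (illustrative names only)."""
    capture: int = 3                # N, the delta set collecting host writes
    min_cycle_time_s: float = 15.0  # default from Enginuity 5875 onward

    def roles(self):
        n = self.capture
        return {"capture": n, "transmit": n - 1, "receive": n - 1, "apply": n - 2}

    def try_cycle_switch(self, receive_complete, apply_complete, elapsed_s):
        """Switch only when all three conditions from the text hold."""
        if receive_complete and apply_complete and elapsed_s >= self.min_cycle_time_s:
            self.capture += 1  # N+1 becomes capture, so every role advances by one
            return True
        return False


session = LegacySession()
print(session.roles())
# The apply set has not finished, so no switch takes place.
session.try_cycle_switch(receive_complete=True, apply_complete=False, elapsed_s=20)
print(session.roles())
# All three conditions hold, so the switch takes place.
session.try_cycle_switch(receive_complete=True, apply_complete=True, elapsed_s=20)
print(session.roles())
```

The point of the sketch is that a slow receive or a slow apply blocks the switch even after the minimum cycle time has passed, which is why the capture set can keep growing.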

The slide depicts writes to the capture delta set as red dots. Overlapping dots indicate writes to
the same locations. Upon a cycle switch, SRDF/A has to send only the final version of repeated
writes. In this example, two locations are written to multiple times.

This is a continuation of the previous slide. A cycle switch has occurred, as indicated by the
change in the cycle numbers.

Capture is now numbered N+1, Transmit and Receive are numbered N, and Apply is numbered N-1.
The Transmit cycle shows only 4 red dots. Even though a total of 7 writes were performed in the
cycle, five of those writes went to just two locations, so after the cycle switch only the final
version of the data at the 4 unique locations has to be transmitted.
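
The 7-writes-to-4-locations example can be reproduced with a small Python sketch. The track labels and version strings are hypothetical, and a dictionary stands in for the cache slots of the capture delta set.

```python
def fold_writes(writes):
    """Keep only the final version of each location written during a capture cycle."""
    folded = {}
    for location, data in writes:
        folded[location] = data  # a later write to the same location replaces the earlier one
    return folded


# Seven host writes landing on four unique locations (hypothetical track labels).
# Five of the seven writes go to just two locations: track_a and track_d.
capture_cycle = [
    ("track_a", "v1"), ("track_b", "v1"), ("track_a", "v2"),
    ("track_c", "v1"), ("track_a", "v3"), ("track_d", "v1"),
    ("track_d", "v2"),
]
transmit_set = fold_writes(capture_cycle)
print(len(capture_cycle), "writes captured,", len(transmit_set), "tracks to transmit")
```

Only the last version of each track survives into the transmit set, which is why the transmit cycle on the slide shows 4 dots rather than 7.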

This is a continuation of the previous slide. One more cycle switch has occurred.

The Apply set is now numbered N, Transmit and Receive are numbered N+1, and Capture is N+2.

SRDF/A Multi-Cycle Mode (MCM) allows more than two delta sets on the R1 side. If both arrays in
the solution are running HYPERMAX OS, SRDF/A operates in multi-cycle mode. There can be 2 or
more cycles on the R1, but only 2 cycles on the R2 side. Cycle switches are decoupled from
committing delta sets to the next cycle.

When the preset Minimum Cycle Time is reached, the R1 data collected during the capture cycle is
added to the transmit queue and a new R1 capture cycle is started. There is no wait for the
commit on the R2 side before starting a new capture cycle. The transmit queue is a feature of
SRDF/A MCM. It provides a location for R1 captured cycle data to be placed so a new capture
cycle can occur.

The capture cycle will occur even if no data is transmitted across the link. If no data is
transmitted across the link the capture cycle data will again be added to the transmit queue. Data
in the transmit queue is committed to the R2 receive cycle when the current transmit cycle and
apply cycle are empty. The transmit cycle will transfer the data in the oldest capture cycle to the
R2 first and then repeat the process.
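
A minimal Python sketch of the transmit queue idea follows. It assumes a simple first-in, first-out queue on the R1 side; the class and method names are invented for illustration and do not correspond to any HYPERMAX OS interface.

```python
from collections import deque


class McmSource:
    """Toy R1 side in multi-cycle mode (illustrative names only)."""

    def __init__(self):
        self.capture = []              # writes collected in the current capture cycle
        self.transmit_queue = deque()  # closed capture cycles waiting for the link

    def host_write(self, track):
        self.capture.append(track)

    def on_min_cycle_time(self):
        """Close the capture cycle without waiting for the commit on the R2 side."""
        self.transmit_queue.append(self.capture)
        self.capture = []

    def next_to_transmit(self, r2_ready):
        """Release the oldest queued cycle once the R2 side can accept it."""
        if r2_ready and self.transmit_queue:
            return self.transmit_queue.popleft()
        return None


src = McmSource()
for cycle_writes in (["t1"], ["t2", "t3"], ["t4"]):
    for track in cycle_writes:
        src.host_write(track)
    src.on_min_cycle_time()        # cycle switches keep happening on the R1 side

print(len(src.transmit_queue), "cycles queued on R1")
print(src.next_to_transmit(r2_ready=False))  # nothing is sent while R2 is busy
print(src.next_to_transmit(r2_ready=True))   # the oldest cycle goes first
```

Because closing a capture cycle never waits on the R2 side, the queue absorbs link or apply delays as several small cycles instead of one large one.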

SRDF/A MCM is only supported if both the R1 and R2 are VMAX3 arrays. If either the R1 or the R2
array is not a VMAX3, cycling behaves as in previous versions of Enginuity. MCM supports
Single Session Consistency (SSC) and Multi Session Consistency (MSC).

In SRDF/A legacy mode, the cycle switch is coupled between the R1 and R2 arrays. A new capture
cycle cannot start until the transmit cycle completes its commit of data from the R1 side to the R2
side. Cycle switching can occur as often as the preset Minimum Cycle Time, but it can also take
longer since it is dependent on both the time it takes to transfer the data from the R1 transmit
cycle to the R2 receive cycle and the time it takes to mark the R2 apply cycle as write pending.
Delays in cycle switching can lead to large delta sets and thus large and unpredictable RPO on the
R2 side.

As discussed on the previous slide, in MCM when the preset Minimum Cycle Time is reached, the
R1 data collected during the capture cycle is added to the transmit queue and a new R1 capture
cycle is started. There is no wait for the commit on the R2 side before starting a new capture
cycle. Thus cycle switches are decoupled between the R1 and R2 arrays. Queuing allows smaller
cycles of data to be buffered on the R1 side and smaller delta sets to be transferred to the R2
side. The SRDF/A session can adjust to accommodate changes in the solution. If the SRDF link
speed decreases or the apply rate of the R2 side increases, more SRDF/A cycles can be queued on
the R1 side. The R2 side will still have two delta sets, the receive and the apply. SRDF/A MCM
increases the robustness of SRDF/A sessions and reduces DSE spillover.

During short-term network interruptions, Transmit Idle keeps the SRDF/A session active, allowing
for recovery that does not require user intervention. When there is an outage on the SRDF links,
the remote SRDF mirror remains Ready on the link even though no data is sent. This prevents
SRDF/A from terminating abnormally.

Transmit Idle is enabled by default when dynamic SRDF groups are created. When all SRDF links
are lost, SRDF/A still stays active. In SRDF/A legacy mode, cycle switching stops and the capture
cycle continues to grow. In SRDF/A multi-cycle mode the cycle switching continues on the R1
side. Multiple transmit delta sets accumulate on the source side.

Transmit Idle works seamlessly with SRDF/A Delta Set Extension. With SRDF/A MCM between
VMAX3 arrays, Delta Set Extension is enabled by default. DSE will use the designated Storage
Resource Pool. We will cover DSE shortly.

SRDF/A Delta Set Extension (DSE) provides a mechanism for augmenting the cache-based Delta
Set buffering mechanism of SRDF/A with a disk-based buffering ability. This extended Delta Set
buffering ability may allow SRDF/A to ride through larger and/or longer SRDF/A throughput
imbalances than would be possible with cache-based Delta Set buffering alone.

DSE works in tandem with Transmit Idle and Group Level Write Pacing.

On VMAX arrays running Enginuity 5876, DSE offloads data to a DSE pool that must be
configured, and DSE is disabled by default. On VMAX3 arrays, DSE offloads data into a designated
Storage Resource Pool. By default, the default SRP for FBA devices is used for DSE. DSE is
automatically enabled for SRDF/A between VMAX3 arrays.

On VMAX arrays running Enginuity 5876 or lower, SRDF/A DSE pools have to be configured.
SRDF/A DSE pools and save devices are managed in the same way as TimeFinder/Snap pools.

An RDF group can have at most one pool of each emulation. A single rdfa_dse pool can be
associated with more than one RDF group, similar to snap pools shared by multiple snap sessions.

The SRDF/A DSE Threshold sets the percentage of cache used for SRDF/A at which offloading from
cache to disk starts.

DSE must be enabled on both the source and target arrays. If DSE is enabled on only one side of a
link, SRDF/A recovery fails and SRDF/A drops, because the R2 side does not have enough cache to
hold the large, extended Transmit cycle.

With VMAX3, DSE pools no longer need to be configured by a user. Instead when SRDF/A spills
tracks it will use a Storage Resource Pool (SRP) designated for use by DSE. Autostart for DSE is
enabled by default on both the R1 and R2 sides. When running SRDF/A MCM, smaller cycles on the
R2 side eliminate the need for DSE on the R2 side. Autostart is enabled on the R2 side in case
there is a personality swap. Managing a DSE pool or associating a DSE pool with an SRDF group is
no longer needed with VMAX3 arrays.

If the array on one side of an SRDF device pair is running HYPERMAX OS and the other side is
running Enginuity 5876 or earlier, then SRDF/A sessions run in Legacy mode. DSE is disabled by
default on both arrays. EMC recommends that you enable DSE on both sides.

The SRDF/A Group-Level Write Pacing feature dynamically monitors and detects when the SRDF/A
I/O service rates are lower than the host write I/O rates to the R1 devices in an SRDF group.
Service rates refer to the transmit cycle rate between the Primary (R1) and Secondary (R2) VMAX
systems. This monitoring is done at the SRDF group level on the R1 side. The pacing algorithm
takes corrective action to slow down, or pace, host write I/O rates to match the slower SRDF/A
I/O service rates. Only host writes are paced. System calls, reads, and other commands are not
paced. SRDF/A Write Pacing helps to control the amount of cache used by SRDF/A. This can prevent
cache from being exhausted on the R1 side of the SRDF link, thereby keeping the SRDF/A sessions
alive. SRDF/A Write Pacing gives the user an additional method of extending the availability of
SRDF/A.

Enginuity monitors the apply rates (also known as the restore rates) to the R2 devices in an
active SRDF/A session when device-level write pacing has been configured and enabled. If the
R2 devices are sources of TimeFinder/Snap sessions and the apply rate for any R2 device is
slower than the host write rate to the corresponding R1 device, Enginuity slows down the
writes to those R1 devices. Device-level write pacing is also set on the RDF group; however,
only those devices that are involved in a TimeFinder/Snap session are paced. The pacing delay
and the cache threshold are inherited from the values set on the RDF group.

VMAX3 introduces enhanced group-level pacing. Enhanced group-level pacing paces host I/Os to
the DSE spill-over rate for an SRDF/A session.

When DSE is activated for an SRDF/A session, host-issued write I/Os are throttled so their rate
does not exceed the rate at which DSE can offload the SRDF/A session’s cycle data. The system
will pace at the spillover rate until the usable configured capacity for DSE on the SRP reaches its
limit. At this point, the system will pace at the SRDF/A session’s link transfer rate.
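
The two pacing targets described above can be written as a small decision function. This is a sketch of the stated behavior only; the function name, parameters, and the example rates are assumptions, and the real pacing algorithm is not published in this module.

```python
def paced_host_write_limit(dse_active, dse_capacity_exhausted,
                           spillover_rate_mb_s, link_rate_mb_s):
    """Return the rate host writes are paced to, or None when DSE is not active."""
    if not dse_active:
        return None  # this sketch does not model pacing before DSE activates
    if dse_capacity_exhausted:
        # The usable DSE capacity on the SRP is full: fall back to the link rate.
        return link_rate_mb_s
    # DSE still has room: pace to the rate at which it can offload cycle data.
    return spillover_rate_mb_s


# Hypothetical rates: DSE can spill 120 MB/s, the link can move 40 MB/s.
print(paced_host_write_limit(True, False, 120, 40))
print(paced_host_write_limit(True, True, 120, 40))
```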

Enhanced group-level pacing responds only to the spillover rate on the R1 side; it is not
affected by the spillover on the R2 side.

All existing pacing features are supported and can be utilized to keep SRDF/A sessions active.
Enhanced group-level pacing is supported between VMAX3 arrays and VMAX arrays running
Enginuity 5876 with fix 67492.

This lesson covered SRDF/A architecture and resiliency features with an emphasis on the
differences between Enginuity 5876 and HYPERMAX OS 5977+.

This lesson covers SRDF/A design considerations and modeling tools.

When designing an SRDF/A solution, a number of factors influence the design. The actual workload
and the expected RPO (Recovery Point Objective) determine the amount of cache, the bandwidth,
the number of RAs (Remote Adaptors) and the DSE configuration.

SRDF/A requires a delicate balance between bandwidth and cache. Asynchronous mode does not
introduce any host response issues, but requires additional cache to support peak production
workload.

In the example, the average write throughput is computed over the entire timeline. If one
assumes that the RDF link bandwidth is equal to this average, then during time periods T1 and
T3, extra cache would be required to buffer these writes.

In the previous slide, we stated that the inflow and outflow of writes into the system need to be
equal on average. We did not define the timeline on which this average needs to be calculated.
Obviously, there is going to be a different result when averaging the load across five minutes, an
hour, or a day. The peak time defines that timeline, and answers the question of “What is my
busiest five minutes, hour, etc.?” In other words, we use the peak time to evaluate the busiest
time period during the workload. For example, if a customer sets the peak time to be x minutes,
we will look for the busiest x minutes in the workload to determine the average write peak. This
average write peak will directly impact the link bandwidth, and the amount of cache required for
buffering the writes. Increasing the peak time will, in most cases, decrease the RDF link
bandwidth requirement (because we are averaging over a longer period of time), and increase the
amount of cache required (because more data will have to be buffered).
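
Finding the "busiest x minutes" amounts to taking the highest average over a sliding window. The sketch below assumes one sample per minute; the sample values are made up, and the function name is ours rather than anything from BCSD or SymmMerge.

```python
def busiest_window_average(samples_mb_s, window):
    """Return the highest average over any `window` consecutive samples."""
    if not 1 <= window <= len(samples_mb_s):
        raise ValueError("window must be between 1 and the number of samples")
    running = sum(samples_mb_s[:window])
    best = running
    for i in range(window, len(samples_mb_s)):
        # Slide the window one sample to the right.
        running += samples_mb_s[i] - samples_mb_s[i - window]
        best = max(best, running)
    return best / window


# Hypothetical write workload, one sample per minute, in MB/s.
workload = [10, 20, 90, 80, 30, 10, 10, 20, 40, 10]
for peak_time in (1, 2, 5, 10):
    avg = busiest_window_average(workload, peak_time)
    print(f"peak time {peak_time:2d} min -> average write peak {avg} MB/s")
```

With this data the average write peak falls from 90 MB/s at a 1-minute peak time to 32 MB/s at 10 minutes, which illustrates the trade-off in the text: a longer peak time lowers the bandwidth figure but leaves more data to be buffered in cache.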

It is important to understand that regardless of the collection tool, when collecting performance
data, the minimal interval of the collection will also determine the minimal peak time that the
model can take into account. For example, if the system collects data at a 10-minute interval,
the accuracy of the calculation cannot take into account less than that time interval, so the
minimal peak time will be 10 minutes.

The highest Peak I/O and duration must be identified to estimate SRDF/A operational cache
requirement. The design must verify that the link bandwidth can sustain the average write peak
throughput of the environment. Workload data is essential for a successful design.

The image depicts the effect of data collection intervals. If we were collecting data at 2-minute
intervals, we would see that at certain times of day the workload goes up to almost 90 MB/s. But
if we were to collect the same data at 10-minute intervals, the data would indicate a maximum
of only 70 MB/s, because averaging tends to smooth out the peaks.
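
The smoothing effect is easy to reproduce. The sketch below averages made-up 2-minute samples into 10-minute samples; the values were chosen so that the peaks resemble the figures quoted above and are not taken from the slide.

```python
def average_over_interval(samples, samples_per_interval):
    """Average consecutive groups of samples, as a coarser collection interval would."""
    return [
        sum(samples[i:i + samples_per_interval]) / samples_per_interval
        for i in range(0, len(samples) - samples_per_interval + 1, samples_per_interval)
    ]


# Hypothetical 2-minute samples in MB/s covering 20 minutes.
two_minute_mb_s = [60, 88, 75, 62, 65, 40, 45, 50, 42, 38]
ten_minute_mb_s = average_over_interval(two_minute_mb_s, 5)

print("peak at 2-minute intervals: ", max(two_minute_mb_s), "MB/s")
print("peak at 10-minute intervals:", max(ten_minute_mb_s), "MB/s")
```

The same 20 minutes of workload shows an 88 MB/s peak at the fine interval and a 70 MB/s peak at the coarse one, so a design based on the coarse data would underestimate the burst.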

Write folding is the term given to the phenomenon of repeated writes to the same track, resulting
in a decrease in the number of tracks to send. This feature conserves link bandwidth, but it is
hard to predict the extent to which a particular SRDF/A installation will benefit from it.

This example is intended to illustrate the logic behind EMC’s cache calculation tools. The example
shown does not account for varying I/O sizes, partial track writes, the effect of random vs.
sequential I/O patterns and locality of reference.

The assumptions for this example are as follows: SRDF/A in legacy mode, the average write
workload is less than 10 MB/s, the available RDF link bandwidth is 10 MB/s and the minimum
cycle time is set to 30s. As long as the workload does not exceed 10 MB/s, the SRDF/A cycle
time will be 30s and the amount of cache used by SRDF/A will be approximately 10 MB/s*30s =
300 MB. Now let us assume that the write workload becomes 20 MB/s for the next 15.5 minutes.
The table shows how the SRDF/A cycle times will be affected for the next few cycles.

Cycle 1 – Workload has just become 20MB/s, the transmit delta set would have at most 300 MB
of data from the previous cycle. This can be sent across in less than 30s and so the cycle switch
for the current cycle (1) can occur in 30s. In these 30s, the capture delta set would have filled up
with 30s*20MB/s = 600 MB.

Cycle 2 – Transmit set is 600 MB – will take 600/10 = 60s to transmit, so Cycle time will be 60s.
In these 60s capture delta set would have filled up with 60s*20MB/s = 1200 MB.

We can continue to do the math as shown in the table. Cycle 5 will be 8 minutes long, and in this
time the capture delta set would have collected 9.6 GB of data. The elapsed time at the end of
Cycle 5, measured from the time the workload increased to 20 MB/s, is 0.5+1+2+4+8 = 15.5
minutes. At the next cycle switch (6) the workload would be back below 10 MB/s, but the 9.6 GB
would still have to be transmitted, leading to a cycle time of 16 minutes. The capture delta set
would be less than 9.6 GB because the write rate would be below 10 MB/s now that the peak has
ended. In this example the cycle time went up to 16 minutes, with a cache requirement of almost
19.2 GB.

When the arrival rate of writes exceeds link capacity, cycle times go up exponentially. The longest
cycle occurs after the peak is over, much like the worst flooding in a river often occurs after the
rains have stopped. After the peak write rate is over, the cycle times decrease exponentially as
well, since each cycle takes only half as long as the previous one.
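
The table arithmetic can be replayed with a short simulation. It uses the same simplifications as the example (legacy mode, a 10 MB/s link, a 30s minimum cycle time, no write folding) and treats 1 GB as 1000 MB. The post-peak write rate of 5 MB/s is our assumption, chosen because it makes each later cycle half as long as the previous one; the text only says the rate is below 10 MB/s.

```python
def simulate_legacy_cycles(link_mb_s, min_cycle_s, first_transmit_mb, workloads_mb_s):
    """Return (cycle, cycle_time_s, transmit_mb, capture_mb) for each cycle.

    workloads_mb_s[i] is the host write rate during cycle i + 1.
    """
    rows = []
    transmit_mb = first_transmit_mb
    for cycle, workload in enumerate(workloads_mb_s, start=1):
        # A cycle lasts until the transmit set has crossed the link,
        # but never less than the minimum cycle time.
        cycle_s = max(min_cycle_s, transmit_mb / link_mb_s)
        capture_mb = workload * cycle_s
        rows.append((cycle, cycle_s, transmit_mb, capture_mb))
        transmit_mb = capture_mb  # the capture set becomes the next transmit set
    return rows


# 20 MB/s for cycles 1 to 5 (the peak), then an assumed 5 MB/s afterwards.
rows = simulate_legacy_cycles(10, 30, 300, [20] * 5 + [5] * 3)
for cycle, cycle_s, transmit_mb, capture_mb in rows:
    print(f"cycle {cycle}: {cycle_s / 60:5.1f} min, "
          f"transmit {transmit_mb / 1000:4.1f} GB, capture {capture_mb / 1000:4.1f} GB")
```

The cycle times come out as 0.5, 1, 2, 4, 8, 16, 8 and 4 minutes, with the longest cycle landing after the peak has ended. The closer the post-peak write rate is to 10 MB/s, the closer the cache in use during cycle 6 gets to the 19.2 GB upper bound.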

The best practice is to spread the DSE pool devices across as many disks as possible. Do not
underestimate the bandwidth and throughput demands that DSE places on the disks. When DSE
is active, you could have one random write and one random read for every host write. Consider
an example where DSE pool devices with RAID 1 protection are used. When DSE is active, a host
write load of 1500 writes/sec will result in 3000 back-end writes/second and 1500 reads/second
to the DSE pool. Spread the DSE pool across as many disks as possible.

RAID 1 is the recommended protection type for DSE pool devices. RAID 5 is another option, but it
will consume more back-end resources. Do not use dedicated disks for the DSE pool. Use only one
hyper from each disk for the DSE pool. The other hypers can be used for any other purpose.

EMC recommends the use of 15K RPM FC disks for DSE. No more than 50% of the performance of
the disks should be DSE related. This is about 75 I/Os per second. Based on this, one can use the
simple formula listed below to compute the number of disks required for DSE.

RAID 1: # of disks for DSE = (host writes/s * 3) / 75. Example: (1500 * 3) / 75 = 60 disks

RAID 5: # of disks for DSE = (host writes/s * 5) / 75. Example: (1500 * 5) / 75 = 100 disks
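
The two formulas can be wrapped in a small helper for quick what-if checks. The multipliers 3 and 5 and the 75 I/Os per second budget come from the text; the function name, the dictionary, and rounding up to a whole disk are our additions.

```python
import math

# Back-end I/Os generated per host write while DSE is active, per the formulas above.
BACKEND_IOS_PER_HOST_WRITE = {"RAID1": 3, "RAID5": 5}


def dse_disks(host_writes_per_s, raid, dse_iops_per_disk=75):
    """Estimate the number of disks needed to carry the DSE back-end load."""
    backend_iops = host_writes_per_s * BACKEND_IOS_PER_HOST_WRITE[raid]
    return math.ceil(backend_iops / dse_iops_per_disk)  # round up to whole disks


print("RAID 1:", dse_disks(1500, "RAID1"), "disks")
print("RAID 5:", dse_disks(1500, "RAID5"), "disks")
```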

The size of the DSE pool depends on:

1. The average write rate when DSE is active.

2. The length of time DSE is active.

A simple formula for calculating DSE pool size is:

DSE pool size = write rate (writes/s) * length of time SRDF/A will be paged into the DSE pool (s) * 64 KB

Example: If the write rate is 3000 writes/second and you want to ride over a 1 hour link outage,
the size of the pool is 3000*3600*64KB = 659 GB. (Note: This formula assumes that the I/Os are
smaller than 64 KB. For I/Os larger than 64 KB, adjust accordingly. In the above example, if each
write was 128 KB, the size would be 3000*3600*128 KB).

This size is before protection: if the protection is mirrored, you need to double the disk space. In
the above example, you will need 659 GB x 2 = 1318 GB of disk space. This assumes that there
are no rewrites, but it is better to be safe than sorry in this case. Spread the pool over as many
disks as possible. Based on the simple formula discussed in the previous slide, 3000 writes/second
works out to (3000*3)/75 = 120 disks for RAID 1.
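
The pool size formula is sketched below in Python. The 659 GB figure in the example is reproduced when KB and GB are read as binary units (1 GB = 1024 * 1024 KB), so the sketch makes that assumption explicit; the function and parameter names are ours.

```python
def dse_pool_size_gb(writes_per_s, outage_s, kb_per_write=64, mirrored=False):
    """Estimate DSE pool size in GB, assuming no rewrites and binary units."""
    size_kb = writes_per_s * outage_s * kb_per_write
    size_gb = size_kb / (1024 * 1024)
    return size_gb * 2 if mirrored else size_gb  # mirrored protection doubles the space


one_hour = 3600
print(int(dse_pool_size_gb(3000, one_hour)), "GB before protection")
print(int(dse_pool_size_gb(3000, one_hour, mirrored=True)), "GB with mirrored protection")
print(int(dse_pool_size_gb(3000, one_hour, kb_per_write=128)), "GB for 128 KB writes")
```

For writes smaller than 64 KB, leave `kb_per_write` at its default, which matches the formula's assumption that each write consumes 64 KB.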

SymmMerge and BCSD can also be used to estimate the size of the DSE pool.

The simple formulas discussed in this and the previous slide are taken from the SPEED Bits & Bytes
article “Best Practices for SRDF/A Delta Set Extension.”

BCSD would be typically used in a pre-sales situation when VMAX/VMAX3 data is unavailable.
When VMAX/VMAX3 Data is available, we should use SymmMerge to validate the SRDF/A
configuration. In the event there is no EMC Array in place and we need to use host data, we
should model the solution with BCSD. The SRDF/A calculations in BCSD are done with the same
DLLs that SymmMerge uses.

At the present time, BCSD addresses the key risks listed above.

This lesson covered SRDF/A design considerations and modeling tools.

This lesson covers the use of BCSD for designing a single session SRDF/A solution.

We will walk through an SRDF/A sizing effort using BCSD. The customer requirements are listed
on the slide. VMAX Sizer was used to size the appropriate VMAX3 array based on the BTP data.
VMAX Sizer has recommended a VMAX 200K 3 Engine configuration with 3 TB of cache. Sizer has
also recommended eight 1 Gb GbE ports for SRDF.

Here is the recommended Sizer configuration.

The proposed BCSD modeling configuration is shown. These elements will be built in BCSD.

Please note that we are not showing all the steps to create the BCSD model. Building the BCSD model
for SRDF/A is very similar to building an SRDF/S model. Only the screens relevant to SRDF/A are
shown here.

The array settings for the local array are shown here.

This is a continuation of the Local Array Settings. In the Replication Group panel, the Rep Flag
should be set to Source for the R1 array. We have specified that all 8 of the RA ports will be used
by this replication group. To model for DSE and Transmit Idle, make sure that the Delta Set Ext
box is checked. By default, BCSD assumes a Transmit Idle Time of 150 seconds, so BCSD will
model a link outage of 150 seconds and compute the resulting cache and DSE impact. In
the SRDF/A information panel, one has to specify the Host Type (open systems in this case) and
the total logical volume count in the array. The logical volume count is used to estimate the Write
Pending limits.

The remote array settings that are different from the local array are shown here. The key
difference is the fact that the Rep Flag should be set to Target for the R2 array. All the other
settings for the Remote array should be similar to the source array.

Both the local and remote switches are VMAX3 GbE switches. The outbound ports are Ethernet.
When using GbE, the “Qty” field should equal the number of GbE ports that will be used. We have
8 GbE ports on each of the arrays so we specify the “Qty” as 8. For the VMAX3 GbE switches the
inbound port type is set to None and the outbound port type is set to the appropriate Ethernet
speed. In this case we are using 1 Gb.

Based on the Customer’s information, the network circuit is an Ethernet circuit with a total of 300
MB/s of bandwidth. This is an Asynchronous solution, so the Synchronous box is left unchecked.

The physical relationship tells the model how the various components are physically connected. In
this example, Site Local is connected to Site Remote.

Local Array is connected to Remote Array.

The Rep Port Group “Local RA Port Group” on Local Array is connected to the “Remote RA Port
Group” Rep Port Group on Remote Array.

The Local Port Group is connected to the switch “Local Switch”, while the Remote Port Group is
connected to the switch “Remote Switch”.

The switch groups are connected together with the “Asynch Network Circuit” Network Circuit.

The logical relationship is used to specify the type of modeling that is to be done, in this case
SRDF/A. In addition, the logical relationship is used to specify the Physical Relationship and the
Replication Groups that should be modeled. For SRDF/A we also specify the desired minimum
cycle time and an SRDF/A Write Load Adjustment Factor. The default factor is 1.5x.

The Write Adjustment setting helps us account for any burstiness in the workload. Typically, data
collection is done at a 10-minute interval. Any peaks or bursts in the workload within this
10-minute interval are smoothed out and not seen in the collection. To account for peaks and
burstiness, the recommendation is to use the Write Adjustment setting. Performance engineering
recommends using a value of 1.5x, but if you want to be even more conservative, you can use a
value of 2x. This is done so that we get a better estimate of the cache and DSE paging space
required.

We have loaded the BTP data and then we have mapped the BTP Data along with the volume list
of devices to the Replication Group. The steps are not shown here. We covered this in the SRDF/S
module.

Run the model.

The Modeling Properties tab of the Logical Relationship shows a summary of the workload. For
this case we can see that the Max Writes/s is about 19,000 and that the maximum raw
throughput that would be required is about 710 MB/s. We can also see that the RA utilization
is about 56%. The SRDF/A RPO is about 15 minutes, which is well above the customer requirement
of 5 minutes.

The RPO is almost 15 min because during certain times of the day we do not have adequate
bandwidth. Thus to meet the 5 min RPO we will have to provide more bandwidth.

We re-ran the model with more bandwidth after editing the Network Circuit Configuration to 450
MB/s. The model now shows that the RPO is less than 5 minutes.

The RA utilization is less than 50% most of the time. The RAs are sufficient for this configuration.

The Recovery Point Objective (RPO) is the difference in time between primary and remote
volumes in seconds. In the event of a disaster, this chart shows the amount of data loss that
would occur (depending on the time of day that the disaster occurs). In order to drive a
consistently low RPO value, the supplied bandwidth should always exceed the data change rate.
When the data change rate exceeds the available bandwidth, the SRDF/A process on the source
array will absorb the additional demand into cache, until the data can be de-staged across the
link.
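
The relationship between change rate, bandwidth, and the data held in cache can be sketched as a running backlog. This is a deliberately simple model with made-up change rates sampled once a minute; it is not how BCSD computes its charts, and it ignores write folding and minimum cycle time.

```python
def backlog_over_time(change_rate_mb_s, bandwidth_mb_s, interval_s):
    """Track the data waiting in source cache at the end of each interval, in MB."""
    backlog = 0.0
    history = []
    for rate in change_rate_mb_s:
        # The backlog grows when changes outpace the link and drains when they do not.
        backlog = max(0.0, backlog + (rate - bandwidth_mb_s) * interval_s)
        history.append(backlog)
    return history


# Hypothetical data change rates in MB/s, one sample per 60 seconds.
change_rate = [200, 500, 600, 300, 100]
for bandwidth in (300, 450):
    print(f"{bandwidth} MB/s link:", backlog_over_time(change_rate, bandwidth, 60))
```

With the 300 MB/s link the backlog is still large at the end of the sample window, while the 450 MB/s link drains it completely. More waiting data means the remote side lags further behind.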

The Cache chart shows the amount of cache required for SRDF/A, which is in addition to the cache
required for this array’s storage configuration. The Paging Disk Space chart shows the amount of
disk space required for SRDF/A to page to when DSE is enabled. Be careful not to undersize
the cache: too little cache will cause thrashing, severely impact RPOs, and could degrade
performance for part of or even the entire array.

The SRDF/A risks are tabulated for the Physical and Logical relationships. The slide shows the Risk
Tables for the model we just ran.

For VMAX3 arrays BCSD cannot properly estimate the Inter-Array Drive Config and Inter-Array
RAID Configuration risks for the logical relationship because BCSD does not have the ability to
account for Storage Resource Pools. So ignore the SRDF/A risk table for the Logical Relationship
when modeling VMAX3 arrays. In this model we did not define the drive or RAID configurations.

Problems found during modeling need to be addressed before the solution can be implemented.

This demo covers the modeling of a single session SRDF/A solution with BCSD.

This lesson covered the use of BCSD for designing a single session SRDF/A solution.

This lesson covers designing a SRDF/A MSC solution with BCSD.

Both BCSD and SymmMerge can be used to model an SRDF/A MSC solution.

Now let us walk through a sample sizing effort for SRDF/A MSC using BCSD.

The steps to create the various BCSD components for this MSC model are not shown. In this
example we have two sites, two arrays at each site. The application spans the two arrays at the
local site and is replicated with SRDF/A MSC. The desired RPO for this solution is 5 minutes.

All the array configurations are similar; the Local Array 1 configuration is shown. We have assumed
that all the arrays will be 2-engine VMAX 200K arrays with 2 TB of usable cache. Each array has 4 x
10 GbE ports. Each array will have one replication group which uses all four available ports.

All the switch configurations are similar; the Local Switch configuration is shown. We have assumed
that all the switches will be VMAX3 GbE switches. We have used a quantity of 8 to match the total
number of ports at each site (two arrays with four ports each). The Outbound Port Type is set to
Ethernet 10 Gb to match the 10 GbE ports on the arrays.

Here are the SRDF Physical Relationships used in the model. The first connects Local Array 1 and
Remote Array 1. The second connects Local Array 2 and Remote Array 2.

All the SRDF traffic will flow through the same Local Switch, same Network Circuit and same
Remote Switch.

Here are the SRDF Logical Relationships used in the model. The first connects Local Array 1 and
Remote Array 1. We assume that this Logical Relationship can use 75% of the available
bandwidth. The second connects Local Array 2 and Remote Array 2. We assume that this can use
the remaining 25% of the bandwidth. Because we are modeling SRDF/A, we have set the desired
cycle time to 15 seconds.

In BCSD, a Composite Group is a collection of SRDF Logical Relationships. To create a new
composite group, right-click on the Composite Groups folder and then choose New > Composite
Group. In the dialog, enter the Name, the MSC Desired Cycle Time and the desired logical
relationships. In this example, we have created a Composite Group “MSC Composite Group” which
links two logical relationships (MSC Logical Relationship 1 and MSC Logical Relationship 2) into the
MSC session. We have set the desired cycle time to 15 seconds.
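
The inputs to this composite group can be written down as a small data model, which makes it easy to sanity-check them before running the model. The sketch below is illustrative only: the class names, field names and the two checks are assumptions made for this example and are not part of BCSD.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalRelationship:
    """Hypothetical stand-in for a BCSD SRDF Logical Relationship."""
    name: str
    bandwidth_share: float       # fraction of the shared circuit, 0.0 to 1.0
    desired_cycle_time_s: int


@dataclass
class CompositeGroup:
    """Hypothetical stand-in for a BCSD Composite Group (one MSC session)."""
    name: str
    msc_desired_cycle_time_s: int
    relationships: list = field(default_factory=list)

    def validate(self) -> list:
        """Return a list of problems found; an empty list means none."""
        problems = []
        total = sum(r.bandwidth_share for r in self.relationships)
        if total > 1.0 + 1e-9:
            problems.append(f"bandwidth shares add up to {total:.0%}")
        for r in self.relationships:
            # Assumed check: members use the same cycle time as the MSC session.
            if r.desired_cycle_time_s != self.msc_desired_cycle_time_s:
                problems.append(f"{r.name}: cycle time differs from the MSC cycle time")
        return problems


group = CompositeGroup(
    name="MSC Composite Group",
    msc_desired_cycle_time_s=15,
    relationships=[
        LogicalRelationship("MSC Logical Relationship 1", 0.75, 15),
        LogicalRelationship("MSC Logical Relationship 2", 0.25, 15),
    ],
)
print(group.validate())  # an empty list for the values used in this example
```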

In this example, our MSC session has two Replication Groups, so the appropriate performance
data must be loaded and mapped to each of the replication groups. The steps are not shown here.
We are ready to run the model now.

The model has been run. We first look at the Modeling Properties tab of both logical relationships
to see a summary of the results. We can see that the SRDF/A Maximum Cycle Time and RPO
are identical for both logical relationships. This is because we have coupled the two via a
Composite Group. The RA Utilization is acceptable for both logical relationships.

The required bandwidth, cache and paging disk space are also shown.

The RPO charts are identical for both logical relationships because they are coupled
together. The RPO is under 5 minutes, so the desired RPO is met.

The SRDF/A cache requirements are shown for both logical relationships.

The compressed throughput chart indicates that the 75/25 split that we had set earlier is
reasonable.

We can display the Composite Compressed Throughput chart from the Network Circuit for this
model. This allows us to see the bandwidth requirements for the two SRDF/A sessions managed
with MSC. The total available bandwidth is around 460 MB/s after accounting for all the overhead.
The peak compressed throughput required for both the sessions is about 353 MB/s.
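
The two figures above give a quick view of how much of the circuit is left at peak. The sketch below works through that arithmetic. The per-relationship allocation lines assume that the 75/25 split is applied to the net available bandwidth, which the slide does not state explicitly, and the variable names are illustrative.

```python
available_mb_s = 460.0       # total available bandwidth after overhead
peak_required_mb_s = 353.0   # peak compressed throughput for both sessions

headroom_mb_s = available_mb_s - peak_required_mb_s
peak_utilization = peak_required_mb_s / available_mb_s

# Assumption: the 75/25 split applies to the net available bandwidth.
share_1_mb_s = 0.75 * available_mb_s
share_2_mb_s = 0.25 * available_mb_s

print(f"headroom at peak: {headroom_mb_s:.0f} MB/s")      # 107 MB/s
print(f"peak utilization: {peak_utilization:.1%}")        # 76.7%
print(f"allocations: {share_1_mb_s:.0f} / {share_2_mb_s:.0f} MB/s")  # 345 / 115 MB/s
```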

This demo covers modeling an SRDF/A MSC solution with BCSD.

This lesson covered designing an SRDF/A MSC solution with BCSD.

This lesson covers the use of SymmMerge for designing SRDF/A solutions. As noted in earlier
parts of this course, SymmMerge is a tool available for SPEED certified members only.

For environments with existing VMAX systems, we can use SymmMerge to model SRDF/A. Let’s
walk through an example of using SymmMerge for designing an SRDF/A solution. In this example
the BTP and BIN files for an existing VMAX 20K are available. We want to model the SRDF/A
requirements for one of the RDF groups on a new VMAX3 array. The SymmMerge SRDF/A calculator
will be used.

The first thing we do is to load the BIN file into SymmMerge. With the BIN file loaded, we can
then load our BTP data.

Make sure to check the “Account for SRDF” box while importing the BIN file. Next choose the BTP
file. Click on Next.

Click on Finish on the Utilization Settings screen (not shown). The BIN and BTP files will then be
loaded into SymmMerge.

The SRDF/A Calculator in SymmMerge offers various analysis options:
• All Sym workload V3
– Simulation for VMAX3 Systems. Ignores all RDF group definitions and assumes that all
the volumes participate in the SRDF/A workload. Accounts for DSE and Transmit Idle.
• Group Workload V3
– Simulation for VMAX3 Systems. Uses the volumes defined per RDF group and takes DSE
and Transmit Idle into account.
• All Sym Workload + DSE (72) & Group workload + DSE (72):
– Simulation for pre-VMAX3 arrays. Accounts for DSE and Transmit Idle.
• All Symm Workload (71) & Group Workload (71):
– Simulation for pre-VMAX3 arrays. Does not account for DSE.
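
The option descriptions above can be read as a small decision table. The function below is a hypothetical helper written for this example, not something SymmMerge provides. It assumes, based on the option names, that the "Group" variants for pre-VMAX3 arrays use the RDF group definitions in the same way as Group Workload V3.

```python
def choose_analysis_option(target_is_vmax3: bool, per_group: bool,
                           account_for_dse: bool = True) -> str:
    """Return the SRDF/A Calculator option label that fits the inputs."""
    if target_is_vmax3:
        # Both V3 options account for DSE and Transmit Idle.
        return "Group Workload V3" if per_group else "All Sym workload V3"
    if account_for_dse:
        return "Group workload + DSE (72)" if per_group else "All Sym Workload + DSE (72)"
    return "Group Workload (71)" if per_group else "All Symm Workload (71)"


# Modeling one RDF group on a new VMAX3 array, as in the worked example.
print(choose_analysis_option(target_is_vmax3=True, per_group=True))
```

For the worked example, which models a single RDF group on a new VMAX3 array, this mapping points to Group Workload V3.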

SymmMerge also allows the simulation of a Link Down event, to help compute DSE space.

In the example shown on the slide, the BIN file has 4 RDF groups defined.

Click on the Calculate button to determine the required bandwidth, cache and paging space. The
calculator will display view graphs. The SRDF/A input panel is in the Component list tab.

The Component list tab will show the “Input Information” for the SRDF/A calculator. The boxes in
white can be changed, while the gray boxes cannot be changed.

The Session box is used for MSC. Tie the sessions together by setting the same session ID. The
minimum cycle time defaults to 30 seconds, but can be changed to the desired value.

The Transmit Idle time defaults to ¼ of the data collection interval. The Transmit Idle time in the
SRDF/A calculator represents the length of time one wishes to keep the delta sets in cache before
spilling over to DSE.
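
As a quick illustration of the Transmit Idle default, the sketch below takes one quarter of the data collection interval. The 5-minute interval is the one used in the link-outage example in this lesson, and the function name is made up for this sketch.

```python
def default_transmit_idle_s(collection_interval_s: float) -> float:
    """Default Transmit Idle time: one quarter of the data collection interval."""
    return collection_interval_s / 4


# Data collected at 5-minute intervals.
print(default_transmit_idle_s(5 * 60))  # 75.0 seconds
```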

The link throughput defaults to 0. If left at 0, the SRDF/A calculator will estimate the required link
throughput based on the peak workload. Typically this should be set to the available link
throughput.

Write Adjustments allows one to account for burstiness in the workload. This is exactly the same
as the SRDF/A Write Load Adjustment Factor in BCSD. It defaults to None; 1.5x is recommended.

Is Pessimistic: checking this box ignores any locality of reference in the workload and is a more
conservative approach. Un-checking this box will account for locality of reference.

Make the desired changes to the input and then click on Calculate to see the results.

The SRDF/A calculator calculates the Total Cache, Cycle Time, RPO and Paging Disk Space based
on the inputs. If the Link throughput is left at the default value of 0, SymmMerge determines the
required Link Throughput. The Transmit Idle Time in Cache is initially assumed to be ¼ the
collection interval. The assumption is that if the links fail, then transmit idle will keep SRDF/A
sessions running by buffering in cache up to a particular point and then use disk buffering with
DSE. Based on the Link Throughput, the Transmit Idle time in Cache and the workload, the
amount of cache and DSE paging space required for SRDF/A is computed.

In our example, a link throughput of about 120 MB/s gives a 30-second cycle time (60-second RPO)
throughout the day. The SRDF/A total cache requirement to support this is 16.45 GB. With this
throughput, the currently recommended DSE space is 22.98 GB. However, the customer has only
100 MB/s of link throughput available.
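
The relationship between cycle time and RPO quoted above, and the gap between the estimated and the available link throughput, can be sketched as follows. The sketch assumes the RPO is taken as two cycle times, which matches the 30-second and 60-second figures on this slide. The names are illustrative.

```python
def rpo_from_cycle_time(cycle_time_s: float) -> float:
    # Assumption: RPO is two cycle times (30 s cycle time -> 60 s RPO).
    return 2 * cycle_time_s


estimated_link_mb_s = 120.0   # throughput estimated by the calculator
available_link_mb_s = 100.0   # throughput the customer actually has
shortfall_mb_s = estimated_link_mb_s - available_link_mb_s

print(rpo_from_cycle_time(30))  # 60
print(shortfall_mb_s)           # 20.0
```

The 20 MB/s shortfall is why the design is re-run later with the actual link throughput.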

We should also account for any burstiness in the workload. We can do this by using the Write
Adjustment input value.

A write adjustment of 1.5x increases the required total cache to 24.7 GB; the recommended
paging space is 44.3 GB. We will finalize the design by changing the Link Throughput to the actual
available value, and will keep the write adjustment at 1.5x.
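
The cache figure above is consistent with scaling the unadjusted requirement by 1.5. The sketch below only checks that arithmetic against the numbers on the slides. It is not a description of how SymmMerge computes its results, and the paging space clearly does not follow the same simple scaling.

```python
write_adjustment = 1.5

base_cache_gb = 16.45    # total cache without a write adjustment
base_paging_gb = 22.98   # recommended DSE space without a write adjustment

scaled_cache_gb = base_cache_gb * write_adjustment
scaled_paging_gb = base_paging_gb * write_adjustment

print(f"{scaled_cache_gb:.1f} GB")   # 24.7 GB, matching the calculator result
print(f"{scaled_paging_gb:.1f} GB")  # 34.5 GB, below the 44.3 GB the calculator reports
```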

We finalize the design by running the SRDF/A calculator with the available bandwidth of 100
MB/s. The write adjustment is still at 1.5x. The final design requires 24.7 GB of total cache and
47.47 GB of paging space. The RPO stays below 300 seconds, so 100 MB/s of bandwidth is adequate.

SymmMerge can also simulate a link outage and compute the required DSE pool size and RPO
during the outage. We will see this in the next slide.

The Link Down Length is specified as a multiple of the data collection Time Stamp interval. In this
case the data was collected at 5-minute intervals, so 6 time stamps equal 30 minutes. The SRDF/A
calculator indicates that we would require paging space of 104 GB during the outage and that the
maximum RPO due to the link outage would be 1024 seconds.
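
The conversion from time stamps to outage duration is simple arithmetic, sketched below with the values from this example. The function name is made up for this sketch, and the last line only restates the reported maximum RPO in minutes.

```python
def link_down_minutes(time_stamps: int, interval_min: float) -> float:
    """Outage length implied by a Link Down Length given in time stamps."""
    return time_stamps * interval_min


outage_min = link_down_minutes(time_stamps=6, interval_min=5)
max_rpo_min = 1024 / 60  # maximum RPO reported by the calculator, in minutes

print(outage_min)            # 30
print(f"{max_rpo_min:.1f}")  # 17.1
```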

This lesson covered the use of SymmMerge for designing SRDF/A solutions.

This module covered the factors that impact SRDF/A design. BCSD and SymmMerge were used to
design SRDF/A solutions.

This course provided a thorough exposure to the procedures, methodologies, and best practices
for designing VMAX3 solutions. Use of the VMAX Sizer and the BCSD tools was presented for
various use cases. All concepts were reinforced with demonstrations.

Copyright 2015 EMC Corporation. All rights reserved. VMAX3 Solutions Design Course Summary 79
This concludes the training. Thank you for your participation.
