VMworld 2013: DRS: New Features, Best Practices and Future Directions
The document discusses new features and future directions for VMware Distributed Resource Scheduler (DRS). Key points include:
1) DRS 5.5 introduces features like automatically tuning the number of VMs per host and better handling of latency-sensitive and CPU-intensive workloads.
2) DRS is integrated with new storage technologies like VMware vFlash and vSAN. It also supports autoscaling of proxy switch ports.
3) Future areas of focus include network DRS with bandwidth reservations, more accurate static VM overhead memory estimation, and proactive DRS monitoring for potential issues.
1. DRS: New Features, Best Practices and Future Directions
Aashish Parikh, VMware
VSVC5280
#VSVC5280
2. Disclaimer
This session may contain product features that are
currently under development.
This session/overview of the new technology represents
no commitment from VMware to deliver these features in
any generally available product.
Features are subject to change, and must not be included in
contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features
discussed or presented have not been determined.
3. Talk Outline
Advanced Resource Management concepts
• VM Happiness and Load-balancing
New features in vSphere 5.5
• Going after harder problems, tackling specific pain-points
Integrating with new Storage and Networking features
• Playing nicely with vFlash, vSAN and autoscaling proxy switch ports
Future Directions
• What’s cooking in DRS labs!
4. Related Material
Other DRS talks at VMworld 2013
• VSVC5821 - Performance and Capacity management of DRS clusters
(3:30 pm, Monday)
• STO5636 - Storage DRS: Deep Dive and Best Practices to Suit Your Storage
Environments (4 pm, Monday & 12:30 pm Tuesday)
• VSVC5364 - Storage IO Control: Concepts, Configuration and Best Practices to Tame
Different Storage Architectures (8:30 am, Wed. & 11 am, Thursday)
From VMworld 2012
• VSP2825 - DRS: Advanced Concepts, Best Practices and Future Directions
• Video: http://www.youtube.com/watch?v=lwoVGB68B9Y
• Slides: http://download3.vmware.com/vmworld/2012/top10/vsp2825.pdf
VMware Technical Journal publications
• VMware Distributed Resource Management: Design, Implementation, and
Lessons Learned
• Storage DRS: Automated Management of Storage Devices In a Virtualized Datacenter
6. The Giant Host Abstraction
DRS treats the cluster as one giant host
• Capacity of this giant “host” = capacity of the cluster
Example: 6 hosts, each with 10 GHz CPU and 64 GB memory, appear as one “giant host” with 60 GHz CPU and 384 GB memory
7. Why Is That Hard?
Main issue: fragmentation of resources across hosts
Primary goal: Keep VMs happy by meeting their resource demands!
Example: for the 60 GHz / 384 GB “giant host”, total VM demand under 60 GHz means all VMs run happily
8. Why Meet VM Demand as the Primary Goal?
VM demand satisfied → VM or application happiness
Why is this not met by default?
• Host level overload: VM demands may not be met on the current host
Three ways to find more capacity in your cluster
• Reclaim resources from VMs on current host
• Migrate VMs to a powered-on host with free capacity
• Power on a host (exit DPM standby mode) and then migrate VMs to it
Note: demand ≥ current utilization
9. Why Not Load-balance the DRS Cluster as the Primary Goal?
Load-balancing is not free
• When VM demand is low, load-balancing may have some cost but no benefit
Load-balancing is a mechanism used to meet VM demands
Consider these 4 hosts with almost-idle VMs
Note: all VMs are getting what they need
Q: Should we balance VMs across hosts?
10. Important Metrics and UI Controls
CPU and memory entitlement
• The value a VM deserves, based on its demand, resource settings and capacity
Load is not balanced but all VMs are happy!
11. DRS Load-balancing: The Balls and Bins Problem
Problem: assign n balls to m bins
• Balls could have different sizes
• Bins could have different sizes
Key Challenges
• Dynamic numbers and sizes of balls/bins
• Constraints on co-location, placement and others
Now, what is a fair distribution of balls among bins?
DRS load-balancing
• VM resource entitlements are the ‘balls’
• Host resource capacities are the ‘bins’
• Dynamic load, dynamic capacity
• Various constraints: DRS-Disabled-On-VM, VM-to-Host affinity, …
12. Goals of DRS Load-balancing
Fairly distribute VM demand among hosts in a DRS cluster
Enforce constraints
• Recommend mandatory moves
Recommend moves that significantly reduce imbalance
Recommend moves with long-term benefit
13. How Is Imbalance Measured?
Imbalance metric is a cluster-level metric
For each host, we compute a normalized entitlement: the sum of the entitlements of the VMs on that host, divided by the host’s capacity
Imbalance metric is the standard deviation of these normalized entitlements
Imbalance metric = 0 → perfect balance
How much imbalance can you tolerate?
Example: three hosts with normalized entitlements of 0.65, 0.1 and 0.25 → imbalance metric = 0.2843 (reproduced in the sketch below)
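A minimal sketch (plain Python, not VMware's implementation) of the computation above; the per-host VM entitlements are hypothetical, and sample standard deviation is assumed because it reproduces the 0.2843 figure.

```python
# Minimal sketch (not VMware code) of the cluster imbalance metric described above.
import statistics

def normalized_entitlement(vm_entitlements_ghz, host_capacity_ghz):
    """Per-host normalized entitlement: sum of VM entitlements divided by host capacity."""
    return sum(vm_entitlements_ghz) / host_capacity_ghz

# Hypothetical host: VMs entitled to 2, 3 and 1.5 GHz on a 10 GHz host -> 0.65
print(normalized_entitlement([2.0, 3.0, 1.5], 10.0))      # 0.65

# The slide's example: normalized entitlements of 0.65, 0.1 and 0.25 across three hosts
print(round(statistics.stdev([0.65, 0.10, 0.25]), 4))     # 0.2843
```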
14. The Myth of Target Balance
UI slider tells us what star threshold is acceptable
Implicit target number for cluster imbalance metric
• n-star threshold → tolerance of imbalance up to n * 10% in a 2-node cluster
Constraints can make it hard to meet target balance
Meeting aggressive target balance may require many migrations
• DRS will recommend them only if they also make VMs happier
If all VMs are happy, a little imbalance is not really a bad thing!
15. Key Takeaway Points
The Goals of DRS
• Provide the giant host abstraction
• Meet VM demand and keep applications at high performance
• Use load balancing as a secondary metric while making VMs happy
DRS recommends moves that reduce cluster imbalance
Constraints can impact achievable imbalance metric
If all VMs are happy, a little imbalance is not really a bad thing!
16. New Features in vSphere 5.5
Going after harder problems, tackling specific pain-points
17. Restrict #VMs per Host
Consider a perfectly balanced cluster w.r.t. CPU, Memory
Bigger host is ideal for new incoming VMs
However, spreading VMs around is important to some of you!
• In vSphere 5.1, we introduced a new option
LimitVMsPerESXHost = 6
• DRS will not admit or migrate more than 6 VMs to any Host
This number must be managed manually – not really the DRS way!
18. AutoTune #VMs per Host
*NEW* in vSphere 5.5: LimitVMsPerESXHostPercent
Example (4 hosts): currently 20 VMs, 12 new VMs incoming, 32 VMs in total (worked through in the sketch below)
VMs per Host limit automatically set to: Mean + (Buffer% * Mean)
• Mean (VMs per host, including new VMs): 8
• Buffer% (value of new option): 50
• Therefore, new limit automatically set to: 8 + (50% * 8) = 12
Use for new operations, not to correct current state
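A minimal sketch (plain Python, not VMware code) of the auto-tuned limit; the formula and example numbers are from this slide.

```python
# Minimal sketch (not VMware code) of the LimitVMsPerESXHostPercent calculation:
# limit = mean VMs per host (counting incoming VMs) + buffer% of that mean.
def autotuned_vm_limit(existing_vms, incoming_vms, num_hosts, buffer_pct):
    mean = (existing_vms + incoming_vms) / num_hosts
    return mean + (buffer_pct / 100.0) * mean

# The slide's example: 20 existing + 12 incoming VMs across 4 hosts, 50% buffer -> 12
print(autotuned_vm_limit(20, 12, 4, 50))   # 12.0
```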
19. Latency-sensitive VMs and DRS
Several efforts to help latency-sensitive apps
• VSVC5596 - Extreme Performance Series: Network Speed Ahead
• VSVC5187 - Silent Killer: How Latency Destroys Performance...And What to
Do About It
DRS recognizes VMs marked as latency-sensitive
DRS initial placement “just works”
DRS load-balancing treats VMs as if soft-affine to current hosts
• Latency-sensitive workloads may be sensitive to vMotions
• Soft affinity is internal, no visible new rule in the UI
VMs will not be migrated unless absolutely necessary
• To fix otherwise unfixable host overutilization
• To fix otherwise unfixable soft/hard rule-violation
20. CPU Ready Time: Overview
Ready time: The time a virtual CPU (vCPU) waits in a ready-to-run
state before it can be scheduled on a physical CPU (pCPU)
Many causes for high ready time; many fixes; many red-herrings
Active area of investigation and research, chime in! We need help!
• Give us more problem details, VC and ESX support files
• www.yellow-bricks.com/2013/05/09/drs-not-taking-cpu-ready-time-in-to-account-need-your-help
[State diagram: NEW → (admitted) → READY → (dispatch) → RUNNING → (exit) → TERMINATED; RUNNING → (interrupt) → READY; RUNNING → (I/O or event wait) → WAITING → (I/O or event complete) → READY]
21. CPU Ready Time: Common “Problems” and Solutions
“%RDY is 80, that cannot be good”
• Cumulative number – divide by #vCPUs to get an accurate measure
• Rule-of-thumb: up to 5% per vCPU is usually alright
“Host utilization is very low, %RDY is very high”
• Host power-management settings reduce effective pCPU capacity!
• Set BIOS option to “OS control” and let ESX decide
• Low VM or RP CPU limit values restrict cycles delivered to VMs
• Increase limits (or set them back to “unlimited”, if possible)
“Host and Cluster utilization is very high, %RDY is very high”
• Well, physics..
• Add more capacity to the cluster
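As a worked example of the first point above, a tiny sketch (plain Python, hypothetical VM size) of normalizing the cumulative %RDY counter per vCPU:

```python
# Minimal sketch: esxtop reports %RDY summed over all of a VM's vCPUs, so divide by the
# vCPU count before applying the ~5% per-vCPU rule of thumb.
def per_vcpu_ready_pct(total_rdy_pct, num_vcpus):
    return total_rdy_pct / num_vcpus

# A hypothetical 16-vCPU VM showing %RDY = 80 averages 5% per vCPU -- usually acceptable
print(per_vcpu_ready_pct(80, 16))   # 5.0
```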
22. CPU Ready Time: NUMA Scheduling Effects
NUMA-scheduler can increase %RDY
vCPUs scheduled for data locality
in caches and local memory
So what’s the solution?
• There’s no real solution – there’s no real problem here!
Application performance is often better!
• Benefit of NUMA-scheduling is usually more than the hit to %RDY
For very CPU-heavy apps, disabling NUMA-rebalancer might help
• http://blogs.vmware.com/vsphere/2012/02/vspherenuma-loadbalancing.html
[Figure: two NUMA nodes (node0 and node1), each with its own local memory and cores 0-7]
23. CPU Ready Time: Better Handling in DRS
Severe load imbalance in the cluster - high #vCPU / #pCPU ratio
• DRS automatically detects and addresses severe imbalance aggressively
• Functionality added in vSphere 4.1 U3, 5.0 U2 and 5.1
CPU demand estimate too low – DRS averaging out the bursts
• 5-minute CPU average may capture CPU active too conservatively
• *NEW* in vSphere 5.5: AggressiveCPUActive = 1
• DRS Labs!: Better CPU demand estimation techniques
Other reasons for performance hit due to high %RDY values
• Heterogeneous workloads on vCPUs of a VM, correlated workloads
• DRS Labs!: NUMA-aware DRS placement and load-balancing
24. Memory Demand Estimation
DRS’ view of VM memory hierarchy
Active memory is determined by statistical sampling by ESX and often underestimates demand!
*NEW* in vSphere 5.5 – burst protection!
• PercentIdleMBInMemDemand = 25 (default)
DRS / DPM memory demand = Active + (25% * IdleConsumed)
• For aggressive demand computation, PercentIdleMBInMemDemand = 100
[Figure: VM memory hierarchy — Configured ⊇ Consumed ⊇ (Active + Idle Consumed)]
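A minimal sketch (plain Python, not VMware code) of the burst-protection formula above; the active and idle-consumed figures are hypothetical.

```python
# Minimal sketch (not VMware code) of the vSphere 5.5 burst-protection demand model:
# DRS/DPM memory demand = active + (PercentIdleMBInMemDemand% of idle consumed memory).
def drs_memory_demand_mb(active_mb, idle_consumed_mb, percent_idle_in_demand=25):
    return active_mb + (percent_idle_in_demand / 100.0) * idle_consumed_mb

# Hypothetical VM: 2 GB active, 8 GB idle consumed
print(drs_memory_demand_mb(2048, 8192))        # 4096.0 with the default of 25
print(drs_memory_demand_mb(2048, 8192, 100))   # 10240.0 -- aggressive, counts all idle memory
```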
25. Key Takeaway Points
Number of VMs per host can now be managed with ease
• Use this wisely, VM happiness and load-balance can be adversely affected
DRS handles latency-sensitive VMs effectively
Better handling of CPU ready time
• *NEW* in vSphere 5.5: AggressiveCPUActive
• We need your help – precise problem descriptions and dump files
Better handling of memory demand – burst protection
• *NEW* in vSphere 5.5: PercentIdleMBInMemDemand
26. Integrating with New Storage and Networking Features
Playing nicely with vFlash, vSAN, autoscaling proxy switch ports
27. VMware vFlash: Overview
VMware vFlash: new host-side caching solution in vSphere 5.5
vFlash reservations statically defined per VMDK
Cache is write-through (i.e. used as read cache only)
Cache blocks copied to destination host during vMotion by default
[Figure: an ESX host with a vFlash cache on local SSD, fronting vmdk1 and vmdk2 on shared storage]
28. VMware vFlash: DRS Support and Best Practices
DRS initial placement “just works”
• vFlash reservations are enforced during admission control
DRS load-balancing treats VMs as if soft-affine to current hosts
VMs will not be migrated unless absolutely necessary
• To fix otherwise unfixable host overutilization or soft/hard rule-violation
Best practices:
• Use vFlash reservations judiciously
• Mix up VMs that need and do not need vFlash in a single cluster
29. VMware vFlash: Things to Know
Migration time is proportional to cache allocation
No need to re-warm on destination
vMotions may take longer
Host-maintenance mode operations may take longer
vFlash space can get fragmented across the cluster – no defrag. mechanism
Technical deep-dive
• STO5588 - vSphere Flash Read Cache Technical Overview
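A back-of-the-envelope sketch (plain Python, hypothetical sizes and link speed, ignoring memory dirtying and retransmission) of why vMotion time grows with the vFlash cache allocation:

```python
# Rough estimate only: with write-through vFlash, cached blocks are copied to the
# destination host by default, so the data moved by vMotion grows with the cache size.
def estimated_vmotion_seconds(vm_memory_gb, vflash_cache_gb, link_gbps=10):
    data_gb = vm_memory_gb + vflash_cache_gb
    return data_gb * 8 / link_gbps   # GB -> gigabits over the vMotion link

print(estimated_vmotion_seconds(16, 0))    # 12.8 s without a cache reservation
print(estimated_vmotion_seconds(16, 64))   # 64.0 s with a 64 GB vFlash reservation
```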
30. VMware vSAN: Interop with DRS
vSAN creates a distributed datastore using local SSDs and disks
A vSAN cluster can also have hosts without local storage
VMDK blocks are duplicated across several hosts for reliability
DRS is compatible with vSAN
vMotion may cause remote accesses until local caches are warmed
31. Autoscaling Proxy Switch Ports and DRS
DRS admission control – proxy switch ports test
• Makes sure that a host has enough ports on the proxy switch to map
• vNIC ports
• uplink ports
• vmkernel ports
In vSphere 5.1, ports per host = 4096
• Hosts will power-on no more than ~400 VMs
*NEW* in vSphere 5.5, autoscaling switch ports
• Ports per host discovered at host boot time
• Proxy switch ports automatically scaled
DRS and HA will admit more VMs!
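A minimal sketch (plain Python, not VMware code, hypothetical port counts) of the admission-control test described on this slide:

```python
# Minimal sketch (not VMware code): a VM is admitted only if the host's proxy switch
# still has enough free ports for its vNICs (uplink and vmkernel ports are already
# counted in ports_in_use).
def can_admit(vnic_ports_needed, ports_in_use, ports_per_host=4096):
    return ports_in_use + vnic_ports_needed <= ports_per_host

# Hypothetical host near the fixed vSphere 5.1 limit of 4096 ports
print(can_admit(2, 4090))   # True  -- two more vNIC ports still fit
print(can_admit(2, 4095))   # False -- admission fails; 5.5 autoscaling raises the limit
```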
32. Key Takeaway Points
DRS and vFlash play nice!
• Initial placement “just works”
• vFlash reservations are enforced during admission control
• DRS load-balancing treats VMs as if soft-affine to current hosts
DRS and vSAN are compatible with each other
Proxy switch ports on ESX hosts now autoscale upon boot
• Large hosts can now admit many more VMs
34. Network DRS: Bandwidth Reservation
Per-vNIC shares and limits have been available since vSphere 4.1
We are experimenting with per vNIC bandwidth reservations
In presence of network reservations
• DRS will respect those during placement and future vMotion recommendations
• pNIC capacity at host will be pooled together for this
What else would you like to see in this area?
Is pNIC contention an issue in your environment?
Network RPs?
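A minimal sketch (plain Python, our own illustration rather than shipping code) of the check implied above: the vNIC reservations of VMs placed on a host must fit within the host's pooled pNIC capacity.

```python
# Minimal sketch (not VMware code): respect per-vNIC bandwidth reservations during placement
# by checking them against the pooled capacity of the host's physical NICs.
def fits_on_host(existing_reservations_mbps, pnic_capacities_mbps, new_reservation_mbps):
    pooled_capacity = sum(pnic_capacities_mbps)
    reserved = sum(existing_reservations_mbps)
    return reserved + new_reservation_mbps <= pooled_capacity

# Hypothetical host with 2 x 10 GbE uplinks, 14 Gbps already reserved, placing a 2 Gbps VM
print(fits_on_host([4000, 10000], [10000, 10000], 2000))   # True
```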
35. Static VM Overhead Memory: Overview
Static overhead memory is the amount of additional memory
required to power-on or resume a VM
Various factors affect the computation
• VM config. parameters
• VMware features (FT, CBRC, etc.)
• Host config. parameters
• ESXi Build Number
This value is used for admission control!
• (Reservation + static overhead) must be admitted for a successful power-on
• Used by DRS initial placement, DRS load-balancing, manual migrations
36. Static VM Overhead Memory: Better Estimation
Better estimation of static overhead memory leads to more
consolidation during power-on!
• www.yellow-bricks.com/2013/05/06/dynamic-versus-static-overhead-memory
Some very encouraging results so far
Here’s an example of a 10 vCPU VM with memsize = 120 GB
• The current approach computes static overhead memory: 12.26 GB
• The new approach computes static overhead memory: <drumroll> 988.06 MB
37. Proactive DRS: Overview
[Figure: the DRS life-cycle (Monitor → Compute → Evaluate → Remediate → Verify), annotated with inventory, rules and constraints, available capacity, usage statistics, alarms and notifications, RP tree integrity, rule and constraint parameters, free capacity, growth rates, derived metrics, RP and rule/constraint violations, the checks “Capacity sufficient? VMs happy? Cluster balanced?”, and remediation by migrating to fix violations or increasing VM entitlement on the same host (re-divvy), on another host (migrate), or on another host (power on + migrate); alongside the Proactive DRS life-cycle (Monitor → Compute → Evaluate → Predict → Remediate)]
Motivation
• VM demand can be clipped
• Remediation can be expensive
Goals
• Predict and avoid spikes
• Make remediation cheaper
• Proactive load-balancing
• Proactive DPM
• Use predicted data for placement, evacuation
38. Proactive DRS: Predicting Demand with vC Ops
VMware vCenter Operations Manager – a.k.a. vC Ops
• Monitors VCs and their inventories
• Collects stats, does analytics and capacity planning
Monitors VC performance stats and provides dynamic thresholds
• Gray area (literally)
• Range of predicted values
Proactive DRS uses upper bound as predicted demand
• Various CPU and Memory metrics
• Not perfect: capacity planning vs demand estimation
• Work in progress, encouraging results so far
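A minimal sketch (plain Python, hypothetical numbers) of the idea above: take the upper bound of the vC Ops dynamic-threshold range as the predicted demand.

```python
# Minimal sketch (not VMware code): vC Ops publishes a dynamic-threshold range per metric;
# Proactive DRS takes the range's upper bound as the VM's predicted demand.
def predicted_demand_mhz(dynamic_threshold_range_mhz):
    lower, upper = dynamic_threshold_range_mhz
    return upper

# Hypothetical VM whose predicted CPU range for the next interval is 800-2500 MHz
print(predicted_demand_mhz((800, 2500)))   # 2500
```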
39. Proactive DRS: Fling
We wrote you a fling!
• http://www.labs.vmware.com/flings/proactive-drs
• http://flingcontest.vmware.com
Fun tool to try out basic Proactive DRS
• Connects to vC Ops, downloads dynamic thresholds, creates stats file
• Puts per-cluster “predicted.stats” files into your VCs, runs once a day
To enable Proactive DRS, simply set ProactiveDRS = 1
More details, assumptions, caveats on the fling page
41. In Summary
VM Happiness is the key goal of DRS
Load balancing keeps VMs happy and resource utilization fair
New features in vSphere 5.5
• Automatic management of #VMs per ESX host, Latency-sensitive VM support
• Better ready-time handling, Better memory demand estimation
Interoperability with vFlash, vSAN, autoscaling proxy switch ports
Tons of cool DRS stuff in the pipeline, feedback welcome!
• Beginnings of network DRS, Better estimation of VM static overhead memory
• Proactive DRS!
47. How Are Demands Computed?
Demand indicates what a VM could consume given more resources
Demand can be higher than utilization
• Think ready time
CPU demand is a function of many CPU stats
Memory demand is computed by tracking pages in guest address
space and the percentage touched in a given interval
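A minimal sketch (plain Python, hypothetical sample, not ESX code) of the sampling idea above: scale the fraction of sampled guest pages touched during the interval up to the VM's memory size.

```python
# Minimal sketch (not ESX code): estimate active memory (demand) by sampling guest pages
# and scaling the touched fraction up to the VM's configured memory size.
def estimated_active_memory_mb(touched_flags, configured_mb):
    touched_fraction = sum(touched_flags) / len(touched_flags)
    return touched_fraction * configured_mb

# Hypothetical 8 GB VM in which 25 of 100 sampled pages were touched during the interval
sample = [True] * 25 + [False] * 75
print(estimated_active_memory_mb(sample, 8192))   # 2048.0
```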
48. Evaluating Candidate Moves to Generate Recommendations
Consider migrations from over-loaded to under-loaded hosts
• Δ = Imbalance Metric (before move) – Imbalance Metric (after move)
• ‘Δ’ is also called goodness
Constraints and resource specifications are respected
• VMs from Host2 are anti-affine with Host1
Prerequisite/dependent migrations also considered
List of good moves is filtered further
Only the best of these moves are recommended
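A minimal sketch (plain Python, hypothetical normalized entitlements, not VMware code) of the goodness score defined above:

```python
# Minimal sketch (not VMware code): goodness of a candidate move is the drop in the
# cluster imbalance metric (standard deviation of per-host normalized entitlements).
import statistics

def imbalance(normalized_entitlements):
    return statistics.stdev(normalized_entitlements)

def goodness(before, after):
    return imbalance(before) - imbalance(after)

# Hypothetical move shifting load from an overloaded host to an underloaded one
print(round(goodness([0.65, 0.10, 0.25], [0.45, 0.30, 0.25]), 4))   # 0.1802 -> balance improves
```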
49. CPU Ready Time: Final Remarks
“Host utilization is very high, %RDY is very high”
• Well, physics..
Other reasons for performance hit due to high %RDY values
• Heterogeneous workloads on vCPUs of a VM, correlated workloads
Playing with more ideas in DRS Labs!
• Better CPU demand estimation
• NUMA-aware DRS placement and load-balancing
Active area of investigation and research, chime in! We need help!
• Give us more problem details, VC and ESX support files
• www.yellow-bricks.com/2013/05/09/drs-not-taking-cpu-ready-time-in-to-account-need-your-help