SQL Server On VMware-Best Practices Guide
© 2012 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and
intellectual property laws. This product is covered by one or more patents listed at
http://www.vmware.com/download/patents.html.
VMware is a registered trademark or trademark of VMware, Inc. in the United States and/or other
jurisdictions. All other marks and names mentioned herein may be trademarks of their respective
companies.
VMware, Inc.
3401 Hillview Ave
Palo Alto, CA 94304
www.vmware.com
Contents
1. Introduction
1.1 Purpose
1.2 Target Audience
1.3 Scope
2. SQL Server Performance on vSphere
3.
4. VMware vSphere Host Best Practices for SQL Server
5. SQL Server In-Guest Best Practices
6. Ongoing Performance Monitoring and Tuning
7. vSphere Enhancements for Deployment and Operations
List of Figures
Figure 1. Scale-Up Performance in vSphere 4 Compared with Native
Figure 2. Consolidation of Multiple SQL Server Virtual Machines
Figure 3. Overcommit Fairness for Eight Virtual Machines
Figure 4. Traditional Physical Database Consolidation
Figure 5. Database Consolidation on vSphere
Figure 6. Scale-Up versus Scale-Out Approach
Figure 7. Virtual Machine Memory Settings
Figure 8. VMware Storage Virtualization Stack
Figure 9. Random Mixed (50% read/50% write) I/O Operations per Second (higher is better)
Figure 10. Sequential Read I/O Operations per Second (higher is better)
Figure 11. Read Throughput for Different I/O Block Sizes
Figure 12. Write Throughput for Different I/O Block Sizes
Figure 13. Storage Multipathing Requirements for vSphere
Figure 14. Configure EagerZeroedThick Disk in vSphere 4.x
Figure 15. Configure EagerZeroedThick Disk in vSphere 5.x
Figure 16. Optimize with Device Separation
Figure 17. Virtual Networking Concepts
Figure 18. Performance Chart Viewed with the vSphere Client
Figure 19. resxtop CPU Metrics
Figure 20. vFabric Data Director
Figure 21. vCenter Operations Manager
Figure 22. VMware vCenter Site Recovery Manager
List of Tables
Table 1. Scale-Up Performance
Table 2. Aggregate System Metrics for Eight SQL Server Virtual Machines
Table 3. SQL Server 2012 High Availability Options
Table 4. Metrics
Table 5. Overhead Memory Required for Power-On
Table 6. VMFS and Raw Disk Mapping Trade-Offs
Table 7. Typical SQL Server Disk Access Patterns
Table 8. Key SQL Counters
Table 9. Key Performance Metrics
1. Introduction
Microsoft SQL Server is one of the most widely deployed database platforms in the world, with many
organizations having dozens or even hundreds of instances deployed in their environments. The flexibility
of SQL Server, with its rich application capabilities combined with the low costs of x86 computing, has led
to a wide variety of SQL Server installations ranging from large data warehouses to small, highly
specialized departmental and application databases. The flexibility at the database layer translates
directly into application flexibility, giving end users more useful application features and ultimately
improving productivity.
Application flexibility often comes at a cost to operations. As the number of applications in the enterprise
continues to grow, an increasing number of SQL Server installations are brought under life-cycle
management. Each application has its own set of requirements for the database layer, resulting in
multiple versions, patch levels, and maintenance processes. For this reason, many application owners
insist on having an SQL Server installation dedicated to an application. As application workloads vary
greatly, many SQL Server installations are allocated more hardware than they need, while others are
starved for compute resources.
The challenge for the administrator is to provide database services to application owners with the
flexibility and autonomy they expect while keeping the infrastructure as simple and economical as
possible. The proliferation of large, multisocket, multicore servers has led many organizations to attempt
traditional database consolidation, moving small databases into large shared database environments.
Migrating to such a model can be an extremely complex endeavor, requiring in-depth application
remediation up front and rigorous attention to operational processes after implementation for
version control and continued application compatibility.
Virtualizing Microsoft SQL Server with VMware vSphere can offer the best of both worlds,
simultaneously optimizing compute resources through server consolidation and maintaining application
flexibility through role isolation. SQL Server workloads can be migrated in their current states without
expensive and error-prone application remediation, and without changing operating system or application
versions or patch levels. For high-performance databases, VMware and partners have demonstrated the
capabilities of vSphere to run the most challenging SQL Server workloads. For smaller, specialized
databases, vSphere offers high consolidation ratios and advanced resource scheduling features, giving
application owners the flexibility and performance they need while simplifying and lowering costs for the
enterprise.
In addition, SQL Server virtual machines are much easier to manage than physical servers. For example,
VMware vSphere vMotion can help to reduce the impact of business or infrastructure changes by
migrating live virtual machines to another physical server in case of hardware changes or upgrades,
without interrupting the users or their applications. VMware vSphere Distributed Resource Scheduler
(DRS) can be used to dynamically balance SQL Server workloads, and VMware vSphere High Availability
(HA) and VMware vSphere Fault Tolerance (FT) can provide simple and reliable protection for SQL
Server virtual machines. vSphere is the key to satisfying your organization's need for a rich application
environment reliant on flexible database services while simultaneously providing substantial cost savings
and unprecedented management capabilities.
1.1 Purpose
This guide provides best practice guidelines for deploying Microsoft SQL Server on vSphere. The
recommendations in this guide are not specific to any particular set of hardware or to the size and scope
of any particular SQL Server implementation. The examples and considerations in this document provide
guidance only and do not represent strict design requirements, as varying application requirements would
result in many valid configuration possibilities.
1.2 Target Audience
This guide assumes a basic knowledge and understanding of VMware vSphere and SQL Server.
Architectural staff can use this document to gain an understanding of how the system will work as a
whole as they design and implement various components.
Engineers and administrators can use this document as a catalog of technical capabilities.
DBA staff can use this document to gain an understanding of how SQL Server might fit into a virtual
infrastructure.
Management staff and process owners can use this document to help model business processes to
take advantage of the savings and operational efficiencies achieved with virtualization.
1.3 Scope
Section 2, SQL Server Performance on vSphere: This section provides background information on
SQL Server performance in a virtual machine. It also provides guidelines for conducting and
measuring internal performance tests.
Section 4, VMware vSphere Host Best Practices for SQL Server: This section provides best practice
guidelines for properly preparing the vSphere platform to run SQL Server. This section includes
guidance in the areas of CPU, memory, storage, and networking.
Section 5, SQL Server In-Guest Best Practices: This section provides best practice guidelines for
configuring SQL Server parameters that could affect performance when running on the vSphere
platform.
Section 6, Ongoing Performance Monitoring and Tuning: This section provides best practice
guidelines for configuring SQL Server to run on the vSphere platform.
Section 7, vSphere Enhancements for Deployment and Operations: This section provides a brief
look at vSphere features and add-ons that enhance deployment and management of SQL Server.
The following topics are out of scope for this document, but may be addressed in other documentation in
this solution kit:
Availability and Recovery Options: Although this Best Practices Guide briefly covers VMware
features that can enhance availability and recovery, a more in-depth discussion of this subject is
covered in Microsoft SQL Server on VMware Availability and Recovery Options, included in this
solution kit.
Support and Licensing: This information can be found in the Microsoft SQL Server on VMware:
Support and Licensing Guide included in this solution kit. All information in this section is based on
the most current Microsoft material at the time of this writing and is subject to change without notice.
This and other guides in this solution kit are limited in focus to deploying SQL Server on vSphere. SQL
Server deployments cover a wide subject area, and SQL Server design principles should always follow
Microsoft guidelines.
2. SQL Server Performance on vSphere
VMware vSphere contains numerous performance-related features that make it easy to virtualize a
resource-heavy database with minimal impact to performance. The improved resource management
capabilities in vSphere facilitate more effective consolidation of multiple SQL Server virtual machines on a
single host without compromising performance or scalability. Greater consolidation can significantly
reduce the cost of physical infrastructure and of licensing SQL Server, even in smaller-scale
environments.
In 2009, VMware conducted a detailed performance analysis of Microsoft SQL Server 2008 running on
vSphere. The performance test placed a significant load on the CPU, memory, storage, and network
subsystems. The results demonstrate efficient and highly scalable performance for an enterprise
database workload running on a virtual platform.
To demonstrate the performance and scalability of the vSphere platform, the test performed the following:
Measured performance of SQL Server 2008 in an 8 virtual CPU (vCPU), 58GB virtual machine using
a high-end OLTP workload derived from TPC-E.
Scaled the workload, database, and virtual machine resources from 1 vCPU to 8 vCPUs (scale-up
tests).
Quantified the performance gains from some of the key new features in vSphere.
Measured single virtual machine OLTP throughput relative to native (physical machine) performance in the
same configuration.
The following sections summarize the performance results of experiments with the Brokerage workload
on SQL Server in a native and virtual environment. Single and multiple virtual machine results are
examined, and results are given that show improvements due to specific vSphere 4.0 features.
The information in this guide is limited to a summary of test results. See Performance and Scalability of
Microsoft SQL Server on VMware vSphere 4
(http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf).
2.1
The following figure shows how vSphere 4 performs and scales relative to native. The results are
normalized to the throughput observed in a 1 CPU native configuration.
Figure 1. Scale-Up Performance in vSphere 4 Compared with Native
The chart demonstrates the 1 and 2 vCPU virtual machines performing at 92 percent of native. The 4 and
8 vCPU virtual machines achieve 88 and 86 percent of the non-virtual throughput, respectively. At 1, 2,
and 4 vCPUs on the 8 CPU server, vSphere is able to effectively offload to idle cores certain tasks such
as I/O processing. Having idle processors also gives vSphere resource management more flexibility in
making virtual CPU scheduling decisions. However, even with 8 vCPUs on a fully committed system,
vSphere still delivers excellent performance relative to the native system.
The scaling in the chart represents the throughput as all aspects of the system are scaled, such as
number of CPUs, size of the benchmark database, and SQL Server buffer cache memory. The following
table shows vSphere scaling comparable to the native configuration's ability to scale performance.
Table 1. Scale-Up Performance
Comparison | Performance Gain
Native 8 CPU vs. 4 CPU | 1.71
vSphere 8 vCPU vs. 4 vCPU | 1.67
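As a rough cross-check of the figures in this section (a sketch, not part of the original test report): assuming 1.71 is the native gain from 4 to 8 CPUs, and given that the virtual machine reaches 88 and 86 percent of native throughput at 4 and 8 vCPUs, the implied virtual scaling factor comes out close to the 1.67 reported. The function name is illustrative.

```python
def implied_virtual_gain(native_gain, pct_native_small, pct_native_large):
    """Scaling factor of the virtual configuration implied by the native
    scaling factor and the percent-of-native throughput at each size."""
    return native_gain * pct_native_large / pct_native_small

# Figures quoted in this section: 1.71 native gain, 88% and 86% of native.
gain = implied_virtual_gain(1.71, 88, 86)
print(round(gain, 2))  # prints 1.67
```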
2.2
These experiments demonstrate that multiple heavy SQL Server virtual machines can be consolidated to
achieve scalable aggregate throughput with minimal performance impact to individual virtual machines.
The following figure shows the total benchmark throughput as up to eight 2 vCPU SQL Server virtual machines
running the Brokerage workload are added to the single 8-way host.
Figure 2. Consolidation of Multiple SQL Server Virtual Machines
Each 2 vCPU virtual machine consumes about 15 percent of the total physical CPUs, uses 5GB of memory in
the SQL Server buffer cache, and performs about 3600 I/Os per second (IOPS).
As the graph illustrates, the throughput increases linearly as up to four virtual machines (8 vCPUs) are
added. As the physical CPUs are overcommitted by increasing the number of virtual machines from four
to six (a factor of 1.5), the aggregate throughput increases by a factor of 1.4.
Running eight virtual machines saturates the physical CPUs on this host. vSphere 4.0 now
schedules 16 vCPUs onto eight physical CPUs, yet the benchmark aggregate throughput increases a
further 5% as the vSphere scheduler is able to deliver more throughput using the few idle cycles left over
in the six virtual machine configuration.
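The scaling steps described above can be chained to estimate aggregate throughput relative to the four virtual machine configuration. This is a back-of-the-envelope sketch: only the 1.4 factor and the further 5 percent gain come from the text, and the names are illustrative.

```python
# Aggregate throughput, normalized to the four virtual machine (8 vCPU) case.
four_vm = 1.0
six_vm = four_vm * 1.4       # 1.5x the virtual machines yields 1.4x the throughput
eight_vm = six_vm * 1.05     # a further 5 percent with 16 vCPUs on 8 physical CPUs

def per_vm_efficiency(relative_throughput, vm_count, baseline_vms=4):
    """Per-VM throughput as a fraction of per-VM throughput at the baseline."""
    return (relative_throughput / vm_count) * baseline_vms

for vms, total in ((4, four_vm), (6, six_vm), (8, eight_vm)):
    print(f"{vms} VMs: aggregate x{total:.2f}, per-VM efficiency {per_vm_efficiency(total, vms):.0%}")
```

Under these figures, aggregate throughput keeps rising with overcommitment while each virtual machine receives a smaller share, which is the trade-off the graph illustrates.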
The following table highlights the resource-intensive nature of the eight virtual machines that were used
for the scale-out experiments.
Table 2. Aggregate System Metrics for Eight SQL Server Virtual Machines
Metric | Value
Aggregate throughput in transactions per second | 2760
Host CPU utilization | 100%
Disk I/O throughput (IOPS) | 23K
Network packet rate | 8Kbps receive, 7.5Kbps send
Network bandwidth | 9Mbps receive, 98Mbps send
3.
When considering SQL Server instances as candidates for virtualization, you need a clear understanding
of the business and technical requirements for each instance. These requirements span multiple
dimensions, such as availability, performance, scalability, growth and headroom, patching, and backups.
Use the following high-level procedure to simplify the process of identifying SQL Server candidates for
virtualization.
1. Understand the database workload requirements for each instance of SQL Server.
2. Understand availability and recovery requirements, including uptime guarantees (number of nines)
and site recovery.
3. Capture resource utilization baselines for existing databases (on physical).
4. Plan the migration to vSphere.
5. Understand database consolidation approaches.
3.1
The SQL Server database platform can support a very wide variety of applications. Before deploying SQL
Server on vSphere, you must understand the database workload requirements of the applications that
your SQL Servers will support. Each application will have different requirements for capacity,
performance, and availability, and consequently, each database should be designed to optimally support
those requirements. Many organizations classify databases into multiple management tiers, using
application requirements to define service level agreements (SLAs). The classification of a database
server will often dictate the resources allocated to it.
Mission critical databases (sometimes referred to as Tier 1 databases) are considered absolutely
essential to your company's core operations. Mission critical databases and the applications they
support often have SLAs that require very high levels of performance and availability. SQL Server
virtual machines running mission critical databases might require more careful resource allocation
(CPU, memory, disk) to achieve optimal performance. They might also be candidates for database
mirroring or failover clustering.
Other databases and applications are busy only during specific periods for such tasks as reporting,
batch jobs, and application integration or ETL workloads. These databases and applications might be
essential to your company's operations, but they have much less stringent requirements for
performance and availability. They may, nonetheless, have other very stringent business
requirements, such as data validation and audit trails.
Still other smaller, lightly used databases typically support departmental applications that may not
adversely affect your company's real-time operations if there is an outage. You can tolerate such
databases and applications being down for extended periods.
Resource needs for SQL Server deployments are defined in terms of CPU, memory, disk and network
I/O, user connections, transaction throughput, query execution efficiency/latencies, and database size.
Some customers have established targets for system utilization on hosts running SQL Server, such as
80% CPU utilization, leaving enough headroom for any usage spikes.
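A utilization target like the 80% figure above can be expressed as a simple check on the combined peak demand of the workloads placed on a host. The following is a minimal sketch; the function name, the MHz units, and the sample numbers are hypothetical and are not taken from this guide.

```python
def fits_within_target(peak_demands_mhz, host_capacity_mhz, target_utilization=0.80):
    """Return True if the summed peak CPU demand stays at or below the
    utilization target, leaving the remainder as headroom for spikes."""
    total_peak = sum(peak_demands_mhz)
    return total_peak <= host_capacity_mhz * target_utilization

# Hypothetical host: 8 cores at 2,500 MHz each = 20,000 MHz of capacity.
host_capacity = 8 * 2500
candidate_vms = [4000, 3500, 5000, 2500]   # hypothetical peak demand per VM, in MHz

print(fits_within_target(candidate_vms, host_capacity))            # True  (15,000 <= 16,000)
print(fits_within_target(candidate_vms + [2000], host_capacity))   # False (17,000 > 16,000)
```

Summing peaks is deliberately conservative, because the peaks of different workloads rarely coincide exactly.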
Understanding database workloads and how to allocate resources to meet service levels helps you to
define appropriate virtual machine configurations for individual SQL Server databases. Because you can
consolidate multiple workloads on a single vSphere host, this characterization also helps you to design a
vSphere and storage hardware configuration that provides the resources you need to deploy multiple
workloads successfully on vSphere.
3.2
Running Microsoft SQL Server on vSphere offers many options for database availability and disaster
recovery utilizing the best features from both VMware and Microsoft. For example, vSphere vMotion and
vSphere DRS can help to reduce planned downtime and balance workloads dynamically, and VMware
vSphere High Availability (HA) can help to recover SQL Servers in the case of host failure.
3.2.1 vSphere HA, DRS, and vSphere vMotion for High Availability
VMware technologies such as vSphere HA, vSphere DRS, and vMotion can be used in a high availability
design.
3.2.1.1. VMware vSphere High Availability
vSphere HA provides easy-to-use, cost-effective, high availability for applications running in virtual
machines. In the event of physical server failure, affected virtual machines are automatically restarted on
other production servers that have spare capacity. Additionally, if there is an operating system-related
failure within a virtual machine, the failure is detected by vSphere HA and the affected virtual machine is
restarted on the same physical server.
3.2.1.2. vSphere Distributed Resource Scheduler
vSphere DRS collects resource usage information for all hosts and virtual machines and generates
recommendations for virtual machine placement. These recommendations can be applied manually or
automatically. DRS can dynamically load balance all virtual machines in the environment by shifting
workloads across the entire pool of vSphere hosts so that critical SQL Server virtual machines in the
environment always have the CPU and RAM resources needed to maintain optimal performance.
3.2.1.3. vSphere vMotion
vSphere vMotion leverages the complete virtualization of servers, storage, and networking to move a
running virtual machine from one physical server to another. This migration is performed with no impact to
running workloads or connected users. During vMotion migration, the active memory and execution state
of the virtual machine is rapidly transmitted over the network to the new physical server, while maintaining
its network identity and connections.
Table 3. SQL Server 2012 High Availability Options
Technology | Granularity | Storage Type | RPO Data Loss | RTO Downtime
AlwaysOn Availability Groups | Database | Non-shared | None (with synchronous commit mode) | ~3 seconds or Administrator Recovery
AlwaysOn Failover Cluster Instances | Instance | Shared | None | ~30 seconds
Database Mirroring | Database | Non-shared | | < 3 seconds or Administrator Recovery
Log Shipping | Database | Non-shared | Possible transaction log | Administrator Recovery
Each of these options is covered in more depth in SQL Server on VMware Availability and Recovery
Options.
3.3
After you clearly understand your organization's needs (business and technical requirements, availability,
and other operational requirements for implementing SQL Server), the next important step is to establish
a baseline, using data from the current, running physical deployment. The best way to accomplish this is
to use the Virtualization Assessment service delivered by VMware or its partners. Using VMware Capacity
Planner, this service collects all essential performance metrics, including processor, disk, memory, and
network statistics, and specific SQL Server metrics on existing database servers in the environment.
Capacity Planner analyzes this data to recommend server consolidation opportunities. You can monitor
these metrics, in addition to the essential system metrics, if you want to capture the baseline information
yourself. It is important to collect the data over a period of time long enough to reflect all variations in the
usage patterns of SQL Server in your organization. The duration can range from one week to one month,
depending on seasonal or cyclical usage peaks such as when they occur, their intensity, how long they
last, and other factors. This exercise helps you to understand what resources your current physical SQL
Servers use, and makes it easier for you to create a short list of SQL Server instances to virtualize and to
determine the order in which you should virtualize the SQL Server instances. Refer to the following
metrics when running your own analysis. Capture data for average as well as peak periods.
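If you capture the baseline yourself, the samples need to be reduced to average and peak values per counter. The sketch below assumes the counters were logged to a CSV file whose first column is a timestamp and whose remaining columns are counter values (for example, a Perfmon log exported as CSV). The function name and the sample data are hypothetical.

```python
import csv
import io
from statistics import mean

def summarize_counters(csv_text):
    """Return {counter_name: (average, peak)} for each counter column.
    The first column is assumed to be a timestamp; blank or non-numeric
    samples are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    samples = {name: [] for name in header[1:]}
    for row in reader:
        for name, value in zip(header[1:], row[1:]):
            try:
                samples[name].append(float(value))
            except ValueError:
                continue  # skip blank or malformed samples
    return {name: (mean(vals), max(vals)) for name, vals in samples.items() if vals}

# Hypothetical three-sample capture of two counters.
example = (
    '"Time","Processor(_Total)\\% Processor Time","Memory\\Available MBytes"\n'
    '"09:00:00","20.0","4096"\n'
    '"09:00:15","60.0","3072"\n'
    '"09:00:30","40.0","2048"\n'
)
for counter, (avg, peak) in summarize_counters(example).items():
    print(f"{counter}: average={avg:.1f}, peak={peak:.1f}")
```

For counters where a low value indicates pressure, such as available memory, the minimum is the more useful extreme, so a real analysis would likely track both.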
Table 4. Metrics
Resource Type | Perfmon Counter
Processor | Processor(*)\% Processor Time
Processor | Process(sqlservr)\% Processor Time
Processor | Process(msmdsrv)\% Processor Time
Memory | Memory\Available MBytes
Memory | Memory\Pages Input/sec
Disk | PhysicalDisk(*)\Disk Reads/sec
Disk | PhysicalDisk(*)\Disk Writes/sec
Disk | PhysicalDisk(*)\Avg. Disk sec/Read
Disk | PhysicalDisk(*)\Avg. Disk sec/Write
Network | Network Interface(*)\Packets Received/sec
3.4
After you establish baseline profiles for your existing SQL Server databases, the next step is to design a
vSphere architecture that meets these profiles. The best practices described in Section 4, VMware
vSphere Host Best Practices for SQL Server, can help you to optimize your vSphere environment for SQL
Server.
Microsoft offers extensive information on best practices for deploying SQL Server in the Microsoft SQL
Server Tech Center (http://technet.microsoft.com/en-us/sqlserver/default.aspx) on Microsoft TechNet.
(Third-party websites are not controlled by VMware, so links to these websites might change.)
These best practices papers provide real-world guidelines and expert tips, and you should follow them for
SQL Server deployments on vSphere as well. Similarly, VMware recommends following vendor-specific
best practice guidelines for configuring your server hardware, storage subsystems, and network. In
general, best practices in physical environments also apply to deployments on VMware vSphere without
any changes.
Virtualizing SQL Server should provide benefits that go beyond server consolidation and lower total cost
of ownership. A successful SQL Server deployment using vSphere provides better management and
administration flexibility as well as higher levels of availability at lower cost.
3.5
Most legacy database applications were traditionally deployed on dedicated physical hardware, with each
database instance provisioned onto its own database server. This approach allows the database
administration team to provide predictable service levels to their customer base through isolation of those
physical resources. Rogue database instances cannot consume resources from other databases and
impact their service levels. A significant limitation of this approach is that each of these databases is
bounded by the physical resources available on its physical server and consequently is usually
significantly overprovisioned. VMware captured capacity planning data from the servers of the customer
base and found that physical database servers use between 5% and 15% of available resources on
average, as shown in the following figure. This is a very inefficient use of server hardware, power and
cooling resources, as well as database software licenses. Customers have recognized the inefficiency in
this approach and in some cases have looked at physical database consolidation as a solution.
Figure 4. Traditional Physical Database Consolidation
Database consolidation within a virtual infrastructure provides the benefits of physical database
consolidation while also significantly reducing the described implementation challenges. Many customers
approach virtual database consolidation by doing a Physical to Virtual (P2V) conversion of each of their
physical servers. Because the new virtual machine contains the entire isolated software stack that was on
the physical server, there is no reduction in resource isolation from a Windows or SQL Server
perspective. There is no need to re-architect the security model within the new Windows guest operating
system. vSphere provides the ability to present resources (CPU, memory, and storage) to virtual
machines as needed, and to guarantee those resources for applications that require it. These capabilities
reduce the need to overprovision the virtual machine to handle peak workloads. The virtual machines also
The scale-up approach involves consolidating multiple SQL Server instances in a single large virtual
machine. The scale-up approach is attractive because it enables you to consolidate onto fewer physical
servers and may provide some Windows licensing advantages. However, this approach requires making
compromises similar to those discussed earlier regarding application compatibility, workload isolation, flexibility for
maintenance, availability requirements, and security. Furthermore, in a virtual infrastructure, deploying
additional virtual machines to house additional instances of SQL Server is relatively painless.
The larger virtual machines also exhibit higher performance overhead associated with the scaling of SMP
virtual processors. The scale-up approach may result in bottlenecks in the operating system itself,
especially if you reach 32-bit operating system memory limits. Additionally, using larger virtual machines
makes it harder for DRS to move the virtual machine because of the greater demand placed on available
resources.
In a virtual infrastructure, these drawbacks to the scale-up approach generally outweigh the benefits. The
scale-out approach is generally more appropriate for a virtual infrastructure. With the scale-out approach,
you deploy fewer SQL instances per virtual machine and customize the configuration as needed. The
usual drawbacks to a scale-out approach that you encounter in a physical infrastructure, such as server
sprawl and high TCO, are minimized when you deploy a virtual infrastructure. This approach provides
better workload and security isolation, and allows easier maintenance and change management because
of the increased granularity of deploying fewer SQL instances per virtual machine. DRS can function
more effectively with smaller virtual machines, with the added benefit of faster vSphere vMotion
migrations.
4. VMware vSphere Host Best Practices for SQL Server
A properly designed vSphere host platform is crucial to the successful implementation of enterprise
applications such as SQL Server. Before we address best practices specific to the SQL Server
application, the following sections outline general best practices for designing your vSphere hosts.
4.1 General Guidelines
VMware Server, VMware Workstation, and even VMware Fusion are hosted products and are all
technically capable of running SQL Server. However, we strongly recommend using a VMware
enterprise-class hypervisor, vSphere, to deploy virtualized SQL Server instances, even for development
and test environments. When using the hosted products, depending on configurations and guest
operating system support, the disk I/O caching performed by the host operating system can provide
unpredictable performance and application availability results under specific conditions.
Follow SQL Server best practices. Microsoft offers extensive best practices for deploying SQL Server.
Best practices are documented on Microsoft TechNet in SQL Server Best Practices
(http://technet.microsoft.com/en-us/sqlserver/bb671430.aspx). These best practices are based on
real-world guidelines and expert tips, and should be followed for virtual SQL Server deployments as
well.
The vSphere hosts should be sized with adequate capacity to provide resources for all running virtual
machines and have enough headroom to account for normal workload variability. This is especially
important when various virtual machines exhibit similar workload profiles, and are likely to bottleneck
in contention for the same resources.
4.2
SQL Server virtual machine configuration usually depends on the specific database profile. A thorough
virtualization exercise greatly simplifies virtual machine sizing. In general, follow the guidelines discussed
in the following sections.
For small SQL Server virtual machines, allocate virtual machine CPUs equal to or less than the
number of cores in each physical NUMA node.
For wide SQL Server virtual machines, size virtual machine CPUs to align with physical NUMA
boundaries. Configure vNUMA to enable SQL Server NUMA optimization to take advantage of
managing memory locality.
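The two sizing rules above can be expressed as a quick check. The following is a minimal sketch, not a VMware tool: the function name and the category labels are hypothetical, and "aligned with physical NUMA boundaries" is interpreted here, as a simplifying assumption, to mean that the vCPU count is a whole multiple of the cores per NUMA node.

```python
def classify_vm_numa_fit(vcpus: int, cores_per_numa_node: int, numa_nodes: int) -> str:
    """Classify a VM's vCPU count against the host NUMA layout.

    Labels are illustrative only (assumption, not from the guide):
      "small"          - fits inside a single NUMA node
      "wide-aligned"   - spans nodes, vCPU count is a multiple of the node size
      "wide-unaligned" - spans nodes, vCPU count is not a multiple of the node size
      "exceeds-host"   - more vCPUs than physical cores on the host
    """
    if vcpus <= 0 or cores_per_numa_node <= 0 or numa_nodes <= 0:
        raise ValueError("all arguments must be positive integers")
    if vcpus > cores_per_numa_node * numa_nodes:
        return "exceeds-host"
    if vcpus <= cores_per_numa_node:
        return "small"
    if vcpus % cores_per_numa_node == 0:
        return "wide-aligned"
    return "wide-unaligned"


if __name__ == "__main__":
    # Hypothetical host: 2 NUMA nodes with 8 cores each.
    for vcpus in (6, 12, 16, 20):
        print(vcpus, classify_vm_numa_fit(vcpus, cores_per_numa_node=8, numa_nodes=2))
```

A result of "wide-unaligned" suggests revisiting the vCPU count before deployment; it does not by itself indicate a performance problem.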
Any: This is the default setting. The vCPUs of this virtual machine can freely share cores with other
virtual CPUs of this or other virtual machines.
None: The vCPUs of this virtual machine have exclusive use of a processor whenever they are
scheduled to the core. Selecting None in effect disables hyperthreading for your virtual machine. This
option is recommended if you are deploying a Tier 1 SQL Server workload.
Internal: This option is similar to None. Virtual CPUs from this virtual machine cannot share cores
with virtual CPUs from other virtual machines. They can share cores with the other virtual CPUs from
the same virtual machine.
4.3
This section provides guidelines for allocation of memory to SQL Server virtual machines. The guidelines
outlined here take into account vSphere memory overhead and a virtual machine's memory settings.
Memory sharing across virtual machines that have similar data (same guest operating systems).
Memory overcommitment, which means allocating more memory to virtual machines than is
physically available on the vSphere host. Overcommitment is not necessarily to be avoided. Many
customers can achieve high levels of consolidation and efficiency using it. However, overcommitment
must be carefully monitored to avoid negative performance impact.
A memory balloon technique, whereby virtual machines that do not need all memory they have been
allocated give memory to virtual machines that require additional allocated memory.
For more details about vSphere memory management concepts, see one of the following documents:
The vSphere memory settings for a virtual machine include the following parameters:
Touched memory: memory actually used by the virtual machine. vSphere allocates guest operating
system memory only on demand.
Swappable: virtual machine memory that can be reclaimed by the balloon driver or by vSphere
swapping. Ballooning occurs before vSphere swapping. If this memory is in use by the virtual
machine (touched and in use), the balloon driver causes the guest operating system to swap. Also,
this value is the size of the per-virtual machine swap file (.vswp) that is created on the VMware
vSphere VMFS (Virtual Machine File System) volume.
If the balloon driver is unable to reclaim memory quickly enough, or is disabled, or not installed,
vSphere forcibly reclaims memory from the virtual machine using the VMkernel swap file.
Memory (MB)    1 vCPU    2 vCPU    4 vCPU    8 vCPU
256             20.29     24.28     32.23     48.16
1024            25.90     29.91     37.86     53.82
4096            48.64     52.72     60.67     76.78
16384          139.62    143.98    151.93    168.60
To avoid performance latency resulting from remote memory accesses, you should size an SQL
Server virtual machine's memory so it is less than the amount available per NUMA node. Both
vSphere and SQL Server support NUMA. As with SQL Server, vSphere has intelligent, adaptive
NUMA scheduling and memory placement policies that can manage all virtual machines
transparently, so you do not need to deal with the complexity of manually balancing virtual machines
among nodes.
vSphere supports large pages in the guest operating system. If the operating system or application
can benefit from large pages on a native system, that operating system or application can potentially
achieve a similar performance improvement in a virtual machine. The use of large pages results in
reduced memory management overhead and can therefore increase hypervisor performance. For
details, see Large Page Performance (http://www.vmware.com/files/pdf/large_pg_performance.pdf).
Use the Active memory counter from vSphere with caution. Active memory is the amount of memory
that is currently being used by the guest operating system and its applications. SQL Server does its
own caching and memory management, so the Active memory counter might not accurately reflect
the memory consumption of an SQL Server workload. You should always confirm memory usage of
an SQL Server virtual machine by checking memory counters within the guest operating system.
Do not set memory limits for SQL Server virtual machines. The virtual machine memory allocation target
is subject to the virtual machine's memory limit and reservation. vSphere offers features to enable
dynamic scalability of virtual machine memory. Setting memory limits can cause unexpected
swapping.
ThreadStackSize = 1MB on x86, 2MB on x64, 4MB on IA64
Use SQL Server memory performance metrics and work with your database administrator to determine
the SQL Server maximum server memory size and maximum number of worker threads. Refer to the
virtual machine overhead table for virtual machine overhead.
Enable Memory Page Sharing and Memory Ballooning. vSphere provides optimizations such as
memory sharing and memory ballooning to reduce the amount of physical memory used on the
underlying host. In some cases these optimizations can save more memory than is taken up by the
virtualization overhead.
Confirm that cumulative physical memory available on a server is adequate to meet the needs of the
virtual machines by testing target workloads in the virtualized environment. Memory overcommitment
should not adversely affect virtual machine performance as long as the actual virtual machine
memory requirements are less than the total memory available on the system.
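The overcommitment condition described above can be checked with simple arithmetic. The following sketch is illustrative only: the function name, the dictionary keys, and the "required_mb" figures (which would come from testing target workloads) are assumptions, and the overhead value is the 16384 MB / 4 vCPU entry from the overhead figures earlier in this section.

```python
def memory_commitment(host_physical_mb, vms):
    """Summarize configured and measured memory demand against host memory.

    Each entry in `vms` is a dict with hypothetical keys:
      configured_mb - memory configured for the virtual machine
      required_mb   - memory the workload was measured to need
      overhead_mb   - per-VM virtualization overhead
    """
    configured = sum(vm["configured_mb"] + vm["overhead_mb"] for vm in vms)
    required = sum(vm["required_mb"] + vm["overhead_mb"] for vm in vms)
    return {
        "configured_mb": configured,
        "required_mb": required,
        # More memory configured than the host physically has.
        "overcommitted": configured > host_physical_mb,
        # Actual requirements exceed physical memory: expect ballooning or swapping.
        "at_risk": required > host_physical_mb,
    }


if __name__ == "__main__":
    # Four 16 GB, 4-vCPU virtual machines on a hypothetical 64 GB host.
    vms = [{"configured_mb": 16384, "required_mb": 12000, "overhead_mb": 151.93}] * 4
    print(memory_commitment(65536, vms))
```

In this example the host is overcommitted on configured memory but not at risk, because the measured requirements still fit in physical memory.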
4.4
Storage configuration is critical to any successful database deployment, especially in virtual environments
where you may consolidate many different SQL Server workloads on a single vSphere host. Your storage
subsystem should provide sufficient I/O throughput as well as storage capacity to accommodate the
cumulative needs of all virtual machines running on your vSphere hosts.
Most traditional physical SQL Server environments created many islands of information. When you move
to virtualized SQL Server deployments, a shared storage model strategy provides many benefits, such as
more effective storage resource utilization, reduced storage white space, better provisioning, and mobility
For Fibre Channel, read throughput is limited by the bandwidth of the 4Gbps Fibre Channel link for I/O
sizes at or above 64KB. For IP-based protocols, read throughput is limited by the bandwidth of the 1Gbps
Ethernet link for I/O sizes at or above 32KB.
The following figure shows, for each of the storage protocols, the sequential write throughput (in MB/sec)
of running a single virtual machine in a standard workload configuration for different I/O block sizes. Refer
to Comparison of Storage Protocol Performance in VMware vSphere 4
(http://www.vmware.com/files/pdf/perf_vsphere_storage_protocols.pdf).
Figure 12. Write Throughput for Different I/O Block Sizes
HBA (Host Bus Adapter): A device that connects one or more peripheral units to a computer and
manages data storage and I/O processing.
FC (Fibre Channel): A gigabit-speed networking technology used to build storage area networks
(SANs) and to transmit data.
SP (Storage Processor): A SAN component that processes HBA requests routed through an FC
switch and handles the RAID/volume functionality of the disk array.
Create VMFS partitions from within VMware vCenter. They are aligned by default.
Align the data disk for heavy I/O workloads using diskpart. With Windows Server 2008, the disk
is automatically aligned to a 1 MB boundary.
Consult with the storage vendor for alignment recommendations on their hardware.
For more information about this topic see the white paper Performance Best Practices for VMware
vSphere 5.1 (http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf).
4.4.5.2. Avoid Lazy Zeroing
When running on VMFS, virtual machine disk files can be deployed in three different formats: thin,
zeroedthick, and eagerzeroedthick. Thin provisioned disk enables 100% storage on demand, where disk
space is allocated and zeroed at the time disk is written. This is the vSphere default. Zeroedthick disk
storage is pre-allocated, but blocks are zeroed by the hypervisor the first time the disk is written.
Eagerzeroedthick disk is pre-allocated and zeroed when the disk is initialized during provision time. There
is no additional cost for zeroing the disk at run time.
Both thin and zeroedthick options employ a lazy zeroing technique to allow for more efficient disk space
usage. However, that comes at a cost, because there is a performance overhead during the first write to
the disk. Depending on the SQL Server configuration and the type of workloads, the performance impact
could be significant. Write intensive workloads and database maintenance tasks tend to be penalized more by
lazy zeroing. If you are deploying a Tier 1 mission-critical SQL Server, VMware recommends that you
consider using eagerzeroedthick disks for SQL Server data, transaction log, and tempdb files. An
eagerzeroedthick disk can be configured by enabling the Support clustering features such as Fault
Tolerance option on the Disk Provisioning screen in vSphere 4.x or by selecting the Thick Provision
Eager Zeroed option in vSphere 5.x.
Operation      Random / Sequential    Read / Write    Size Range
OLTP Log       Sequential             Write           Up to 64K
OLTP Data      Random                 Read/Write      8K
Bulk Insert    Sequential             Write
               Sequential             Read
Backup         Sequential             Read            1 MB
Place SQL Server binary, log, and data files into separate VMDKs. In addition to the performance
advantage, separating SQL Server binary from data and log also provides better flexibility for backup.
The OS/SQL Server Binary VMDK can be backed up with snapshot-based backups, such as VMware
Data Recovery. The SQL Server data and log files can be backed up through traditional database
backup solutions.
Maintain 1:1 mapping between VMDKs and LUNs. When this is not possible, group VMDKs and SQL
Server files with similar I/O characteristics on common LUNs.
Use multiple vSCSI adapters. Placing SQL Server binary, data, and log files on separate vSCSI adapters
optimizes I/O by distributing load across multiple target devices.
Deploying multiple, lower-tier SQL Server systems on VMFS facilitates easier management and
administration of template cloning, snapshots, and storage consolidation.
Manage performance of VMFS. The aggregate IOPS demands of all virtual machines on the VMFS
should not exceed the IOPS capability of the physical disks.
Use VMware vSphere Storage DRS for automatic load balancing between datastores to provide
space and avoid I/O bottlenecks as per pre-defined rules.
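The aggregate IOPS guideline above amounts to a headroom calculation per VMFS datastore. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the per-disk IOPS figure is a placeholder that should come from the storage vendor, and RAID write penalties and array caching are deliberately ignored.

```python
def datastore_iops_headroom(vm_iops, disk_count, iops_per_disk):
    """Return (capability, demand, headroom) in IOPS for one datastore.

    vm_iops       - mapping of virtual machine name to its peak IOPS demand
    disk_count    - number of physical disks backing the datastore
    iops_per_disk - assumed IOPS capability of each disk (vendor-specific)

    A negative headroom means the aggregate demand exceeds the capability
    of the physical disks.
    """
    if disk_count <= 0 or iops_per_disk <= 0:
        raise ValueError("disk_count and iops_per_disk must be positive")
    capability = disk_count * iops_per_disk
    demand = sum(vm_iops.values())
    return capability, demand, capability - demand


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    demand_by_vm = {"sql01": 1200, "sql02": 800, "sql03": 500}
    print(datastore_iops_headroom(demand_by_vm, disk_count=16, iops_per_disk=180))
```

For write-heavy SQL Server workloads, the usable capability is lower than this raw figure, so treat a small positive headroom as a warning rather than as confirmation of adequate capacity.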
4.5
Networking in the virtual world follows the same concepts as in the physical world, but these concepts are
applied in software instead of using physical cables and switches. Many of the best practices that apply in
the physical world continue to apply in the virtual world, but there are additional considerations for traffic
segmentation, availability, and making sure that the throughput required by services hosted on a single
server can be fairly distributed.
As shown in the figure, the following components make up the virtual network:
Physical network interface (vmnic): Provides connectivity between the vSphere host and the local
area network.
vSwitch: The virtual switch is created in software and provides connectivity between virtual
machines. Virtual switches must uplink to a physical NIC (also known as vmnic) to provide virtual
machines with connectivity to the LAN. Otherwise, virtual machine traffic is contained within the virtual
switch.
Port group: Used to create a logical boundary within a virtual switch. This boundary can provide
VLAN segmentation when 802.1q trunking is passed from the physical switch, or can create a
boundary for policy settings.
Virtual NIC (vNIC): Provides connectivity between the virtual machine and the virtual switch.
Virtual Adapter: Provides Management, vSphere vMotion, and FT Logging when connected to a
vSphere Distributed Switch.
NIC Team: Group of physical NICs connected to the same physical/logical networks providing
redundancy.
Use NIC teaming. Use two physical NICs per vSwitch, and if possible, uplink the physical NICs to
separate physical switches. Teaming provides redundancy against NIC failure and, if connected to
separate physical switches, against switch failures. NIC teaming does not necessarily provide higher
throughput.
If a physical NIC is shared by multiple consumers (that is, virtual machines and/or the VMkernel),
each such consumer could impact the performance of others. Thus, for the best network
performance, use separate physical NICs for management traffic (vSphere vMotion, FT logging,
VMkernel) and virtual machine traffic. 10Gbps NICs have multiqueue support and can separate
these consumers onto different queues. As long as the 10Gbps NICs have sufficient queues
available, this recommendation typically does not apply.
If using iSCSI, the network adapters should be dedicated to either network communication or iSCSI,
but not both.
Enable jumbo frames for iSCSI and the vSphere vMotion network.
Use the VMXNET3 paravirtualized NIC. VMXNET3 is the latest generation of paravirtualized NICs
designed for performance. It offers several advanced features including multiqueue support, Receive
Side Scaling, IPv4/IPv6 offloads, and MSI/MSI-X interrupt delivery.
5.
5.1
SQL Server can dynamically adjust memory consumption based on workloads. The SQL Server maximum
server memory and minimum server memory configuration settings allow you to define the range of
memory that the SQL Server process can use. The default setting for minimum server memory is 0, and the
default setting for maximum server memory is 2147483647MB. Minimum server memory will not
immediately be allocated on startup. However, after memory usage has reached this value due to client
load, SQL Server will not free memory unless the minimum server memory value is reduced.
SQL Server is capable of consuming all memory on the virtual machine. Setting the maximum server
memory allows you to reserve sufficient memory for the operating system and other applications running
on the virtual machine. In a traditional SQL Server consolidation scenario where you are running multiple
instances of SQL Server on the same virtual machine, setting maximum server memory will allow memory
to be shared effectively between the instances.
Setting the minimum server memory is a good practice to maintain SQL Server performance under host
memory pressure. When running SQL Server on vSphere, if the vSphere host is under memory pressure,
the balloon driver might inflate and take memory back from the SQL Server virtual machine. Setting the
minimum server memory provides SQL Server with at least a reasonable amount of memory.
For Tier 1 mission-critical SQL Server deployments, consider setting the SQL Server memory to a fixed
amount by setting both maximum and minimum server memory to the same value. Before setting the
maximum and minimum server memory, confirm that adequate memory is left for the operating system
and virtual machine overhead.
For performing SQL Server maximum server memory sizing for vSphere, use the following formulas as a
guide:
SQL Max Server Memory = VM Memory - ThreadStack - OS Mem - VM Overhead
ThreadStack = SQL Max Worker Threads * ThreadStackSize
ThreadStackSize = 1MB on x86
                = 2MB on x64
                = 4MB on IA64
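The sizing formulas above can be worked through as follows. This is a minimal sketch rather than a definitive sizing tool: the function name is hypothetical, and the example inputs (512 worker threads, 2048 MB left for the operating system) are assumptions chosen for illustration. The overhead value is the 16384 MB / 4 vCPU figure from the virtual machine overhead table.

```python
# Thread stack size in MB per architecture, as given in the formulas above.
THREAD_STACK_SIZE_MB = {"x86": 1, "x64": 2, "IA64": 4}


def sql_max_server_memory_mb(vm_memory_mb, max_worker_threads,
                             os_memory_mb, vm_overhead_mb, arch="x64"):
    """Apply: SQL Max Server Memory = VM Memory - ThreadStack - OS Mem - VM Overhead."""
    if arch not in THREAD_STACK_SIZE_MB:
        raise ValueError("arch must be one of: " + ", ".join(THREAD_STACK_SIZE_MB))
    # ThreadStack = SQL Max Worker Threads * ThreadStackSize
    thread_stack_mb = max_worker_threads * THREAD_STACK_SIZE_MB[arch]
    result = vm_memory_mb - thread_stack_mb - os_memory_mb - vm_overhead_mb
    if result <= 0:
        raise ValueError("inputs leave no memory for SQL Server")
    return result


if __name__ == "__main__":
    # 16 GB virtual machine, 4 vCPUs, x64 guest; OS allowance is an assumption.
    print(sql_max_server_memory_mb(
        vm_memory_mb=16384,
        max_worker_threads=512,
        os_memory_mb=2048,
        vm_overhead_mb=151.93,
        arch="x64",
    ))
```

With these inputs the thread stack accounts for 1024 MB, leaving roughly 13160 MB as a starting point for maximum server memory. Confirm the worker thread count and operating system allowance with your database administrator before applying the result.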
5.2
Granting the Lock Pages in Memory user right to the SQL Server service account prevents Windows from
paging out SQL Server buffer pool pages. This setting is useful and has a positive performance
impact because it prevents Windows from paging a significant amount of buffer pool memory out of the
process, which enables SQL Server to manage the reduction of its own working set.
Any time Lock Pages in Memory is used, because SQL Server memory is locked and cannot be paged
out by Windows, you might experience negative impacts if the vSphere balloon driver is trying to reclaim
memory from the virtual machine. If you set the SQL Server Lock Pages in Memory user right, also set
the virtual machine's memory reservation to match the amount of memory you set in the virtual machine
configuration.
5.3 Large Pages
Hardware assist for MMU virtualization typically improves the performance for many workloads. However,
it can introduce overhead arising from increased latency in the processing of TLB misses. This cost can
be eliminated or mitigated with the use of large pages. Refer to Large Page Performance
(http://www.vmware.com/resources/techresources/1039) for additional information.
SQL Server supports the concept of large pages when allocating memory for some internal structures
and the buffer pool, when the following conditions are met:
The Lock Pages in Memory privilege is set for the service account.
As of SQL Server 2008, some of the internal structures, such as lock management and buffer hash, can
use large pages automatically if the preceding conditions are met. You can confirm that by checking the
ERRORLOG for the following messages:
2009-06-04 12:21:08.16 Server Large Page Extensions enabled.
2009-06-04 12:21:08.16 Server Large Page Granularity: 2097152
2009-06-04 12:21:08.21 Server Large Page Allocated: 32MB
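The ERRORLOG check above can be scripted. The following sketch scans log lines for the three messages shown; the function name and the returned dictionary keys are hypothetical, and the code assumes the message wording matches the sample exactly. Reading the ERRORLOG file itself (its location and text encoding vary by installation) is left to the caller.

```python
import re

# Patterns based on the sample ERRORLOG messages shown above.
GRANULARITY_RE = re.compile(r"Large Page Granularity:\s*(\d+)")
ALLOCATED_RE = re.compile(r"Large Page Allocated:\s*(\d+)\s*MB")


def summarize_large_pages(lines):
    """Summarize large page messages found in an iterable of ERRORLOG lines."""
    enabled = False
    granularity_bytes = None
    allocated_mb = 0
    for line in lines:
        if "Large Page Extensions enabled" in line:
            enabled = True
        match = GRANULARITY_RE.search(line)
        if match:
            granularity_bytes = int(match.group(1))
        match = ALLOCATED_RE.search(line)
        if match:
            # Sum every allocation message that appears in the log.
            allocated_mb += int(match.group(1))
    return {
        "enabled": enabled,
        "granularity_bytes": granularity_bytes,
        "allocated_mb": allocated_mb,
    }


if __name__ == "__main__":
    sample = [
        "2009-06-04 12:21:08.16 Server Large Page Extensions enabled.",
        "2009-06-04 12:21:08.16 Server Large Page Granularity: 2097152",
        "2009-06-04 12:21:08.21 Server Large Page Allocated: 32MB",
    ]
    print(summarize_large_pages(sample))
```

A granularity of 2097152 bytes corresponds to 2MB large pages.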
On a 64-bit system, you can further enable all SQL Server buffer pool memory to use large pages by
starting SQL Server with trace flag 834. Consider the following behavior changes when you enable trace
flag 834:
When large pages are enabled in the guest operating system and the virtual machine is running on a
host that supports large pages, vSphere does not perform Transparent Page Sharing on the virtual
machine's memory.
With trace flag 834 enabled, SQL Server startup behavior changes. Instead of allocating memory
dynamically at runtime, SQL Server allocates all buffer pool memory during startup. Therefore, SQL
Server startup time can be significantly delayed.
With trace flag 834 enabled, SQL Server allocates memory in 2MB contiguous blocks instead of 4KB
blocks. After the host has been running for a long time, it might be difficult to obtain contiguous
memory due to fragmentation. If SQL Server is unable to allocate the amount of contiguous memory it
needs, it can try to allocate less, and SQL Server might then run with less memory than you intended.
Refer to SQL Server and Large Pages Explained (http://blogs.msdn.com/b/psssql/archive/2009/06/05/sqlserver-and-large-pages-explained.aspx) for additional information on running SQL Server with large
pages.
5.4
Customers might have virus scanning software running on an SQL Server virtual machine. However, you
can consider excluding SQL Server data and log files from real-time virus monitoring and scanning to
minimize performance impacts to SQL Server. The database server should be secured by other means
so that it is not vulnerable to malware. This includes both strict access controls and operational discipline
such as not using an Internet browser to download data or executable files from external sites.
6.
Virtualization adds new software layers and new types of interactions between the database and
hardware components. Although the general methodology for monitoring and troubleshooting database
performance does not change, VMware provides additional tools for monitoring and troubleshooting at the
physical host level.
6.1
When monitoring database performance, you can use the database virtual machine-level performance
monitoring tools as the primary tools for identifying problem areas and resource bottlenecks. The
methodologies are the same as performance monitoring for a physical database server. VMware provides
additional tools for host-level performance monitoring. You can further correlate performance data
collected from the database virtual machine level with data from the host level. By focusing on key
performance metrics you can quickly isolate issues to a particular resource area.
Object                          Metric
SQLServer: SQL Statistics       Batch Requests/sec
                                SQL Compilations/sec
                                SQL Re-Compilations/sec
SQLServer: Buffer Manager       Free Pages
                                Procedure Cache Pages
                                Readahead Pages/sec
SQLServer: Memory Manager       Memory Grants Pending
                                Connection Memory (KB)
                                Granted Workspace Memory (KB)
SQLServer: General Statistics   Logins/sec
                                Logouts/sec
                                User connections
SQLServer: Latches
SQLServer: Locks                Number of deadlocks/sec
SQLServer: Databases            Transactions/sec
6.2
Among the performance data exposed by vSphere, the following is a list of key metrics that can help
DBAs quickly isolate issues to a specific resource area such as CPU, memory, storage, or network. See
VMware Communities: Interpreting resxtop Statistics (http://communities.vmware.com/docs/DOC-9279)
and vCenter Performance Counters (http://communities.vmware.com/docs/DOC-5600) for a full list of
counters.
The measurement units reported in resxtop and the vSphere Client can differ. See the
preceding VMware community links for details.
Table 9. Key Performance Metrics
Resource   Metric (resxtop)     Metric (vSphere Client)    Host/Virtual Machine
CPU        %USED                Used                       Both
           %RDY                 Ready                      Virtual Machine
           %SYS                 System                     Both
Memory     Swapin, Swapout      Swapinrate, Swapoutrate    Both
           MCTLSZ (MB)          vmmemctl                   Both
Disk       READs/s, WRITEs/s    NumberRead, NumberWrite    Both
           DAVG/cmd             deviceLatency              Both
           KAVG/cmd             KernelLatency              Both
           GAVG/cmd             TotalLatency               Both
Network    MbRX/s, MbTX/s       Received, Transmitted      Both
           PKTRX/s, PKTTX/s     PacketsRx, PacketsTx       Both
           %DRPRX, %DRPTX       DroppedRx, DroppedTx       Both
Of the CPU counters, the total used time indicates system load. Ready time indicates overloaded CPU
resources. A significant swap rate in the memory counters is a clear indication of a shortage of memory,
and high device latencies in the storage section point to an overloaded or misconfigured array. Network
traffic is seldom the cause of database performance problems.
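The interpretation above can be turned into a simple first-pass triage. The following sketch is illustrative only: the function name, the sample keys, and especially the threshold values are hypothetical placeholders, not VMware recommendations, and should be replaced with limits appropriate to your environment.

```python
# Hypothetical thresholds for illustration; tune these for your environment.
DEFAULT_THRESHOLDS = {
    "cpu_ready_pct": 5.0,       # CPU ready time above this suggests CPU contention
    "swap_rate_kbps": 0.0,      # any sustained swapping suggests a memory shortage
    "device_latency_ms": 20.0,  # device latency above this suggests an array problem
}


def triage(sample, thresholds=None):
    """Return the resource areas ("cpu", "memory", "storage") that look suspect.

    `sample` is a dict of already-collected metrics using hypothetical keys:
    cpu_ready_pct, swap_in_kbps, swap_out_kbps, device_latency_ms.
    Missing keys are treated as zero.
    """
    limits = dict(DEFAULT_THRESHOLDS)
    limits.update(thresholds or {})
    findings = []
    if sample.get("cpu_ready_pct", 0) > limits["cpu_ready_pct"]:
        findings.append("cpu")
    swap_rate = max(sample.get("swap_in_kbps", 0), sample.get("swap_out_kbps", 0))
    if swap_rate > limits["swap_rate_kbps"]:
        findings.append("memory")
    if sample.get("device_latency_ms", 0) > limits["device_latency_ms"]:
        findings.append("storage")
    return findings


if __name__ == "__main__":
    print(triage({"cpu_ready_pct": 12.0, "swap_in_kbps": 0,
                  "swap_out_kbps": 0, "device_latency_ms": 35.0}))
```

The output only narrows the search to a resource area; correlate it with the SQL Server counters collected inside the guest before drawing conclusions.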
7.
You can leverage vSphere to provide significant benefits in a virtualized SQL datacenter, including:
Increased operational flexibility and efficiency: Rapid software applications and services deployment
in shorter time frames.
Efficient change management: Increased productivity when testing the latest Windows and SQL
Server software patches and upgrades.
Minimized risk and enhanced IT service levels: Zero-downtime maintenance capabilities, rapid
recovery times for high availability, and streamlined disaster recovery across the datacenter.
7.1
vSphere vMotion technology enables the migration of virtual machines from one physical server to
another without service interruption. This migration allows you to move SQL Server virtual machines from
a heavily-loaded server to one that is lightly loaded, or to offload them to allow for hardware maintenance
without any downtime.
DRS takes the vSphere vMotion capability a step further by adding an intelligent scheduler. DRS allows
you to set resource assignment policies that reflect business needs. DRS performs the calculations and
automatically handles the details of physical resource assignments. It dynamically monitors the workload
of the running virtual machines and the resource utilization of the physical servers within a cluster.
vSphere vMotion and DRS perform best under the following conditions:
The source and target vSphere hosts must be connected to the same gigabit network and the same
shared storage.
The virtual machine must not use physical devices such as CD-ROM or floppy.
The source and destination hosts must have compatible CPU models, or migration with vSphere
vMotion will fail. For a listing of servers with compatible CPUs, consult vSphere vMotion compatibility
guides from specific hardware vendors.
To minimize network traffic it is best to keep virtual machines that communicate with each other
together on the same host machine.
Virtual machines with smaller memory sizes are better candidates for migration than larger ones.
VMware does not currently support vSphere vMotion or DRS load balancing with SQL Server
failover clusters. However, a cold migration is possible after the guest operating system is properly
shut down.
With vSphere HA, SQL Server virtual machines on a failed vSphere host can be restarted on another
vSphere host. This feature provides a cost-effective failover alternative to expensive third-party clustering
and replication solutions. If you use vSphere HA, be aware that:
vSphere HA handles vSphere host hardware failure but does not monitor the status of the SQL
Server services; these must be monitored separately.
Proper DNS hostname resolution is required for each vSphere host in a vSphere HA cluster.
vSphere HA heartbeat is sent over the vSphere service console network, so redundancy in this
network is recommended.
7.2 Templates
VMware template cloning can increase system administration and testing productivity in SQL Server
environments. A VMware template is a golden image of a virtual machine that can be used as a master
copy to create and provision new virtual machines. You can use virtual machine templates to create an
image of the operating system with service patches, significantly reducing the time needed to provision an
SQL Server.
7.3
VMware vFabric Data Director 2.5 extends support for SQL Server 2008 R2 and SQL Server 2012.
vFabric Data Director helps provision and administer databases more efficiently, more securely, and more
cost-effectively.
7.3.1 Standardization
One of the key objectives of vFabric Data Director is to bring order into the database management space.
By streamlining many of the repetitive tasks with easy to create templates, vFabric Data Director is able to
enforce predefined policies across the enterprise. This in turn allows IT to leverage the high
availability of the underlying virtualized infrastructure to scale dynamically and deliver higher SLAs.
By supporting common security and administration policies across databases, your new DBaaS can
leverage common policies across a farm of databases running a variety of operating systems. These
policies can enforce uniformity of frequent administration tasks across databases, providing compliance,
consistency, and security, and at a lower total cost.
7.3.2 Automation
vFabric Data Director automates administration and monitoring tasks such as creation, backup, recovery,
tuning, optimization, patching, and upgrading. Based on custom policies, vFabric Data Director enables
DBAs to greatly automate mundane database management tasks and focus on proactive maintenance
and dealing with abnormalities.
7.3.3 Self-Service
The basic requirement of anything as a service is the capability for users to self-provision necessary
resources. vFabric Data Director allows database consumers such as application developers, testers, and
architects to provision databases easily based on built-in or custom templates and under fine-grained
security constraints.
7.4
The VMware vCenter Operations Management Suite can provide a holistic approach to performance,
capacity, and configuration management. By using patented analytics, service levels can be monitored
and maintained proactively. vCenter Operations Management Suite supports a number of adapters, for
example, VMware vFabric Hyperic, Microsoft System Center Operations Manager, and General
SQLLoader Adapter for monitoring SQL Server performance. When performance or capacity problems
arise in your SQL Server environment, VMware vCenter Operations Manager is able to analyze metrics
from the application all the way through to the infrastructure to provide insight into problematic
components, whether it is compute (physical or virtual), storage, networking, operating system, or
application related. By establishing trends over time, vCenter Operations Manager can cut through the
noise of false alerts and proactively alert on the potential root cause of building performance problems
before end users are impacted. The following is an example showing vCenter Operations Management
Suite using a vFabric Hyperic adapter to display statistics from an SQL Server database.
In an SQL Server environment, constant monitoring is required to maintain acceptable service levels.
vCenter Operations Management Suite includes patented capacity analytics which can run through
what-if capacity scenarios to understand growth trends and identify upcoming compute power shortages or
over-provisioned resources. vCenter Operations Manager monitors configurations across virtual
machines and detects unwanted changes to help maintain continuous compliance with operational best
practices.
7.5
VMware vCenter Site Recovery Manager takes advantage of virtual machine encapsulation to make
testing and initiating DR failover a simple, integrated vCenter process. vCenter Site Recovery Manager
runs alongside VMware vCenter Server to provide planning, testing, and automated recovery in the
case of a disaster. By using VMware vSphere Replication or storage-based replication technology,
vCenter Site Recovery Manager eliminates the manual steps required during a failover scenario to
provide consistent and predictable results. At a high level, the steps that can be performed during a
failover test or an actual run are as follows:
Take and mount snapshot of recovery storage in read/write mode (test only).
SQL Server has a variety of options for disaster recovery, including AlwaysOn Availability Groups,
database mirroring, failover clustering, and log shipping. While all of these are good choices for SQL
Server recovery, the application-centric nature of these technologies might not be in line with a
company's disaster recovery plans. vCenter Site Recovery Manager is not a replacement for
application-aware clustering solutions that might be deployed within the guest operating system. vCenter Site
Recovery Manager provides integration of the storage replication solution, vSphere, and
customer-developed scripts to provide a simple, repeatable, and reportable process for disaster recovery of the
entire virtual environment, regardless of the application.
Figure 22. VMware vCenter Site Recovery Manager