for SQL Server 2012 Glenn Berry Glenn@SQLskills.com Introduction
Using a standardized naming scheme for disks and directories
Considering your workload type for storage Choosing the storage type Introduction to RAID RAID levels and SQL Server workloads Provisioning your logical drives Testing logical drive performance Using a Standard Naming Scheme for Disks
This makes it easier to work with multiple database servers
It is also very useful for database mirroring and AlwaysOn AGs This also makes it easier for restoring during disaster recovery Example naming scheme: Drive letter and directory Purpose L:\SQLLogs SQL Server log files K:\SQLLogs SQL Server log files M:\SQLBackups SQL Server backup files P:\SQLData SQL Server data files Q:\SQLData SQL Server data files R:\SQLData SQL Server data files S:\SQLData SQL Server data files T:\TempDB SQL Server tempdb files Considering Your Workload for Storage
SQL Server can have several different workload types
Three most common types: Online Transaction Processing (OLTP) Relational Data Warehouse (DW) Online Analytical Processing (OLAP) These workload types have different I/O access patterns OLTP workload has frequent writes to data files and log file Also has random reads from data files if database does not fit in memory Random I/O performance is very important DW workload has large sequential reads from data files Sequential I/O performance is very important OLAP workload has lots of random reads from cube files Random I/O performance is very important Additional Workload Considerations
You may have a mixed I/O workload for several reasons
If you have multiple databases on the same instance This complicates and randomizes the I/O workload If you have multiple databases with log files on the same LUN This will make the I/O workload on that LUN random instead of sequential If you will be using HA/DR features that read from the transaction log This will cause reads from the LUN where the log files are located Index creation and maintenance will cause sequential I/O pressure Reads and writes to data files, and writes to log file Database backups will cause sequential I/O pressure Reads from data files and log files, and writes to the backup file(s) Database restores will cause sequential I/O pressure Reads from backup file(s) and writes to data files and log file Choosing the Storage Type for SQL Server
Depends on server usage, performance requirements, budget
Existing infrastructure, employee skillset, and politics also matter Four main storage types Internal drives - traditional magnetic drives or solid state drives (SSDs) PCI-E storage cards Direct-attached storage (DAS) - traditional magnetic drives or SSDs Storage area networks (SAN) - traditional magnetic drives or SSDs Internal, DAS and SAN can use hybrid or tiered-storage Mixture of magnetic storage and SSD storage Good compromise between space, performance and cost Storage details can make a huge difference for I/O performance 10K drives versus 15K drives, 3Gbps SAS versus 6Gbps SAS Bandwidth of RAID controller, HBA or iSCSI NIC is very important Tips for Requesting Storage
Don’t just ask for storage based on space requirements
You will be more likely to get poor performance storage Specify your necessary performance requirements Sequential performance in MB/second or GB/second Random performance in input/output operations per second (IOPS) Specify your necessary redundancy requirements What will the storage be used for? Data files, log files, backup files, etc. Consider using “short-stroking” to improve I/O performance Intentionally using a smaller percentage of your available space You ask for much more space than you think you will need RAID Basics
Redundant array of inexpensive disks (RAID)
Standardized method of managing multiple drives with a controller Provides redundancy and higher performance than a single drive Allows higher capacity logical drives than is possible with one drive Hardware RAID controllers manage multiple drives Server RAID controllers have dedicated cache memory Cache can be used for reads or writes or both Several different RAID levels are commonly used RAID 1 RAID 5 RAID 50 RAID 10 RAID 1
RAID 1 is called mirroring
Requires two physical drives Data is copied to both drives Requires 50% storage space overhead Drive array can survive the loss of one drive You need to replace the failed drive and allow the RAID controller to automatically rebuild the mirror as soon as possible No performance impact after the loss of one drive Very common to install the OS to a RAID 1 volume on a server Usually done with two internal drives in the server This allows the server to operate normally after losing one drive RAID 5
RAID 5 is called striping with parity
Requires at least three physical drives Data is striped between all drives After data is written to all drives, parity information is calculated and then striped to all of the drives This causes a write performance penalty This allows the array to survive the loss of one drive in the array Performance is severely affected after the loss of one drive Failed drive must be replaced as soon as possible Requires 1/(the number of drives) as storage overhead RAID 5 is very popular with I.T. departments It is quite economical because of low storage overhead Risk of failure goes up as you add drives to the array RAID 50
RAID 50 is called striping across multiple RAID 5 data sets
Requires at least six physical drives Minimum of two, three-drive RAID 5 arrays Requires 1/(number of drives) in each RAID 5 array for storage overhead Can survive the loss of one drive in each RAID 5 array Performs better than RAID 5 after the loss of one drive Can be a good compromise between RAID 5 and RAID 10 Less expensive than RAID 10 More expensive than RAID 5, but provides better redundancy Not all RAID controllers support RAID 50 RAID 10
RAID 10 is called a striped set of mirrors
Data is mirrored and then striped Possible to survive the loss of more than one drive Requires a minimum of four physical drives Must be an even number of physical drives No write performance penalty Very well-suited to write intensive workloads Ideal for SQL Server log files Requires a 50% storage space overhead More expensive than RAID 5 RAID 10 is very popular with database administrators Provides better write performance and better redundancy than RAID 5 It is more expensive than RAID 5 Raid Level and SQL Server Workloads
The number of spindles in an array is extremely important
A larger number of smaller drives will perform much better than a small number of larger drives RAID 5 has a write performance handicap RAID 5 cannot survive the loss of more than one disk in an array RAID 5 arrays with larger numbers of disks are more likely to lose a disk Try to put infrequently accessed data on RAID 5 to save money RAID 10 and RAID 1 have very good write performance RAID 10 also has more redundancy than RAID 5 Always try to use RAID 1 or RAID 10 for log files As a DBA, don’t negotiate with yourself on storage Ask for RAID 10, and then negotiate down if necessary Use RAID 5 for data files and backup files if necessary Provisioning Your Logical Drives
Drive arrays must be created and presented to the host
This is done differently depending on the storage type Internal drives Internal PCI-E storage Direct attached storage (DAS) Storage area network (SAN) Windows Disk Manager is used to initialize and format drives You also assign drive letters and name the LUN Use your standardized naming scheme It is a good idea to also create the directories for SQL Server This will make the SQL Server installation go more quickly Testing Your Logical Drive Performance
CrystalDiskMark is a quick way to test drive performance
Enables you to test each logical drive in a few minutes Tests sequential and random I/O performance Use for first round of testing, before you use SQLIO SQLIO allows you to do much more detailed logical drive testing Does not require SQL Server to be installed Does not generate a database specific workload Can be much more time consuming to run comprehensive tests CrystalDiskMark 3.0.1 x64
Very easy to use, no complicated configuration required
You can choose the file size for the test runs 50MB, 100MB, 500MB, 1000MB, 2000MB, 4000MB You can choose the file type Random data or non-random data Some SSD controllers use compression for performance Random data is not very compressible You can choose the number of test runs (1-9) Quickly measures sequential and random I/O performance Sequential reads and writes in MB/second Large and small random reads and writes at different queue depths Measured in MB/sec and IOPS Free download http://bit.ly/TDoGOi CrystalDiskMark Example Output SQLIO Disk Benchmark
Despite the name, it has nothing to do with SQL Server
Free tool developed by Microsoft to evaluate I/O performance You can use it on any server running a recent version of Windows Command-line utility Requires some expertise to properly configure and run Can take a long time to run a comprehensive set of tests Allows you to test the limits of your I/O subsystem Measures IOPS Sequential throughput in MB/second Latency in milliseconds Download location http://bit.ly/QxwUV8 Summary
Use a standardized naming scheme for drives and directories
Makes it easier to work with multiple database servers Makes it easier to use HA/DR features like AlwaysOn AGs Consider your workload type when you think about storage Read versus write frequency Sequential versus random I/O performance Choose the appropriate RAID level for your logical drives This affects performance, redundancy and cost Make sure your drives are provisioned and tested before you install SQL Server 2012 Use CrystalDiskMark for the first round of I/O testing Use SQLIO for more comprehensive I/O testing What is Next?
Module 4 will cover actually installing SQL Server 2012
Selecting which features to install Specifying Service accounts for SQL Server Services Changing SQL Server Agent to automatic start Database Engine configuration tasks Specifying data directories for SQL Server