MySQL ZFS Best Practices
One of the cool things about talking about MySQL performance with ZFS is that
there is not much tuning to be done. Tuning with ZFS is considered evil, but a
necessity at times. In this blog I will describe some of the tunings that you
can apply to get better performance with ZFS, as well as point out performance
bugs which, when fixed, will nullify the need for some of these tunings.
For the impatient, here is the summary. See below for the reasoning behind
these recommendations and some gotchas.
1. Match ZFS recordsize with InnoDB page size (16KB for InnoDB datafiles, and
   128KB for InnoDB log files).
2. If you have a write-heavy workload, use a separate ZFS Intent Log.
3. If your database working set size does not fit in memory, you can get a big
   boost by using an SSD as L2ARC.
4. While using storage devices with battery-backed caches, or while comparing
   ZFS with other filesystems, turn off the cache flush.
5. Prefer to cache within MySQL/InnoDB over the ZFS Adaptive Replacement Cache
   (ARC).
6. Disable ZFS prefetch.
7. Disable the InnoDB doublewrite buffer.
Let's look at all of them in detail.
WHAT: Match ZFS recordsize with InnoDB page size (16KB for InnoDB datafiles,
and 128KB for InnoDB log files).

HOW: zfs set recordsize=16k tank/db
WHY: The biggest boost in performance can be obtained by matching the ZFS
record size with the size of the IO. Since an InnoDB page is 16KB in size, most
read IO is of size 16KB (except for some prefetch IO, which can get coalesced).
The default recordsize for ZFS is 128KB. The mismatch between the read size and
the ZFS recordsize can result in severely inflated IO. If you issue a 16KB read
and the data is not already in the ARC, you have to read 128KB of data to get
it. ZFS cannot do a small read because the checksum is calculated for the whole
block, and you have to read it all to verify data integrity. The other reason
to match the IO size and the ZFS recordsize is the read-modify-write penalty.
With a ZFS recordsize of 128KB, when InnoDB modifies a page and the ZFS record
is not already in memory, the record needs to be read in from disk and modified
before it is written back to disk. This increases the IO latency significantly.
Luckily, matching the ZFS recordsize with the IO size removes all the problems
mentioned above.
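To put rough numbers on the mismatch, here is a back-of-the-envelope sketch. It uses only the two sizes given in the text (16KB pages, 128KB default recordsize) and assumes the worst case, where the record is not in the ARC. It ignores metadata and compression, and the variable names are purely illustrative.

```shell
#!/bin/sh
# Back-of-the-envelope arithmetic, worst case (record not cached in the ARC).
page_kb=16           # InnoDB page size, from the text
recordsize_kb=128    # default ZFS recordsize, from the text

# Read path: a 16KB page miss forces a read of the whole record.
read_inflation=$((recordsize_kb / page_kb))

# Write path: modifying one page means reading the whole record,
# then writing the whole record back out.
rmw_io_kb=$((recordsize_kb + recordsize_kb))

# With recordsize=16k the page is the record, so no read is needed first.
matched_io_kb=$page_kb

echo "read inflation per page miss: ${read_inflation}x"
echo "disk IO to modify one page, 128KB records: ${rmw_io_kb}KB"
echo "disk IO to modify one page, 16KB records: ${matched_io_kb}KB"
```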
For the InnoDB log files, the writes are usually sequential and varying in
size. By using a ZFS recordsize of 128KB you amortize the cost of
read-modify-write.
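Since datafiles and log files want different record sizes, one way to arrange this is to give each its own dataset. The sketch below is a dry run that only prints the commands it would issue. The child dataset names (tank/db/data, tank/db/log) and the APPLY switch are assumptions made for illustration, and it presumes the parent dataset tank/db already exists.

```shell
#!/bin/sh
# Dry-run helper: prints each command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Create one dataset per IO pattern under a parent dataset.
# $1 = parent dataset (tank/db in the text); the child names are hypothetical.
create_innodb_datasets() {
    parent=$1
    run zfs create -o recordsize=16k "$parent/data"    # InnoDB datafiles
    run zfs create -o recordsize=128k "$parent/log"    # InnoDB log files
}

create_innodb_datasets tank/db
```

With a layout like this, MySQL would also need innodb_data_home_dir and innodb_log_group_home_dir pointed at the two mount points.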
NOTE: You need to set the recordsize before creating the database files. If you
have already created the files, you need to copy the files to get the new
recordsize. You can use the stat(1) command to check the recordsize (look for
"IO Block:").
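A minimal sketch of the check and the copy mentioned in the note, assuming GNU stat is available (its %o format prints the value that the default output labels "IO Block"). The function names and the example path are hypothetical.

```shell
#!/bin/sh
# Print the IO block size, in bytes, that the filesystem reports for a file.
# Assumes GNU stat; %o is the value the default output labels "IO Block".
io_block_bytes() {
    stat -c %o "$1"
}

# Rewrite a file by copying it and renaming the copy over the original, so
# that its blocks are allocated again under the dataset's current recordsize.
# Only do this with MySQL shut down, and with free space for a second copy.
rewrite_file() {
    cp -p "$1" "$1.new" && mv "$1.new" "$1"
}

# Example (hypothetical path to an InnoDB datafile):
#   rewrite_file /tank/db/ibdata1
#   io_block_bytes /tank/db/ibdata1
```

For a large datafile rewritten after `zfs set recordsize=16k`, the reported value would be expected to be 16384.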
WHAT: If you have a write-heavy workload, use a separate intent log (slog).
HOW: zpool add tank log c4t0d0 c4t1d0
WHAT: If your database working set size does not fit in memory, use an SSD as
L2ARC.

WHY: If your database does not fit in memory, every time you miss the database
cache, you have to read a block from disk. This cost is quite high with regular
disks. You can minimize the database cache miss latency by using one or more
SSDs as a level-2 cache, or L2ARC. Depending on your database working set size,
memory and L2ARC size, you may see several orders of magnitude improvement in
performance.
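For reference, L2ARC devices are attached to a pool with the cache keyword of zpool add. The sketch below only builds and prints that command line and executes nothing. The pool name tank is carried over from the earlier example, while the device name and the helper function are hypothetical.

```shell
#!/bin/sh
# Build the command that attaches SSDs to a pool as L2ARC (cache) devices.
# $1 = pool name, remaining arguments = device names. Nothing is executed.
l2arc_add_cmd() {
    pool=$1
    shift
    echo "zpool add $pool cache $*"
}

# Pool "tank" comes from the earlier example; the device name is hypothetical.
l2arc_add_cmd tank c4t2d0
```

After running the printed command, `zpool status tank` should list the device under a cache heading.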
Posted at 01:21PM May 26, 2009 by Neelakanth Nadgir in MySQL