Some time ago, we announced a new release model for MySQL. Like all new things, it had some initial hiccups (with MySQL 5.4 we were still getting acquainted with the new model), but now it seems to be in full swing. By the time you read these lines, MySQL 5.5 will be available. If the mirrors aren't seeded yet, the impatient can compile and use the new version from the Launchpad source tree.
Overview
What's this new release anyway? I'll leave it to Kaj's blog to give you the full description. Suffice it to say that this release is the second milestone of the current installment. It is of beta quality, and it will mature to RC quality. There will be yet another milestone before we release a GA in mid-2010.
One thing that this milestone shows is that there is no dead weight. If a feature doesn't make the deadline, i.e. it doesn't reach beta quality by the scheduled date, it is dropped, and possibly rescued at the next milestone.
With the introduction of the milestone model, we have also increased our internal QA, especially thanks to the Random Query Generator, which finds bugs in early stages of the code faster than any other method. (1)
Built-in InnoDB plugin
The InnoDB plugin 1.0.5 is included in the distribution, and, unlike MySQL 5.1, it's built-in. There is no need to load and register the plugin. The performance enhancements developed for MySQL 5.4 are now available together with the other enhancements that come with the InnoDB plugin. This was already the case in the previous milestone, but it's worth mentioning now, because not many people are aware of it.
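If you want to double-check which InnoDB you are running, a quick way (a minimal sketch; the output shown is only indicative) is to ask the server for the plugin version:
SHOW VARIABLES LIKE 'innodb_version';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| innodb_version | 1.0.5 |
+----------------+-------+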
Semi-synchronous replication
Of all the new features, this one is probably the most relevant. It is based on a patch made by Google to the InnoDB engine, and adapted by MySQL developers to make it engine-independent.
In short, it's a safety device that establishes some internal communication between master and slaves, and makes sure that at least one slave has received the changes being committed. That is, before committing, the master waits until at least one slave has acknowledged that it has received the portion of binary log necessary to reproduce the transaction.
UPDATE As Harrison Fisk rightly notes, there is a mistake in my description. The acknowledgment of the transaction being sent to the slave only happens after the master's commit to the binary log.
Some caveats apply:
- It's called semi-synchronous replication because it doesn't necessarily apply to all the slaves. Although you can manually check whether the replication has worked for all the slaves, it's enough for the master to make sure that at least one slave has got the goods.
- Received by a slave doesn't mean "executed". The slave acknowledges when it has got the binary log, even if the SQL thread is busy or stopped.
- If there is no slave that can acknowledge receipt (e.g. slaves are all down or stopped), then the master reverts to the normal asynchronous operation.
To use this feature, you need to install two plugins: one for the master and one for each slave. No need to compile anything, though. They are provided with the binaries. All you need to do is load the appropriate plugin for each server.
master > INSTALL PLUGIN rpl_semi_sync_master SONAME 'libsemisync_master.so';
slave1 > INSTALL PLUGIN rpl_semi_sync_slave SONAME 'libsemisync_slave.so';
Additionally, there are a few variables that you must set, either in the options file or online.
master > SET GLOBAL rpl_semi_sync_master_enabled=1;
slave1 > SET GLOBAL rpl_semi_sync_slave_enabled=1;
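To confirm that both sides are talking to each other semi-synchronously, you can also look at a couple of additional variables. Here is a quick sketch (the timeout is expressed in milliseconds, and 1000 is just an example value):
master > SHOW VARIABLES LIKE 'rpl_semi_sync%';
master > SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
master > SET GLOBAL rpl_semi_sync_master_timeout = 1000; # wait at most 1 second for an acknowledgment
The last variable controls how long the master waits for a slave acknowledgment before reverting to asynchronous replication.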
Now that the system is ready, let's see how to use it.
Before doing anything, we ask for the value of two important status variables:
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 0     |
+-----------------------------+-------+
The first one counts the transactions for which the synchronization failed; the second one counts the successful ones. Since nothing has happened so far, they are both zero.
create table t1 (i int not null primary key) engine=innodb;
Query OK, 0 rows affected (0.13 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 1     |
+-----------------------------+-------+
The first operation (a table creation) was successfully transferred to a slave. Let's do one more.
set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
insert into t1 values (1);
Query OK, 1 row affected (0.00 sec)
COMMIT;
Query OK, 0 rows affected (0.00 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 2     |
+-----------------------------+-------+
This one was successful as well.
Now, let's try something sneaky. On each slave, we execute "STOP SLAVE SQL_THREAD". With the SQL thread stopped, normal replication stops applying changes, but the I/O thread keeps receiving the binary log, so semi-synchronous replication will go on.
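For reference, here is the command we run on each slave before going back to the master:
slave1 > STOP SLAVE SQL_THREAD;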
insert into t1 values (2);
Query OK, 1 row affected (0.01 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 3     |
+-----------------------------+-------+
2 rows in set (0.00 sec)
The semi-synch replication has worked. However, if we query both master and slaves, only the master has the new record. The slaves have it only in their relay logs, which you can easily ascertain with mysqlbinlog.
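For example (a sketch; the relay log file name and path depend on your setup and are only placeholders here), you can find the current relay log on the slave and then inspect it from the command line:
slave1 > SHOW SLAVE STATUS\G
# look for the Relay_Log_File and Relay_Log_Pos fields, e.g. slave1-relay-bin.000002
$ mysqlbinlog /path/to/datadir/slave1-relay-bin.000002 | grep -i 'insert into t1'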
Enhanced partitioning syntax
About one year ago, I briefly announced that this feature was in the making. With some interface improvements, it is now part of the regular partitioning. It's an extension of partitioning BY RANGE. As you may know, until now you could only partition on one column value, and only on INTEGER columns. Both these restrictions were lifted in 5.5, with a syntax change that makes the code more readable and the overall feature more usable.
You can now partition by date, datetime, varchar, and char columns, not just integers, and you can use more than one column in your list. The most immediate use of this extension is the ability to use dates directly, without resorting to functions that convert them into integers. For example:
CREATE TABLE t2
(dt date, a int, b int, c int)
PARTITION BY RANGE COLUMNS (dt)
(
PARTITION p0 VALUES LESS THAN ('2007-01-01'),
PARTITION p1 VALUES LESS THAN ('2008-01-01'),
PARTITION p2 VALUES LESS THAN ('2009-01-01'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
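Since COLUMNS accepts a list, you can also partition on several columns at once. Here is a minimal sketch (table t3 and its columns are invented for illustration):
CREATE TABLE t3
(dt date, region char(2), a int)
PARTITION BY RANGE COLUMNS (dt, region)
(
PARTITION p0 VALUES LESS THAN ('2008-01-01', 'mm'),
PARTITION p1 VALUES LESS THAN ('2008-01-01', MAXVALUE),
PARTITION p2 VALUES LESS THAN ('2009-01-01', MAXVALUE),
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
The comparison is done tuple-wise: a row dated 2007 goes to p0 regardless of its region, while a row with dt equal to '2008-01-01' lands in p0 or p1 depending on whether its region sorts before 'mm'.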
The COLUMNS keyword does the trick. The manual has more examples.
The partition helper has been updated to handle this new feature and generate partitions accordingly.
SIGNAL and RESIGNAL
If you have used stored routines extensively, you will certainly have asked yourself "why isn't there any way of raising an exception?" In the SQL standard, exception handling is implemented using the SIGNAL and RESIGNAL keywords, which were notably missing in MySQL 5.0 and 5.1 stored routines.
There have been many clever hacks by several community members to emulate the missing SIGNAL, but none were quite satisfactory. After a long wait, we now have SIGNAL and RESIGNAL, which make stored routine programming much more robust and easier to debug. An authoritative example of how to use the new syntax is available in Roland Bouman's blog.
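Just to give a taste of the syntax, here is a minimal sketch (the procedure name, the SQLSTATE value, and the message are invented for illustration):
DELIMITER //
CREATE PROCEDURE check_positive (IN val INT)
BEGIN
    IF val <= 0 THEN
        -- raise a user-defined exception; '45000' is the generic user-defined SQLSTATE
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'value must be positive';
    END IF;
END //
DELIMITER ;
CALL check_positive(-1);
-- fails with something like: ERROR 1644 (45000): value must be positive
Within an error handler, RESIGNAL lets you re-raise the condition you just caught, optionally changing its attributes.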
There is more. For the complete list of features, have a look at the official manual.
Happy hacking!
UPDATE Added more partitions to the example, as suggested by Jon.
(1) For the more technologically savvy, here's how Philip Stoev, one of my distinguished QA colleagues, describes the enhancements:
Historically, most of the MySQL tests have been created manually; however, a modern database is so complex that it is impossible to test manually even a tiny percentage of the available functionality. Therefore, for Betony [codename for MySQL 5.5] and the upcoming Celosia [5.6], the majority of our testing effort was concentrated on stochastic testing, using random data and millions of random queries to validate the behavior of the server across a wide range of scenarios and workloads.
For each new feature, and some existing ones, we automatically generated tests that attempt to cover all relevant SQL constructs, including the interaction between the feature being tested and existing code within the server. For features that have concurrency implications, we ran the random queries as a stress test or along with concurrent DDL statements. For areas such as partitioning, we used the random queries to functionally validate the new code, by comparing the result from each query to a reference source, such as a previous version of the server.