The Data Charmer: March 2009

Tuesday, March 31, 2009

Query Analyzer features integrated in MySQL Sandbox

Community strikes.
It was about time that someone provided a match for MySQL Query Analyzer, the flagship feature of MySQL Enterprise.
Now MySQL Sandbox includes a --query_analyzer option, which will convert your sandboxed server into an analysis machine. No need to buy Enterprise. No need to waste tons of money on expensive consultants.
Just download the latest tarball and install a sandbox as usual, with the additional option.


$ make_sandbox 5.1.32 --query_analyzer

And that's it! Your worries will be over. Enjoy!

Update As it should be clear by now, this was an April's fool joke. See the follow up post for more info.

Monday, March 30, 2009

New version of employees test DB

The Employees Test database has been updated. There was a subtle bug in the data. One employee was assigned to two departments with the same start and end date. And one of the sample procedures fell into the trap of assuming that the data was clean, thus reporting incorrect statistics.
Now the bug is fixed, the test suite is updated, and I can wait for the next bug report.

Sunday, March 29, 2009

MySQL Sandbox approaching version 3

MySQL Sandbox is approaching another milestone. Preparing for version 3.0, which will include some new features, version 2.0.98 (based on 2.0.18) can be installed like any other Perl module, and the scripts become available in the PATH.
It means a few seconds more to install the scripts, but a faster and easier usage.

This latest change also means that MySQL Sandbox will be available through the CPAN (Comprehensive Perl Archive Network, for the uninitiated).
The next step will be to include this package in Linux distros (Ubuntu, and eventually Debian, and then Fedora).
If you use the new package, the installation goes like this:

The hard way

Download and unpack MySQL Sandbox, then issue these commands:

$ perl Makefile.PL
$ make
$ make test
$ sudo make install

If you want to install the module in your home directory, say under $HOME/usr/local (a good idea, since MySQL Sandbox was designed for the purpose of installing servers in user space), then you can do the following:

$ export PATH=$PATH:$HOME/usr/local
$ export PERL5LIB=$HOME/usr/local/lib/perl5/site_perl/5.8.8
# change the perl version accordingly
$ perl Makefile.PL PREFIX=$HOME/usr/local
$ make
$ make test
$ sudo make install

The easy way

$ sudo cpan
cpan > install MySQL::Sandbox

Usage

Either way, the Sandbox scripts will now be in your PATH, and you can invoke the scripts quite easily:

$ make_sandbox /path/to/mysql-tarball-5.1.32-YOUR-OS.tar.gz

There is one (very small) incompatible change. Previously, you could get debugging output by setting the DEBUG environmental variable. Now you will have to set SBDEBUG instead.
Enjoy!

Thursday, March 26, 2009

Forums are for sissies. The next thing is Twitter

Once upon a time, if you had a problem with, say, Perl, you went to a forum, checked the forum rules, signed up, and asked a question, which eventually would get you an answer. Then you had a problem with MySQl, and you went to another forum, and asked a different question.

A forum for each topic is tiresome. Someone made an improvement, and then you have forums where you can ask pretty much anything.
But also that is not as general purpose as the concept of LazyWeb. When you need help, you just want to stand up and ask.
Here comes Twitter.
No additional subscriptions needed. Just the Swiss Army knife of the web questions. And you are not limited to geeky questions, you can ask anything.
Naive Twitter users may think that when you ask a question, only your followers can hear you. So if you have 50 followers, you don't stand a chance of getting an answer. Not so. Try this. Using Twirl (but also the simple twitter search would do), you create a new search, "activate it", and add it to your "home". Presto! you will be now receiving twitters that match your search, no matter if they are followers or not. The whole world is at your disposal.

You may be the one asking, and be surprised when you get back an answer in minutes from a complete stranger, or you may know the answer for one of the "lazyweb" questions, and feel compelled to respond.
Well, maybe the forums aren't dead yet, but surely they have a fierce rival.

Another command line tip

Encouraged by Baron Schwartz tip on result set comparison, here are a few more, on the same vein.
First, you can send a result set to a file. Probably you will say "yeah, I know, using SELECT INTO OUTFILE". Correct. Except that you can't rewrite to an existing file, if you want to, and you will get a raw output, not the well formatted one that you usually see on the command line. For example:


mysql > select 1 into outfile '/tmp/f1.txt';
mysql > \! cat /tmp/f1.txt
1

mysql > select 1 into outfile '/tmp/f1.txt';
ERROR 1086 (HY000): File '/tmp/f1.txt' already exists

BTW, \! command is a handy shortcut for executing a shell command.
Let's see what happens with the alternative method:


mysql > pager cat > /tmp/f1.txt
mysql > select 1;
\! cat /tmp/f1.txt
+---+
| 1 |
+---+
| 1 | 
+---+

Now, Using the above trick, you can check the differences between two result sets visually:


mysql > pager cat > /tmp/f1.txt 
mysql > select "one" union select "two" union select "three";
3 rows in set (0.00 sec)

mysql > pager cat > /tmp/f2.txt 

mysql > select "one" union select "two" union select "Three";
3 rows in set (0.00 sec)

mysql > nopager

mysql > \! vimdiff -o  /tmp/f[12].txt

And here is what you get:

Not only you'll know that there is something different, but you will also know exactly what.
This trick is part of a collection of command line advice that I have meant to write for long time. I will publish it all before the Users Conference. I just couldn't resist with this one!

Wednesday, March 25, 2009

MySQL 5.x performance with logging

There has been much talking about MySQL performance related to logging. Since MySQL 5.1.21, when Bug #30414 was reported (Slowdown (related to logging) in 5.1.21 vs. 5.1.20) I have been monitoring the performance of the server, both on 5.0 and 5.1.
Recently, I got a very powerful server, which makes these measurements meaningful.
Thus, I measured the performance of the server, using all publicly available sources, because I want this benchmark to be repeatable by everyone.
I will first describe the method used for the benchmarks, and then I report the results.

The server

The server is a Linux Red Hat Enterprise 5.2, running on a 8core processor, with 32 GB RAM and 1.5 TB storage.


$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.2 (Tikanga)

$ cat /proc/cpuinfo |grep "processor\|model name" | sort |uniq
model name : Intel(R) Xeon(R) CPU           E5450  @ 3.00GHz
processor : 0
processor : 1
processor : 2
processor : 3
processor : 4
processor : 5
processor : 6
processor : 7

cat /proc/meminfo
MemTotal:     32967056 kB
MemFree:      22790272 kB

Method

I downloaded the source code tarball, as released for every version of MySQL, from 5.0.45 to 5.0.77 and from 5.1.20 to 5.1.33. I also took the code for 5.0.79 from the Bazaar tree on Launchpad.
For each version, I did the following:
* expanded the tarball to a directory;
* compiled the code using ./BUILD/compile-pentium64-max;
* built a binary using ./source/make_binary_distribution;
* installed the binary using MySQL Sandbox, assigning 5GB of RAM to innodb_buffer_pool_size;
* Tun the sysbench OLTP transactional test, 8 threads, 1,000,000 records, 1 minute run;

After installing, all database instances where shut down, and only one database server was active at any given time during the tests, to have clean results.
The servers were cleaned up, with all databases and log files removed between tests.
Additionally, all operating system memory cache was erased between tests.

The sysbench commands used for the tests were the following:

sysbench --test=oltp --oltp-table-size=1000000 \
  --mysql-db=test --mysql-user=msandbox \
  --mysql-password=msandbox --mysql-host=127.0.0.1 \
  --mysql-port=$PORT --num-threads=8 prepare

sysbench --test=oltp --oltp-table-size=1000000 \
  --mysql-db=test --mysql-user=msandbox \
  --mysql-password=msandbox --mysql-host=127.0.0.1 \
  --mysql-port=$PORT --max-time=60 --oltp-read-only=on \
  --max-requests=0 --num-threads=8 run

sysbench --test=oltp --oltp-table-size=1000000 \
  --mysql-db=test --mysql-user=msandbox  \
  --mysql-password=msandbox --mysql-host=127.0.0.1  \
  --mysql-port=$PORT --max-time=60 --oltp-read-only=off \
  --max-requests=0 --num-threads=8 run

The not so exciting results first

Using Sysbench with --oltp-read-only=on, MySQL 5.0 outperforms MySQL 5.1 constantly. In a read-only situation, it looks like 5.0 is much better than 5.1.

Additional bad news is that Bug #30414 is still unfixed. Table logging takes away 50% of performance and thus is not recommended. General logs on file, instead, with the ability of logging on demand, is an affordable diagnostic tool.
Partial good news can be deducted by the ability of MySQL 5.1 to perform better with file logging, compared to MySQL 5.0.

The good news

Normal operations in a MySQL server are not read only, and include a binary log. For my second batch of testing, then, I compared performance with --oltp-read-only=off with and without binary logs. In this situation, MySQL 5.1 outperforms MySQL 5.0 in all cases. It's also nice to observe that the performance is improving gradually from earlier versions to recent ones. Kudos to MySQL engineers for their constant dedication to the server improvement.

The even better news is that, in MySQL 5.1, performance gain on 5.0.x has been increasing steadily.

What does this tell you? That MySQL 5.1 is much, much better than 5.0 for online transaction processing, while MySQL 5.0 seems to be better at concurrent read-only operations. My immediate reaction is that I should use MySQL 5.1 for a master and 5.0 for slaves. However, there are two considerations that stop me. First, using a master of a higher version than the slaves is not recommended, although some tests that I made after this discovery show that you can actually do it, provided that you don't use any specific MySQL 5.1 features. And second, the improvement path is such, that I believe MySQL 5.1 is going to catch up on 5.0 for read-only operations. As soon as this happens, I will let you know immediately! Probably, we will know more at the Users Conference 2009.

Summing up

I think that MySQL 5.1 has much to offer in comparison to 5.0.
The tests that I have performed are not definite. Sysbench is just one of the possible tests, and it doesn't mean that your production server will follow the same pattern or have the same performance. It's just an independent way of measuring the server performance, and especially is a repeatable way, which everyone can reproduce.
I warmly invite you to repeat these tests on different machines and let me know your results.

Tuesday, March 24, 2009

Another usability bug bites the dust

In MySQL 5.1.33 there is a fix for an apparently innocuous bug.
Bug #36540 CREATE EVENT and ALTER EVENT statements fail with large server_id.
This is a usability bug, that makes the DBA life unnecessarily hard. The reason for having a large server_id is because a DBA might want to use the IP address as server ID, to make sure that there are unique IDs, and to have an easy way of identifying the server through the IP.
All is well until you mix the server_id assignment with event creation:


 select version();
+-----------+
| version() |
+-----------+
| 5.1.32    | 
+-----------+
1 row in set (0.00 sec)

 set global server_id =inet_aton('192.168.2.55');
Query OK, 0 rows affected (0.00 sec)

 select @@server_id;
+-------------+
| @@server_id |
+-------------+
|  3232236087 | 
+-------------+
1 row in set (0.00 sec)

 create event e1 on schedule at now() + interval 60 second do set @a = 1;
ERROR 1538 (HY000): Failed to store event body. Error code 1 from storage engine.

Oops! the event table has a problem. The "originator" column is designed as a signed INT (2^31 -1), while the server_ID can take values up to 2^32 - 1.
In the latest version, soon to hit the market, the problem is solved.


select version();
+-----------+
| version() |
+-----------+
| 5.1.33    | 
+-----------+
1 row in set (0.01 sec)

set global server_id =inet_aton('192.168.2.55');
Query OK, 0 rows affected (0.00 sec)

select @@server_id;
+-------------+
| @@server_id |
+-------------+
|  3232236087 | 
+-------------+
1 row in set (0.00 sec)

create event e1 on schedule at now() + interval 60 second do set @a = 1;
Query OK, 0 rows affected (0.00 sec)

Something to know about the event scheduler and replication

MySQL 5.1 has been GA for 4 months now, and I am sure that many people have been using the event scheduler.
There is something that you must know if you are using the event scheduler in a replicated environment.
The important thing to know is that, when you use the events in replication, by default the event is active on the master only. The event creation is replicated, but the event on the slaves is not active

The reference manual explains it in detail.
There are two things that you must remember, if you are going to promote a slave to master:

The event_scheduler variable is not replicated. If you set it on the master, the slaves don't know anything about that.
The event creation is replicated, but disabled. Even if you set the event_scheduler variable on the slave, the events won't start.

An example will clarify it. Let's create an event on the master:

$ ./m -e "create event e1 on schedule at now() + interval 1 minute do create table et1 (i int);"
$ ./m -e "set global event_scheduler=1";

And then request the status of the event scheduler.

$ ./m -e 'select @@event_scheduler';
+-------------------+
| @@event_scheduler |
+-------------------+
| ON                |
+-------------------+
./m -e 'show events from test\G'
*************************** 1. row ***************************
                 Db: test
               Name: e1
            Definer: msandbox@%
          Time zone: SYSTEM
               Type: ONE TIME
         Execute at: 2009-03-24 00:25:41
     Interval value: NULL
     Interval field: NULL
             Starts: NULL
               Ends: NULL
             Status: ENABLED
         Originator: 1
character_set_client: latin1
collation_connection: latin1_swedish_ci
 Database Collation: latin1_swedish_ci

Everything seems to be in order on the master. Let's see the situation on the slave.

./s1 -e 'select @@event_scheduler';
+-------------------+
| @@event_scheduler |
+-------------------+
| OFF               |
+-------------------+

That was to be expected. We know that SET commands are not replicated. What about the event creation? This should be replicated, and indeed it is:

./s1 -e 'show events from test\G'
*************************** 1. row ***************************
                 Db: test
               Name: e1
            Definer: @
          Time zone: SYSTEM
               Type: ONE TIME
         Execute at: 2009-03-24 00:25:41
     Interval value: NULL
     Interval field: NULL
             Starts: NULL
               Ends: NULL
             Status: SLAVESIDE_DISABLED
         Originator: 1
character_set_client: latin1
collation_connection: latin1_swedish_ci
 Database Collation: latin1_swedish_ci

It's easy to check that the event effects are replicated. After one minute, we list the tables on both master and slaves, and table "et1" was created.

./m -e 'show tables from test'
+----------------+
| Tables_in_test |
+----------------+
| et1            | 
+----------------+

./s1 -e 'show tables from test'
+----------------+
| Tables_in_test |
+----------------+
| et1            | 
+----------------+

But here's the tricky part. The event is created, but disabled. This is the desired behavior, because the effects of the events are replicated and we don't want the same operation executed twice on the slaves.
The only drawback of this situation is that you need to change "SLAVESIDE_DISABLED" into ENABLED when you promote a slave to master. You need to act with administrator powers, because a ALTER EVENT won't achieve the desired result. You need, instead, to update the mysql.event table, as explained in the manual.
It is not a big deal, but you must remember the above two issues when promoting a slave.

Monday, March 23, 2009

My favorite tutorial at the UC2009 : Build and release management

I am looking forward to the MySQL Users Conference and Expo 2009. Since I am a tutorial speaker, my choice of tutorials to attend is limited. Upon completion of my duties, I will attend Greg Haase's tutorial on Build and Release Management for Database Engineers.
There are many reasons for that. For starting, Greg is the winner of the MySQl 5.1 Use Case competition where he has shown his DBA skills, and then, he is using the MySQL Sandbox among the tools of the trade recommended in hist tutorial. So you know that I haven't much choice but to attend it!

Easy server testing with MySQL Sandbox

MySQL Sandbox 2.0.18 introduces a new feature, changing port. You can now change the listening port for a sandboxed server, either as a standalone operation, or while moving it, using the sbtool.
There is a feature in the Sandbox, introduced in 2.0.13, that makes really easy to test servers in special conditions. If you need to start or restart a server using an option that you know you will need only for the next test, you can add the option to the command line invocation of the start or restart scripts.

$ ./start --key-buffer=2G
$ ./use -e "show variables like 'key_buffer_size'"
+-----------------+------------+
| Variable_name   | Value      |
+-----------------+------------+
| key_buffer_size | 2147483648 | 
+-----------------+------------+
./use -e "show variables like 'innodb_buffer_pool_size'"
+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| innodb_buffer_pool_size | 8388608 | 
+-------------------------+---------+

$ ./restart --innodb-buffer-pool_size=2G
./use -e "show variables like 'innodb_buffer_pool_size'"
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 2147483648 | 
+-------------------------+------------+

Some more news. There is a mailing list to discuss the Sandbox development, and new members are welcome. The Sandbox is also awaiting volunteer students for Google Summer of Code.

Monday, March 16, 2009

Free Webinar: How to give a good presentation

Sheeri K. Cabral, one of the most active speakers in the MySQL community, will give a presentation on the art of making a presentation.
Tuesday March 17th at 10am PDT. (= 1pm EDT, 5pm UTC, 6pm CET)
The event is dedicated to the presenters at the MySQL Users Conference, who should have received an invitation. If you were not invited and wish to attend, see the information below:

Event address for attendees: click here
Date and time: Tuesday, March 17, 2009 10:00 am
Pacific Daylight Time (GMT -07:00, San Francisco)

Tuesday, March 17, 2009 1:00 pm
Eastern Daylight Time (GMT -04:00, New York)

Tuesday, March 17, 2009 5:00 pm
GMT Standard Time (GMT 00:00, London)
Duration: 1 hour

Event number: 590 329 510
Event password: Attendee1

There is voice streaming. In case it doesn't work, you can dial one of the numbers below:
Teleconference: (US & Canada): 866-224-2652
International: (281) 913-0892
Code: 474700 #

You may log in to WebEx 15 minutes prior to the meeting start time, this is advised as there may be downloads that WebEx might need to perform on your system
Copy and paste the attendee link in to your web browser (this seems to work better than clicking on the link)
PASSWORDS FOR WEBEX ARE CASE SENSITIVE

MySQL Sandbox and Summer of Code 2009

As you may have seen, MySQL Sandbox is improving, but there are still many features to be implemented.
If you know Perl - or you want to convert the Sandbox into another portable language - and you want to help with the new development, you may consider getting into Google Summer of Code under MySQL mentorship. To apply, you must be a student, and accept the program terms.

sbtool - a new advanced entry in the MySQL Sandbox toolkit

MySQL Sandbox 2.0.17 introduces a new item in the toolkit. sbtool is a multi-purpose tool for advanced tasks with MySQL sandboxes.
Most notables: assigned and used port detection, port range detection, replication tree, copying and moving sandboxes.

Overview

./sbtool --help
usage: ./sbtool [options]
 -o     --operation       (s) <> - what task to perform
   'info'     returns configuration options from a Sandbox
   'copy'     copies data from one Sandbox to another
   'ports'    lists ports used by the Sandbox
   'tree'     creates a replication tree
   'move'     moves a Sandbox to a different location
   'range'    finds N consecutive ports not yet used by the Sandbox
 -s     --source          (s) <> - source directory for move,copy
 -d     --dest            (s) <> - destination directory for move,copy
 -u     --only_used       (-) <> - for "ports" operation, shows only the used ones
 -i     --min_range       (i) <5000> - minimum port when searching for available ranges
 -x     --max_range       (i) <32000> - maximum port when searching for available ranges
 -z     --range_size      (i) <10> - size of range when searching for available port range
 -f     --format          (s)  - format for "ports" and "info"
   'perl'     fully structured information in Perl code
   'text'     plain text dump of requested information
 -p     --search_path     (s)  - search path for ports and info
 -a     --all_info        (-) <> - print more info for "ports" operation
        --tree_nodes      (s) <> - description of the tree (x-x x x-x x|x x x|x x)
        --mid_nodes       (s) <> - description of the middle nodes (x x x)
        --leaf_nodes      (s) <> - description of the leaf nodes (x x|x x x|x x)
        --tree_dir        (s) <> - which directory contains the tree nodes
 -v     --verbose         (-) <> - prints more info on some operations
 -h     --help            (-) <> - this screen

The list of features may look intimidating. The important ones are the values that you can give to the --operation modifier. The rest is a list of ancillary options to be used with the possible operations.

Port detection

The most straightforward operation is 'ports', which returns the list of ports that have been assigned to sandboxes installed under $SANDBOX_HOME (default $HOME/sandboxes).


$ ./sbtool -o ports
5077  0
5080  0
5130  0
5131  1
5133  1
5333  0
6080  0

The ports followed by a "0" have been assigned but are not currently in use. The ones followed by a "1" are in use. If you need only the latter, it's easy.


$ ./sbtool -o ports --only_used
5131  1
5133  1

If you want to use this information from a Perl script, there is an appropriate modifier.

$ ./sbtool -o ports --only_used --format=perl
$ports = {
          '5133' => 1,
          '5131' => 1
        };

Speaking about ports, if you want to make a custom installation of one or more sandboxes for which you need a range of ports, sbtool can help. Let's assume that you need at least 20 ports, not lower than 5130.

$ ./sbtool -o range --min_range=5130 --range_size=20
5134 -  5154

Moving a sandbox

Moving a sandbox has been in the wish list for long time. Finally, there is a clean way of moving a sandbox to an arbitrary location, keeping its functionality intact.
For example, let's move msb_5_1_32 to /tmp/xyz, and see if it works.

$ ./sbtool -o move  --source_dir=$HOME/sandboxes/msb_5_1_31 --dest_dir=/tmp/xyz
the old scripts have been saved as filename.bak

$ /tmp/xyz/start
. sandbox server started

$ /tmp/xyz/use
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.31 MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql [localhost] {msandbox} ((none)) > exit
Bye

# Now we move the sandbox back to its previous location

$ ./sbtool -o move  --dest_dir=$HOME/sandboxes/msb_5_1_31 --source_dir=/tmp/xyz
the old scripts have been saved as filename.bak

The sbtool script knows when a sandbox is in use, and stops it before starting the moving process. After the move, all original scripts are saved as ".bak", and the current scripts are updated with the new location.

Copying a sandbox

Copying a sandbox meets a different need. You want mostly to copy the data from one server to another. You don't want to use replication, especially if it is a lot of data. sbtool can copy the data from one sandbox to another, stopping the source and destination when necessary. It will only copy across compatible versions (5.1 to 5.1 or newer, but not 5.0 to 5.1 or vice versa).
When copying a data directory, sbtool skips relay-log.info, the binary logs, and the relay logs.
Both the source and destination directories must be sandboxes, or the operation is refused.

$ ./sbtool -o copy -s $HOME/sandboxes/msb_5_1_31/ -d $HOME/sandboxes/msb_5_1_33/

Replication trees

This option is very much required for testing.
You want to have one master, two or more relay slaves, and several final slaves.
For example. let's assume that you need this schema:

             1  
            |  
     +------+-----+-----------+
     2            3           4                   
     |            |           |  
 +---+---+    +---+---+   +---+---+
 |   |   |    |       |   |   |   |  
 5   6   7    8       9   10  11  12

The way to go is to create a multi_sandbox with 12 nodes, and then using sbtool to make the tree.

$ make_multiple_sandbox --how_many_nodes=12 5.0.77
...
group directory installed on $HOME/sandboxes/multi_msb_5_0_77

$ ./sbtool -o tree --mid_nodes="2 3 4" --leaf_nodes="5 6 7|8 9|10 11 12" --tree_dir=$HOME/sandboxes/multi_msb_5_0_77
node 1 is master
. sandbox server started
. sandbox server started
enabling node 2 to relay updates
. sandbox server started
   node 2 is slave of node 1
. sandbox server started
. sandbox server started
       node 5 is slave of node 2
. sandbox server started
. sandbox server started
       node 6 is slave of node 2
. sandbox server started
. sandbox server started
       node 7 is slave of node 2
. sandbox server started
enabling node 3 to relay updates
. sandbox server started
   node 3 is slave of node 1
. sandbox server started
. sandbox server started
       node 8 is slave of node 3
. sandbox server started
. sandbox server started
       node 9 is slave of node 3
. sandbox server started
enabling node 4 to relay updates
. sandbox server started
   node 4 is slave of node 1
. sandbox server started
. sandbox server started
       node 10 is slave of node 4
. sandbox server started
. sandbox server started
       node 11 is slave of node 4
. sandbox server started
. sandbox server started
       node 12 is slave of node 4

Let's test it:

gmax@gmac3[sandboxes]$ multi_msb_5_0_77/n1 -e "create table test.t1(i int); insert into test.t1 values (@@server_id)"
gmax@gmac3[sandboxes]$ multi_msb_5_0_77/use_all "select * from test.t1"
# server: 1:
i
101
# server: 2:
i
102
# server: 3:
i
103
# server: 4:
i
104
# server: 5:
i
105
# server: 6:
i
106
# server: 7:
i
107
# server: 8:
i
108
# server: 9:
i
109
# server: 10:
i
110
# server: 11:
i
111
# server: 12:
i
112

$ multi_msb_5_0_77/n1 -e "show slave hosts"
+-----------+-------+-------+-------------------+-----------+
| Server_id | Host  | Port  | Rpl_recovery_rank | Master_id |
+-----------+-------+-------+-------------------+-----------+
|       102 | node2 | 19778 |                 0 |       101 |
|       104 | node4 | 19778 |                 0 |       101 |
|       103 | node3 | 19778 |                 0 |       101 |
+-----------+-------+-------+-------------------+-----------+

$ multi_msb_5_0_77/n2 -e "show slave hosts"
+-----------+-------+-------+-------------------+-----------+
| Server_id | Host  | Port  | Rpl_recovery_rank | Master_id |
+-----------+-------+-------+-------------------+-----------+
|       105 | node5 | 19779 |                 0 |       102 |
|       106 | node6 | 19779 |                 0 |       102 |
|       107 | node7 | 19779 |                 0 |       102 |
+-----------+-------+-------+-------------------+-----------+

$ multi_msb_5_0_77/n3 -e "show slave hosts"
+-----------+-------+-------+-------------------+-----------+
| Server_id | Host  | Port  | Rpl_recovery_rank | Master_id |
+-----------+-------+-------+-------------------+-----------+
|       109 | node9 | 19780 |                 0 |       103 |
|       108 | node8 | 19780 |                 0 |       103 |
+-----------+-------+-------+-------------------+-----------+

So, this is it. Now the code needs some testing. Please comment on this blog, and report bugs at the MySQL Sandbox home.

Thursday, March 12, 2009

Normalization and smoking

An interesting question came yesterday at the end of the MySQL workshop in Athens. An user has a server with about 40,000 tables, and the performance is not brilliant (oh really?). The reason is that there is an application that requires one or more new tables for any user, and the tables are of about ten different structures.
The user asked the wrong question: "will the performance improve if I change the storage engine from InnoDB to MyISAM?"

Salle gave the first answer. Converting the tables to MyISAM would only exacerbate the problem. Instead of having one file per table, the user would have 3 per table. Any operating system would choke with such a number of files concentrated in a handful of directories.
My answer focused on normalization. I explained that, if 10 different table structures are used, then the user should create only 10 tables, add a column for user ID, and export the 40,000 tables into the 10 tables. When I explained that this reorganization would improve performance a lot, but it would require changing the application, the user tried to protest, asking for some silver bullet solution. Then I delivered the punch line: "Your business is growing, and as it grows, you are adding new table for every new user. If you continue like this, there will be so many tables that your operating system will collapse. Your database is doomed. It's like if your doctor tells you that if you don't quit smoking, you will die in six months. The same for your database. If you don't quit creating tables, your database will die unpleasantly. Quit smoking, now!"
That made quite an impression on the audience!

Taxi rides and packed rooms at Athens Open Source Conference

I arrived in Athens shortly before midnight. An unusual sight. Most of the taxi drivers in the queue are female. My driver is a middle aged woman wearing a torn leather jacket and with no understanding of English beyond numbers. She said "yes. Liberian hotel", and she adds the name of a mysterious location that I don't understand. I show her the hotel name and address, written in both Latin and Greek alphabet. "Yes. Iberian hotel. I know where." "Not Iberian - I say patiently, pointing at the hotel name on the paper - Imperial.
She says yes, and off we go.
After a minute she fishes a cell phone from her jacket, and asks something. It's a flurry of incomprehensible Greek, but the hotel name I hear very clearly "yadda yadda yadda Liberia hotel yadda yadda". I lean forward and I repeat "Not Liberia. Imperial". "yes, she says. Iberia".
I tell her to stop the taxi. I show her again the name of the hotel, "Imperial," and the place where it is located, "Karaiskaki". She nods vigorously and we go again. After a while, she's back on the phone, asking about "karaiskaki". According to my Athenian colleague who booked the hotel, everybody knows about this place. After yadda-yaddaying more on the phone, she gives the phone to me. A young female voice in tentative English asks for my destination. I repeat. She starts giving me directions. I remind her that I am in the back seat and she should better give these directions to the driver. I Give back the phone to the driver, and she talks for a while more. We keep going on the main highway. No mistakes so far. It's a straight drive.
After a few minutes, in the same highway, she calls again, "Karaiskaki" is mentioned several times. She gives me the phone again. The distant woman asks where I am coming from. I snap that where I come from is not of big significance, and the matter of where I am going to end up has more practical and immediate value to me.
The driver seems more confident now. We take an exit from the highway and she calls again. "Liberia hotel" and "karaiskaki" are definitely the topic of the conversation. I sigh and give up. If we arrive to Karaiskaki, the Imperial hotel should be there.
At last, we stop in a large square. The drive says "Iberia hotel". I get out, look up at the large building in front of me. It says "Imperial hotel", and I ask for a receipt. The taxi meter reads 31 EUR, but she asks for 50, saying something that sounds like "extra charge". I refuse, and offer 30. She says 50 again, and we start to bargain. After a few minutes, we settle for 40. Welcome to Athens.
The next morning, we start late. My session is delayed, but I managed to tell the MySQL story in front of an audience of about 150 people. They are a bit shy about asking questions, so I show them my MySQL 5.1. GA T-shirt, noting that the names on the back are the distribution list. If they see someone with that T-shirt, they can ask the hard questions. Both I and Salle are mentioned there, and we will welcome questions, both hard and soft, during the afternoon workshop.
In the afternoon, the workshop room was prepared with about 30 seats. It is a parallel workshop, so the intention is that the few people interested in MySQL should attend the workshop, while the majority will stay in the main event. We start the workshop with about 20 people in the room, but after about five minutes, the room is invaded by the more attendees who realized late that the workshop was not in the large room. We get more than half the audience, about 70 people, packed in the workshop room. And so we start the show. We ask them what they want to hear, and this time, in the cozy illusion of comfort and camaraderie created by the small room, now crowded to capacity, the questions flow easily and friendly.
Between me and Salle, we get quickly rid of the hard questions "is MySQL Cluster the right choice for my business?" (no, unless you have a Telco-like database). "Does MySQL support load balancing?" (which leads to an explanation of what replication works, with demos. Then we dispatch the myth of Circular replication as a silver bullet scalability option, and explain how DRBD works. The scalability options are taken all into account, including the manual ones and the mythical transparent ones. We explain how MySQL Proxy works, its strength and weaknesses, and how they can play with it. A mandatory demo of how Proxy makes coffee is well received. The last topic is partitioning, and it is demonstrated with a connection to one of the community team test servers in a Norway server farm. After I show that a query that takes 30 seconds in a 741 million records non-partitioned table is reduced to 5 seconds in a table with the same records but partitioned by range, they are believers. The final applause is very much genuine and full of gratitude.

When I and Salle finally go out for a walk to the Acropolis, we feel like we have definitely been helpful.
The Acropolis was closed (business hours from 8:30 to 15 only), but, hey! You can't have it all! So we ended the day in a friendly restaurant near the archaeological sites.
All in all, it was a good experience. Athens deserves another visit.

Sunday, March 08, 2009

A deceiving error message while setting a replication slave

I was setting up a slave manually for a quick-and-dirty experiment, when I found this one:


mysql  %gt; slave start;
ERROR 1200 (HY000): The server is not configured as slave; fix in config file or with CHANGE MASTER TO

Hmm... I did the CHANGE MASTER TO already, but just for the sake of it, let's do it again.


mysql %gt; change master to master_host='127.0.0.1', master_port=22346, 
master_user='msandbox', master_password='msandbox', 
master_log_file='mysql-bin.000002', master_log_pos=106;
Query OK, 0 rows affected (0.00 sec)

mysql  %gt; slave start;
ERROR 1200 (HY000): The server is not configured as slave; fix in config file or with CHANGE MASTER TO

Again! Why is it asking to execute CHANGE MASTER TO, which has been accepted right now?

Then it came to me. Because I did a manual configuration, and I didn't set the server_id.


mysql [localhost] {msandbox} ((none)) %gt; set global server_id=4000;
Query OK, 0 rows affected (0.00 sec)
mysql  %gt; slave start;
Query OK, 0 rows affected (0.00 sec)

Yep. That was it. The error message could have been more helpful, though.