Some time ago, we announced a new release model for MySQL. Like all new things, it had some initial hiccups (with MySQL 5.4 we were still getting acquainted with the new model), but now it seems to be in full swing. By the time you read these lines, MySQL 5.5 will be available. If the mirrors aren't seeded yet, the impatient can compile and use the new version from the Launchpad source tree.
Overview
What's this new release anyway? I'll leave it to Kaj's blog to give you the full description. Suffice it to say that this release is the second milestone of the current installment. It is of beta quality, and it will mature to RC quality. There will be yet another milestone before we release a GA in mid-2010.
One thing that this milestone shows is that there is no dead weight. If a feature doesn't make the deadline, i.e. it doesn't reach beta quality by the scheduled date, it is dropped, and possibly rescued at the next milestone.
With the introduction of the milestone model, we have also increased our internal QA, especially thanks to the Random Query Generator, which finds bugs in early stages of the code faster than any other method. (1)
Built-in InnoDB plugin
The InnoDB plugin 1.0.5 is included in the distribution, and, unlike MySQL 5.1, it's built-in. There is no need to load and register the plugin. The performance enhancements developed for MySQL 5.4 are now available together with the other enhancements that come with the InnoDB plugin. This was already the case in the previous milestone, but it's worth mentioning now, because not many people are aware of it.
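If you want to double-check which InnoDB you are running, a quick way (a minimal sketch; the output shown is only indicative) is to ask the server for the plugin version:
SHOW VARIABLES LIKE 'innodb_version';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| innodb_version | 1.0.5 |
+----------------+-------+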
Semi-synchronous replication
Of all the new features, this one is probably the most relevant. It is based on a patch made by Google to the InnoDB engine, and adapted by MySQL developers to make it engine-independent.
In short, it's a safety device that establishes some internal communication between master and slaves, and makes sure that at least one slave has received the changes being committed. That is, before committing, the master waits until at least one slave has acknowledged that it has received the portion of binary log necessary to reproduce the transaction.
UPDATE As Harrison Fisk rightly notes, there is a mistake in my description. The acknowledgment of the transaction being sent to the slave only happens after the master's commit to the binary log.
Some caveats apply:
- It's called semi-synchronous replication because it doesn't necessarily apply to all the slaves. Although you can manually check whether the replication has worked for all the slaves, it's enough for the master to make sure that at least one slave has got the goods.
- Received by a slave doesn't mean "executed". The slave acknowledges when it has got the binary log, even if the SQL thread is busy or stopped.
- If there is no slave that can acknowledge receipt (e.g. slaves are all down or stopped), then the master reverts to the normal asynchronous operation.
To use this feature, you need to install two plugins: one for the master and one for each slave. No need to compile anything, though. They are provided with the binaries. All you need to do is load the appropriate plugin for each server.
master > INSTALL PLUGIN rpl_semi_sync_master SONAME 'libsemisync_master.so';
slave1 > INSTALL PLUGIN rpl_semi_sync_slave SONAME 'libsemisync_slave.so';
Additionally, there are a few variables that you must set, either in the options file or online.
master > SET GLOBAL rpl_semi_sync_master_enabled=1;
slave1 > SET GLOBAL rpl_semi_sync_slave_enabled=1;
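To confirm that both sides are talking to each other semi-synchronously, you can also look at a couple of additional variables. Here is a quick sketch (the timeout is expressed in milliseconds, and 1000 is just an example value):
master > SHOW VARIABLES LIKE 'rpl_semi_sync%';
master > SHOW STATUS LIKE 'Rpl_semi_sync_master_clients';
master > SET GLOBAL rpl_semi_sync_master_timeout = 1000; # wait at most 1 second for an acknowledgment
The last variable controls how long the master waits for a slave acknowledgment before reverting to asynchronous replication.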
Now that the system is ready, let's see how to use it.
Before doing anything, we ask for the value of two important status variables:
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 0     |
+-----------------------------+-------+
The first one counts the transactions for which the synchronization failed; the second one counts the successful ones. Since nothing has happened so far, they are both zero.
create table t1 (i int not null primary key) engine=innodb;
Query OK, 0 rows affected (0.13 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 1     |
+-----------------------------+-------+
The first operation (a table creation) was successfully transferred to a slave. Let's do one more.
set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
insert into t1 values (1);
Query OK, 1 row affected (0.00 sec)
COMMIT;
Query OK, 0 rows affected (0.00 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 2     |
+-----------------------------+-------+
This one was successful as well.
Now, let's try something sneaky. On each slave, we execute "STOP SLAVE SQL_THREAD". With the SQL thread stopped, normal replication stops applying changes, but the I/O thread keeps receiving the binary log, so semi-synchronous replication will go on.
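For reference, here is the command we run on each slave before going back to the master:
slave1 > STOP SLAVE SQL_THREAD;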
insert into t1 values (2);
Query OK, 1 row affected (0.01 sec)
SHOW STATUS LIKE 'Rpl_semi_sync%tx';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_no_tx  | 0     |
| Rpl_semi_sync_master_yes_tx | 3     |
+-----------------------------+-------+
2 rows in set (0.00 sec)
The semi-synch replication has worked. However, if we query both master and slaves, only the master has the new record. The slaves have it only in their relay logs, which you can easily ascertain with mysqlbinlog.
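For example (a sketch; the relay log file name and path depend on your setup and are only placeholders here), you can find the current relay log on the slave and then inspect it from the command line:
slave1 > SHOW SLAVE STATUS\G
# look for the Relay_Log_File and Relay_Log_Pos fields, e.g. slave1-relay-bin.000002
$ mysqlbinlog /path/to/datadir/slave1-relay-bin.000002 | grep -i 'insert into t1'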
Enhanced partitioning syntax
About one year ago, I briefly announced that this feature was in the making. With some interface improvements, it is now part of the regular partitioning. It's an extension of partitioning BY RANGE. As you may know, until now you could only partition on one column value, and only on INTEGER columns. Both these restrictions were lifted in 5.5, with a syntax change that makes the code more readable and the overall feature more usable.
You can now partition by date, datetime, varchar, and char columns, not just integers, and you can use more than one column in your list. The most immediate use of this extension is the ability to use dates directly, without resorting to functions that convert them into integers. For example:
CREATE TABLE t2
(dt date, a int, b int, c int)
PARTITION BY RANGE COLUMNS (dt)
(
PARTITION p0 VALUES LESS THAN ('2007-01-01'),
PARTITION p1 VALUES LESS THAN ('2008-01-01'),
PARTITION p2 VALUES LESS THAN ('2009-01-01'),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
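Since COLUMNS accepts a list, you can also partition on several columns at once. Here is a minimal sketch (table t3 and its columns are invented for illustration):
CREATE TABLE t3
(dt date, region char(2), a int)
PARTITION BY RANGE COLUMNS (dt, region)
(
PARTITION p0 VALUES LESS THAN ('2008-01-01', 'mm'),
PARTITION p1 VALUES LESS THAN ('2008-01-01', MAXVALUE),
PARTITION p2 VALUES LESS THAN ('2009-01-01', MAXVALUE),
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
The comparison is done tuple-wise: a row dated 2007 goes to p0 regardless of its region, while a row with dt equal to '2008-01-01' lands in p0 or p1 depending on whether its region sorts before 'mm'.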
The COLUMNS keyword does the trick. The manual has more examples.
The partition helper has been updated to handle this new feature and generate partitions accordingly.
SIGNAL and RESIGNAL
If you have used stored routines extensively, you will certainly have asked yourself "why isn't there any way of raising an exception?" In the SQL standard, exception handling is implemented using the SIGNAL and RESIGNAL keywords, which were notably missing in MySQL 5.0 and 5.1 stored routines.
There have been many clever hacks by several community members to emulate the missing SIGNAL, but none were quite satisfactory. After a long wait, we now have SIGNAL and RESIGNAL, which make stored routine programming much more robust and easier to debug. An authoritative example of how to use the new syntax is available in Roland Bouman's blog.
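Just to give a taste of the syntax, here is a minimal sketch (the procedure name, the SQLSTATE value, and the message are invented for illustration):
DELIMITER //
CREATE PROCEDURE check_positive (IN val INT)
BEGIN
    IF val <= 0 THEN
        -- raise a user-defined exception; '45000' is the generic user-defined SQLSTATE
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'value must be positive';
    END IF;
END //
DELIMITER ;
CALL check_positive(-1);
-- fails with something like: ERROR 1644 (45000): value must be positive
Within an error handler, RESIGNAL lets you re-raise the condition you just caught, optionally changing its attributes.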
There is more. For the complete list of features, have a look at the official manual.
Happy hacking!
UPDATE Added more partitions to the example, as suggested by Jon.
(1) For the more technologically savvy, here's how Philip Stoev, one of my distinguished QA colleagues, describes the enhancements:
Historically, most of the MySQL tests have been created manually; however, a modern database is so complex that it is impossible to test manually even a tiny percentage of the available functionality. Therefore, for Betony [codename for MySQL 5.5] and the upcoming Celosia [5.6], the majority of our testing effort was concentrated on stochastic testing, using random data and millions of random queries to validate the behavior of the server across a wide range of scenarios and workloads.
For each new feature, and some existing ones, we automatically generated tests that attempt to cover all relevant SQL constructs, including the interaction between the feature being tested and existing code within the server. For features that have concurrency implications, we ran the random queries as a stress test or along with concurrent DDL statements. For areas such as partitioning, we used the random queries to functionally validate the new code, by comparing the result from each query to a reference source, such as a previous version of the server.