The Data Charmer: 5.7

Showing posts with label 5.7. Show all posts

Sunday, June 30, 2019

From an empty box to MySQL custom replication in 3 minutes

Starting with version 1.32.0, dbdeployer has the ability of downloading a selection of MySQL tarballs from several sources.

This means that, when working in an empty box, you can populate it with database servers using

dbdeployer.

The “empty box” mentioned in the title is not really empty. It’s a Linux (or MacOS) host that is able to run a MySQL server. As such, it needs to have at least the prerequisites to run MySQL server (such as the libnuma and libaio packages), and a bash shell to run the scripts created by dbdeployer.

To try the thrill of an empty box that quickly becomes a working environment, we can use a docker image datacharmer/mysql-sb-base that I have created for this purpose.

$ docker pull datacharmer/mysql-sb-base
Using default tag: latest
latest: Pulling from datacharmer/mysql-sb-base
6b98dfc16071: Pull complete
4001a1209541: Pull complete
6319fc68c576: Pull complete
b24603670dc3: Pull complete
97f170c87c6f: Pull complete
b78c78fcfc94: Pull complete
379084573ce7: Pull complete
0afd193b699a: Pull complete
dfb4eecd399a: Pull complete
Digest: sha256:492c38b8662d393436141de5b3a9ad5b3994a7b095610b43896033fd740523ef
Status: Downloaded newer image for datacharmer/mysql-sb-base:latest

We can start a container from this image, and we won’t need anything else from the host computer.

$ docker run -ti --hostname dbtest datacharmer/mysql-sb-base bash
msandbox@dbtest:~$

The container runs as a regular user. Given that dbdeployer is designed specifically to run without root access (although it can run as root), this is the perfect scenario.

dbdeployer is already installed, but mysql is not.

msandbox@dbtest:~$ dbdeployer --version
dbdeployer version 1.34.0

msandbox@dbtest:~$ mysql
bash: mysql: command not found

Thus, we start getting our software from the locations that dbdeployer knows.

$ dbdeployer downloads list
Available tarballs
                          name                             OS     version   flavor     size   minimal
-------------------------------------------------------- ------- --------- -------- -------- ---------
 tidb-master-linux-amd64.tar.gz                           Linux     3.0.0   tidb      26 MB
 mysql-8.0.16-linux-glibc2.12-x86_64.tar.xz               Linux    8.0.16   mysql    461 MB
 mysql-8.0.16-linux-x86_64-minimal.tar.xz                 Linux    8.0.16   mysql     44 MB   Y
 mysql-5.7.26-linux-glibc2.12-x86_64.tar.gz               Linux    5.7.26   mysql    645 MB
 mysql-5.6.44-linux-glibc2.12-x86_64.tar.gz               Linux    5.6.44   mysql    329 MB
 mysql-5.5.62-linux-glibc2.12-x86_64.tar.gz               Linux    5.5.62   mysql    199 MB
 mysql-8.0.15-linux-glibc2.12-x86_64.tar.xz               Linux    8.0.15   mysql    376 MB
 mysql-8.0.13-linux-glibc2.12-x86_64.tar.xz               Linux    8.0.13   mysql    394 MB
 mysql-5.7.25-linux-glibc2.12-x86_64.tar.gz               Linux    5.7.25   mysql    645 MB
 mysql-5.6.43-linux-glibc2.12-x86_64.tar.gz               Linux    5.6.43   mysql    329 MB
 mysql-5.5.61-linux-glibc2.12-x86_64.tar.gz               Linux    5.5.61   mysql    199 MB
 mysql-5.1.73-linux-x86_64-glibc23.tar.gz                 Linux    5.1.73   mysql    134 MB
 mysql-5.0.96.tar.xz                                      Linux    5.0.96   mysql    5.5 MB   Y
 mysql-5.1.72.tar.xz                                      Linux    5.1.72   mysql     10 MB   Y
 mysql-5.5.61.tar.xz                                      Linux    5.5.61   mysql    6.6 MB   Y
 mysql-5.5.62.tar.xz                                      Linux    5.5.62   mysql    6.6 MB   Y
 mysql-5.6.43.tar.xz                                      Linux    5.6.43   mysql    9.0 MB   Y
 mysql-5.6.44.tar.xz                                      Linux    5.6.44   mysql    9.1 MB   Y
 mysql-5.7.25.tar.xz                                      Linux    5.7.25   mysql     23 MB   Y
 mysql-5.7.26.tar.xz                                      Linux    5.7.26   mysql     23 MB   Y
 mysql-5.0.96-linux-x86_64-glibc23.tar.gz                 Linux    5.0.96   mysql    127 MB
 mysql-4.1.22.tar.xz                                      Linux    4.1.22   mysql    4.6 MB   Y
 mysql-cluster-gpl-7.6.10-linux-glibc2.12-x86_64.tar.gz   Linux    7.6.10   ndb      916 MB
 mysql-cluster-8.0.16-dmr-linux-glibc2.12-x86_64.tar.gz   Linux    8.0.16   ndb      1.1 GB

The above command shows all the tarballs that are available for the current operating system. You see that in addition to vanilla MySQL, there are also NDB and TiDB packages.

We start by getting the latest MySQL version using the command get-unpack that is available since version 1.33.0. This command downloads the tarball, compares the checksum, and unpacks it into the expected place.

$ dbdeployer downloads get-unpack mysql-8.0.16-linux-x86_64-minimal.tar.xz
Downloading mysql-8.0.16-linux-x86_64-minimal.tar.xz
....  44 MB
File /home/msandbox/mysql-8.0.16-linux-x86_64-minimal.tar.xz downloaded
Checksum matches
Unpacking tarball mysql-8.0.16-linux-x86_64-minimal.tar.xz to $HOME/opt/mysql/8.0.16
.........100.........200.219
Renaming directory /home/msandbox/opt/mysql/mysql-8.0.16-linux-x86_64-minimal to /home/msandbox/opt/mysql/8.0.16

The same operation for 5.7 gives us the second version available.

+ dbdeployer downloads get-unpack mysql-5.7.26.tar.xz
Downloading mysql-5.7.26.tar.xz
..  23 MB
File /home/msandbox/mysql-5.7.26.tar.xz downloaded
Checksum matches
Unpacking tarball mysql-5.7.26.tar.xz to $HOME/opt/mysql/5.7.26
.........99
Renaming directory /home/msandbox/opt/mysql/mysql-5.7.26 to /home/msandbox/opt/mysql/5.7.26

Now there are two versions that can be used for operations.

$ dbdeployer  versions
Basedir: /home/msandbox/opt/mysql
5.7.26  8.0.16

And we are going to deploy one sandbox from each version, because we want to put them in replication.

$ dbdeployer deploy single 5.7.26 --master
Creating directory /home/msandbox/sandboxes
Database installed in $HOME/sandboxes/msb_5_7_26
run 'dbdeployer usage single' for basic instructions'
. sandbox server started

$ dbdeployer deploy single 8.0.16 --master
Database installed in $HOME/sandboxes/msb_8_0_16
run 'dbdeployer usage single' for basic instructions'
.. sandbox server started


$ dbdeployer sandboxes --full-info
.------------.--------.---------.---------------.--------.-------.--------.
|    name    |  type  | version |     ports     | flavor | nodes | locked |
+------------+--------+---------+---------------+--------+-------+--------+
| msb_5_7_26 | single | 5.7.26  | [5726 ]       | mysql  |     0 |        |
| msb_8_0_16 | single | 8.0.16  | [8016 18016 ] | mysql  |     0 |        |
'------------'--------'---------'---------------'--------'-------'--------'

This are our active assets. The sandboxes are independent, but each sandbox has the ability of becoming the receiver of replication. In this case we want to replicate from version 5.7 to version 8.0, as it is always recommended to replicate from earlier to later version.

$ ~/sandboxes/msb_8_0_16/replicate_from msb_5_7_26
Connecting to /home/msandbox/sandboxes/msb_5_7_26
--------------
CHANGE MASTER TO master_host="127.0.0.1",
master_port=5726,
master_user="rsandbox",
master_password="rsandbox"
, master_log_file="mysql-bin.000001", master_log_pos=4089
--------------

--------------
start slave
--------------

              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 4089
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 4089
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0

Replication is active. We can now quickly check that it is working:

$ ~/sandboxes/msb_5_7_26/use -e 'create table test.t1(id int not null primary key, msg1 varchar(50), msg2 varchar(50)) default charset=utf8mb4'

$ ~/home/msandbox/sandboxes/msb_5_7_26/use -e 'insert into test.t1 values (1, @@version, @@server_uuid)'

We create a table in 5.7, taking care of using a character set that agrees with 8.0 defaults (we could also use utf8, but this is the one that presents less potential problems. We fill the table with server specific information (its version and UUID).

Now we can check that the slave is working

$ ~/sandboxes/msb_8_0_16/use -e 'SHOW SLAVE STATUS\G' | grep 'Running\|Master_\|Log_'
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 5726
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 4636
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 868
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 4636
              Relay_Log_Space: 1072
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
Master_SSL_Verify_Server_Cert: No
             Master_Server_Id: 5726
                  Master_UUID: 00005726-0000-0000-0000-000000005726
             Master_Info_File: mysql.slave_master_info
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Master_TLS_Version:
       Master_public_key_path:

And finally we retrieve from the 8.0 slave the data that was created in 5.7

$ ~/sandboxes/msb_8_0_16/use -e 'show tables from test'
+----------------+
| Tables_in_test |
+----------------+
| t1             |
+----------------+

$  ~/sandboxes/msb_8_0_16/use -e 'select * from test.t1'
+----+------------+--------------------------------------+
| id | msg1       | msg2                                 |
+----+------------+--------------------------------------+
|  1 | 5.7.26-log | 00005726-0000-0000-0000-000000005726 |
+----+------------+--------------------------------------+

QED.

Friday, March 29, 2019

dbdeployer cookbook - usability by example

When I designed dbdeployer, I wanted to eliminate most of the issues that the old MySQL-Sandbox had:

dependencies during installation
mistaken tarballs
clarity of syntax
features (un)awareness.

Dependencies during installation did go away right from the start, as the dbdeployer executable is ready to be used without additional components. The only dependency is to have a host that can run MySQL. There is little dbdeployer can do about detecting whether or not your system can run MySQL. It depends on which version and flavor of MySQL you are running. It should not be a big deal as I assume that anyone in need of dbdeployer has already the necessary knowledge about MySQL prerequisites. This is not always the case, but the issue goes beyond dbdeployer’s scope.

Mistaken tarballs are a common problem for users who have never seen a binaries tarball. Here dbdeployer can help, up to a point, to guide the user. It recognizes most cases where you are trying to use a source tarball or the wrong operating system tarball. It still does not recognize when you try to run a sandbox for a Linux 64bits out of a 32bit tarball, but to be honest I haven’t tried to solve this problem yet. There are still cases where users are a great risk of picking the wrong tarball (Percona Server download page is a minefield and the one for MariaDB is not picnic either) but I feel that I have given dbdeployer users a big help on this count.

Clarity of syntax is probably the biggest issue with the previous tool. It’s a consequence of the tool being developed over several years, slowly adapting to changing circumstances. I made dbdeployer clearer from the beginning, when I decided to avoid piling up many actions as the consequence of the same command. In dbdeployer, you need to unpack the tarball explicitly before running a deployment, and this gives dbdeployer users the visibility on the architecture that eluded many MySQL-Sandbox users. The architecture of dbdeployer is such that adding new features, commands, and options is easy and fits within an easily detectable paradigm. Thus, the operations are easier to spot and use.

Features awareness is still a problem. There is a lengthy description of everything dbdeployer can do, but, as often happens with even the best tools, users don’t read manuals.

The biggest obstacle about reading manuals is that dbdeployer executable is installed without any need to take the README file along. If you need it, you should go online and read it, and given that dbdeployer is built to be used mostly without internet connection, there are cases when you want to know how to do something, but you can’t get the manual right away.

There is the tool integrated help (dbdeployer [command] -h), which gives you a lot of information, but this tells you how to do something that you know already exists, not what you can do in general.

To help on this count, I added a collection of samples (the cookbook) that were initially released in a directory of the GitHub project, but then the sample scripts suffered of the same invisibility that plagues the README file. There is one more problem: when you tried using the generic cookbook scripts (now removed) you had to use the same environment as I did when defining them, or they would fail.

The current solution is to include cookbook files right within the tool, using templates (same as for regular sandbox scripts) with the double advantage that the scripts are available anywhere the dbdeployer executable is, and the scripts are adapted to the environment, since dbdeployer knows how to search for available binaries and can suggest the best parameters for the scripts.

Introducing dbdeployer cookbook

The command dbdeployer cookbook (with aliases recipes or samples) has three subcommands:

list displays a list of available samples.
show displays the contents of a recipe.
create (with alias make) will build the recipe script.

We should try the list first:

$ dbdeployer cookbook list
.----------------------------------.-------------------------------------.--------------------------------------------------------------------.--------.
|              recipe              |             script name             |                            description                             | needed |
|                                  |                                     |                                                                    | flavor |
+----------------------------------+-------------------------------------+--------------------------------------------------------------------+--------+
| all-masters                      | all-masters-deployment.sh           | Creation of an all-masters replication sandbox                     | mysql  |
| delete                           | delete-sandboxes.sh                 | Delete all deployed sandboxes                                      |        |
| fan-in                           | fan-in-deployment.sh                | Creation of a fan-in (many masters, one slave) replication sandbox | mysql  |
| group-multi                      | group-multi-primary-deployment.sh   | Creation of a multi-primary group replication sandbox              | mysql  |
| group-single                     | group-single-primary-deployment.sh  | Creation of a single-primary group replication sandbox             | mysql  |
| master-slave                     | master-slave-deployment.sh          | Creation of a master/slave replication sandbox                     |        |
| ndb                              | ndb-deployment.sh                   | Shows deployment with ndb                                          | ndb    |
| prerequisites                    | prerequisites.sh                    | Shows dbdeployer prerequisites and how to make them                |        |
| pxc                              | pxc-deployment.sh                   | Shows deployment with pxc                                          | pxc    |
| remote                           | remote.sh                           | Shows how to get a remote MySQL tarball                            |        |
| replication-restart              | repl-operations-restart.sh          | Show how to restart sandboxes with custom options                  |        |
| replication-operations           | repl-operations.sh                  | Show how to run operations in a replication sandbox                |        |
| replication_between_groups       | replication-between-groups.sh       | Shows how to run replication between two group replications        | mysql  |
| replication_between_master_slave | replication-between-master-slave.sh | Shows how to run replication between two master/slave replications |        |
| replication_between_ndb          | replication-between-ndb.sh          | Shows how to run replication between two NDB clusters              | ndb    |
| show                             | show-sandboxes.sh                   | Show deployed sandboxes                                            |        |
| single                           | single-deployment.sh                | Creation of a single sandbox                                       |        |
| single-reinstall                 | single-reinstall.sh                 | Re-installs a single sandbox                                       |        |
| tidb                             | tidb-deployment.sh                  | Shows deployment and some operations with TiDB                     | tidb   |
| upgrade                          | upgrade.sh                          | Shows a complete upgrade example from 5.5 to 8.0                   | mysql  |
'----------------------------------'-------------------------------------'--------------------------------------------------------------------'--------'

The recipe that seems to be the simplest one is single. We can try to see what is in there:

$ dbdeployer  cookbook show single
#!/bin/bash

[...]
# Generated by dbdeployer 1.26.0 using template single on Fri Mar 29 12:27:53 UTC 2019
cd $(dirname $0)
source cookbook_include.sh

version=$1
[ -z "$version" ] && version=NOTFOUND_mysql
check_version $version

if [ -n "$(dbdeployer sandboxes | grep 'single\s*'$version)" ]
then
    echo "single version $version is already installed"
else
    header "Deploying a single sandbox for version $version"
    run dbdeployer deploy single $version
fi

What looks odd is the line that says NOTFOUND. If we try creating that script and then running it, it won’t work, and rightfully so.

There is a recipe named prerequisites that could probably help us.

$ dbdeployer cookbook create prerequisites
recipes/prerequisites.sh created

So, now, we have a starting point. Let’s run it:

$ ./recipes/prerequisites.sh

# ****************************************************************************
# Creating Sandbox binary directory ($HOME/opt/mysql)
# ****************************************************************************

## HOW TO GET  binaries for dbdeployer
# FOR REGULAR MYSQL
# run the commands:

1. dbdeployer remote list
2. dbdeployer remote get mysql-5.7.25
3. dbdeployer unpack mysql-5.7.25.tar.xz

4. dbdeployer versions

# FOR MySQL forks, MySQL Cluster, PXC:

# 1. Get the binaries from the maker download pages
# 2. run the command
 dbdeployer unpack FlavorName-X.X.XX-OS.tar.gz  --prefix=FlavorName
3. dbdeployer versions

The first thing we see is that the sandbox binary directory was created, and then we see a series of steps to fill it in.

Let’s try:

$ dbdeployer remote list
Files available in https://raw.githubusercontent.com/datacharmer/mysql-docker-minimal/master/dbdata/available.json
5.5 -> [mysql-5.5.61 mysql-5.5.62]
5.6 -> [mysql-5.6.41 mysql-5.6.43]
5.7 -> [mysql-5.7.24 mysql-5.7.25]
8.0 -> [mysql-8.0.13 mysql-8.0.15]
4.1 -> [mysql-4.1.22]
5.0 -> [mysql-5.0.15 mysql-5.0.96]
5.1 -> [mysql-5.1.72]

$ dbdeployer remote get mysql-5.7.25
File /home/msandbox/mysql-5.7.25.tar.xz downloaded

$ dbdeployer unpack mysql-5.7.25.tar.xz
Unpacking tarball mysql-5.7.25.tar.xz to $HOME/opt/mysql/5.7.25
[...]
Renaming directory $HOME/opt/mysql/mysql-5.7.25 to $HOME/opt/mysql/5.7.25

$ dbdeployer versions
Basedir: /home/msandbox/opt/mysql
5.7.25

If we repeat the show command now, we get a different result:

$ dbdeployer  cookbook show single
#!/bin/bash
[...]
# Generated by dbdeployer 1.26.0 using template single on Fri Mar 29 12:37:26 UTC 2019
cd $(dirname $0)
source cookbook_include.sh

version=$1
[ -z "$version" ] && version=5.7.25
check_version $version

if [ -n "$(dbdeployer sandboxes | grep 'single\s*'$version)" ]
then
    echo "single version $version is already installed"
else
    header "Deploying a single sandbox for version $version"
    run dbdeployer deploy single $version
fi

There! instead of the NOTFOUND we saw before, it now shows the version that we just downloaded. If we repeat the same procedure (remote list, remote get, unpack) for MySQL 8.0.15, we would see 8.0.15 as the recommended version.

Now we can create the single recipe. Or even better, since we want to try several ones, we can create all of them.

$ dbdeployer  cookbook create all
recipes/replication-between-master-slave.sh created
recipes/single-reinstall.sh created
recipes/fan-in-deployment.sh created
recipes/group-multi-primary-deployment.sh created
recipes/repl-operations.sh created
recipes/tidb-deployment.sh created
recipes/remote.sh created
recipes/upgrade.sh created
recipes/ndb-deployment.sh created
recipes/cookbook_include.sh created
recipes/master-slave-deployment.sh created
recipes/prerequisites.sh created
recipes/replication-between-groups.sh created
recipes/replication-between-ndb.sh created
recipes/pxc-deployment.sh created
recipes/single-deployment.sh created
recipes/show-sandboxes.sh created
recipes/delete-sandboxes.sh created
recipes/all-masters-deployment.sh created
recipes/group-single-primary-deployment.sh created
recipes/repl-operations-restart.sh created

Now it’s time to try one:

msandbox@505969e46289:~$ ./recipes/single-deployment.sh

# ****************************************************************************
# Deploying a single sandbox for version 5.7.25
# ****************************************************************************
+ dbdeployer deploy single 5.7.25
Creating directory /home/msandbox/sandboxes
Database installed in $HOME/sandboxes/msb_5_7_25
run 'dbdeployer usage single' for basic instructions'
. sandbox server started
0

Looking at the list, we see a single-reinstall recipe. If we run it, we will get a mini tutorial on how to use a single sandbox:

$ ./recipes/single-deployment.sh

# ****************************************************************************
# Deploying a single sandbox for version 5.7.25
# ****************************************************************************
+ dbdeployer deploy single 5.7.25
Creating directory $HOME/sandboxes
Database installed in $HOME/sandboxes/msb_5_7_25
run 'dbdeployer usage single' for basic instructions'
. sandbox server started
0
msandbox@505969e46289:~$ ./recipes/show-sandboxes.sh
+ dbdeployer sandboxes --full-info
.------------.--------.---------.---------.--------.-------.--------.
|    name    |  type  | version |  ports  | flavor | nodes | locked |
+------------+--------+---------+---------+--------+-------+--------+
| msb_5_7_25 | single | 5.7.25  | [5725 ] | mysql  |     0 |        |
'------------'--------'---------'---------'--------'-------'--------'
0
msandbox@505969e46289:~$ ./recipes/single-reinstall.sh

# ****************************************************************************
# Deploying the same sandbox again, with different parameters
# We need to use --force, as we are overwriting an existing sandbox
# Incidentally, the new deployment will run a query before and after the grants
# ****************************************************************************
+ dbdeployer deploy single 5.7.25 '--pre-grants-sql=select host, user from mysql.user' '--post-grants-sql=select host, user from mysql.user' --force
Overwriting directory $HOME/sandboxes/msb_5_7_25
stop $HOME/sandboxes/msb_5_7_25
Database installed in $HOME/sandboxes/msb_5_7_25
run 'dbdeployer usage single' for basic instructions'
. sandbox server started
+-----------+---------------+
| host      | user          |
+-----------+---------------+
| localhost | mysql.session |
| localhost | mysql.sys     |
| localhost | root          |
+-----------+---------------+
+-----------+---------------+
| host      | user          |
+-----------+---------------+
| 127.%     | msandbox      |
| 127.%     | msandbox_ro   |
| 127.%     | msandbox_rw   |
| 127.%     | rsandbox      |
| localhost | msandbox      |
| localhost | msandbox_ro   |
| localhost | msandbox_rw   |
| localhost | mysql.session |
| localhost | mysql.sys     |
| localhost | root          |
+-----------+---------------+

# ****************************************************************************
# Deploying the same sandbox with a different directory.
# No --force is necessary, as dbdeployer will choose a different port
# ****************************************************************************
+ dbdeployer deploy single 5.7.25 --sandbox-directory=msb_5_7_25_new
Database installed in $HOME/sandboxes/msb_5_7_25_new
run 'dbdeployer usage single' for basic instructions'
. sandbox server started
0

+ dbdeployer sandboxes --header
            name               type    version    ports
---------------------------- -------- --------- ---------
 msb_5_7_25               :   single   5.7.25    [5725 ]
 msb_5_7_25_new           :   single   5.7.25    [5726 ]
0

# ****************************************************************************
# Removing the second sandbox
# ****************************************************************************
+ dbdeployer delete msb_5_7_25_new
List of deployed sandboxes:
$HOME/sandboxes/msb_5_7_25_new
Running $HOME/sandboxes/msb_5_7_25_new/stop
stop $HOME/sandboxes/msb_5_7_25_new
Running rm -rf $HOME/sandboxes/msb_5_7_25_new
Directory $HOME/sandboxes/msb_5_7_25_new deleted
0

This script teaches us the basics of starting and restarting a sandbox, with useful twists as running an SQL command before granting privileges.

There is a similar tutorial for replication operations, but we’ll have a look at a slightly different one.

$ ./recipes/master-slave-deployment.sh
+ dbdeployer deploy replication 5.7.25 --concurrent
$HOME/sandboxes/rsandbox_5_7_25/initialize_slaves
initializing slave 1
initializing slave 2
Replication directory installed in $HOME/sandboxes/rsandbox_5_7_25
run 'dbdeployer usage multiple' for basic instructions'

$ ./recipes/repl-operations.sh

# ****************************************************************************
# Running a simple command with the master in the sandbox.
# Notice the usage of the '-e', as if we were using the 'mysql' client
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'SHOW MASTER STATUS'
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |     4089 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

# ****************************************************************************
# Creating a table in the master
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'DROP TABLE IF EXISTS test.t1'
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'CREATE TABLE test.t1(id int not null primary key)'

# ****************************************************************************
# Inserting 3 lines into the new table
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'INSERT INTO test.t1 VALUES(1)'
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'INSERT INTO test.t1 VALUES(2)'
+ $HOME/sandboxes/rsandbox_5_7_25/m -e 'INSERT INTO test.t1 VALUES(3)'

# ****************************************************************************
# Getting the table contents from one slave
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/s1 -e 'SELECT * FROM test.t1'
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+

# ****************************************************************************
# Getting the table count from all nodes (NOTE: no '-e' is needed)
# $HOME/sandboxes/rsandbox_5_7_25/use_all 'SELECT COUNT(*) FROM test.t1'
# master
COUNT(*)
3
# server: 1
COUNT(*)
3
# server: 2
COUNT(*)
3

# ****************************************************************************
# Checking the status of all slaves
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/check_slaves
master
port 19226 - server_id 100
             File: mysql-bin.000001
         Position: 5213
Executed_Gtid_Set:
slave1
port 19227 - server_id 200
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 5213
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 5213
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
slave2
port 19228 - server_id 300
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 5213
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 5213
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
0

# ****************************************************************************
# Running a multiple query in all slaves
# ****************************************************************************
+ $HOME/sandboxes/rsandbox_5_7_25/use_all_slaves 'STOP SLAVE; SET GLOBAL slave_parallel_workers=3; START SLAVE;show processlist '
# server: 1
Id User Host db Command Time State Info
11 msandbox localhost NULL Query 0 starting show processlist
12 system user  NULL Connect 0 Checking master version NULL
13 system user  NULL Connect 0 System lock NULL
14 system user  NULL Connect 0 Waiting for an event from Coordinator NULL
15 system user  NULL Connect 0 Waiting for an event from Coordinator NULL
16 system user  NULL Connect 0 Waiting for an event from Coordinator NULL
# server: 2
Id User Host db Command Time State Info
10 msandbox localhost NULL Query 0 starting show processlist
11 system user  NULL Connect 0 Checking master version NULL
12 system user  NULL Connect 0 System lock NULL
13 system user  NULL Connect 0 Waiting for an event from Coordinator NULL
14 system user  NULL Connect 0 Waiting for an event from Coordinator NULL
15 system user  NULL Connect 0 Waiting for an event from Coordinator NULL

By studying the commands mentioned in these samples, you will become proficient in dbdeployer components, allowing you to use it for advanced testing operations.

Monday, March 12, 2018

dbdeployer release candidate

The latest release of dbdeployer is possibly the last one with a leading 0. If no serious bugs are found in the next two weeks, the next release will bear a glorious 1.0.

Latest news

The decision to get out of the stream of pre-releases that were published until now comes because I have implemented all the features that I wanted to add: mainly, all the ones that I wished to add to MySQL-Sandbox but it would have been too hard:

concurrent deployment and deletion
group replication;
multi-source replication;
Full customisation
Centralised administration.
Container-based tests;
Mock tests that can run without any real database server.

The latest addition is the ability of running multi-source topologies. Now we can run four topologies:

master-slave is the default topology. It will install one master and two slaves. More slaves can be added with the option --nodes.
group will deploy three peer nodes in group replication. If you want to use a single primary deployment, add the option --single-primary. Available for MySQL 5.7 and later.
fan-in is the opposite of master-slave. Here we have one slave and several masters. This topology requires MySQL 5.7 or higher.
all-masters is a special case of fan-in, where all nodes are masters and are also slaves of all nodes.

It is possible to tune the flow of data in multi-source topologies. The default for fan-in is three nodes, where 1 and 2 are masters, and 2 are slaves. You can change the predefined settings by providing the list of components:

$ dbdeployer deploy replication \
    --topology=fan-in \
    --nodes=5 \
    --master-list="1 2 3" \
    --slave-list="4 5" \
    8.0.4 \
    --concurrent

In the above example, we get 5 nodes instead of 3. The first three are master (--master-list="1 2 3") and the last two are slaves (--slave-list="4 5") which will receive data from all the masters. There is a test automatically generated to test replication flow. In our case it shows the following:

$ ~/sandboxes/fan_in_msb_8_0_4/test_replication
# master 1
# master 2
# master 3
# slave 4
ok - '3' == '3' - Slaves received tables from all masters
# slave 5
ok - '3' == '3' - Slaves received tables from all masters
# pass: 2
# fail: 0

The first three lines show that each master has done something. In our case, each master has created a different table. Slaves in nodes 5 and 6 then count how many tables they found, and if they got the tables from all masters, the test succeeds.
Note that for all-masters topology there is no need to specify master-list or slave-list. In fact, those lists will be auto-generated, and they will both include all deployed nodes.

What now?

Once I make sure that the current features are reasonably safe (I will only write more tests for the next 10~15 days) I will publish the first (non-pre) release of dbdeployer. From that moment, I'd like to follow the recommendations of the Semantic Versioning:

The initial version will be 1.0.0 (major, minor, revision);
The spects for 1.0 will be the API that needs to be maintained.
Bug fixes will increment the revision counter.
New features that don't break compatibility with the API will increment the minor counter;
New features or changes that break compatibility will trigger a major counter increment.

Using this method will give users a better idea of what to expect. If we get a revision number increase, it is only bug fixes. An increase in the minor counter means that there are new features, but all previous features work as before. An increase in the major counter means that something will break, either because of changed interface or because of changed behavior.
In practice, the tests released with 1.0.0 should run with any 1.x subsequent version. When those tests need changes to run correctly, we will need to bump up the major version.

Let's see if this method is sustainable. So far, I haven't had need to do behavioural changes, which are usually provoked by new versions of MySQL that introduce incompatible behavior (definitely MySQL does not follow the Semantic Versioning principles.) When the next version becomes available, I will see if this RC of dbdeployer can stand its ground.

Sunday, March 11, 2018

Concurrent sandbox deployment

Version 0.3.0 of dbdeployer has gained the ability of deploying multiple sandboxes concurrently. Whenever we deploy a group of sandboxes (replication, multiple) we can use the --concurrent flag, telling dbdeployer that it should run operations concurrently.

What happens when a single sandbox gets deployed? There are six sets of operations:

Create the sandbox directory and write down its scripts;
Run the initialisation script;
Start the database server;
Run the pre-grants SQL commands (if any;)
Load the grants;
Run the post-grants SQL commands (if any;)

When several sandboxes are deployed concurrently, dbdeployer runs only the first step, and then creates a list of commands with an associated priority index. These commands are assembled for every sandbox, and then executed concurrently for every step.
The sequence of events for a deployment of three sandboxes in replication would be like this:

Create the sandbox skeleton for every sandbox;
Initialise all database servers;
start all the servers;
run the pre-grants, grants, post-grants scripts.
Runs the group initialisation script (start master and slaves, or setup group replication).

Depending on the computer architecture, the server version, and the number of nodes, the speed of deployment can increase from 2 to 5 times.

Let's see an example:

$ time dbdeployer deploy replication 5.7.21
[...]
real    0m13.789s
user    0m1.143s
sys 0m1.873s

$ time dbdeployer deploy replication 5.7.21 --concurrent
[...]
real    0m7.780s
user    0m1.329s
sys 0m1.811s

There is a significant speed increase. The gain rises sharply if we use an higher number of nodes.

$ time dbdeployer deploy replication 5.7.21 --nodes=5
[...]
real    0m23.425s
user    0m1.923s
sys 0m3.106s

$ time dbdeployer deploy replication 5.7.21 \
    --nodes=5 --concurrent
[...]
real    0m7.686s
user    0m2.248s
sys 0m2.777s

As we can see, the time for deploying 5 nodes is roughly the same used for 3 nodes. While the sequential operations take time proportionally with the number of nodes, the concurrent task stays almost constant.

Things a re a bit different for group replication, as the group initialisation (which happens after all the servers are up and running) takes more time than the simple master/slave deployment, and can't be easily reduced using the current code.

A similar optimisation happens when we delete multiple sandboxes. Here the operation is at sandbox level (1 replication cluster = 1 sandbox) not at server level, and for that reason the gain is less sharp. Still, operations are noticeably faster.

There is room for improvement, but I have seen that the total testing time for dbdeployer test suite has dropped from 26 to 15 minutes. I think it was a week end well spent.

Monday, March 05, 2018

MySQL security for real users

Security features overview

One of Oracle's tenets is the focus on security. For this reason, when it took over the stewardship of MySQL, it started addressing the most common issues. It was not quick acting, but we have seen real progress:

MySQL 5.7 has removed the anonymous accounts, which was the greatest threat to security. Because of those accounts, and the default privileges granted to them, users without any privileges could access the "test" database and do serious damage. Additionally, because of the way the privilege engine evaluates accounts, anonymous users could hijack legitimate users, by preventing them to work properly.
The "root" account now comes with a password defined during initialization. This is good news for security, but bad news for how the change was implemented.
There is a new way of setting an options file for connection credentials: the mysql_config_editor paired with option --login-path allows users to store encrypted credentials for secure use. Also here, while we should rejoice for the added security, we can't help feeling that the implementation is yet again far from meeting users needs.
There is an useful warning (introduced in MySQL 5.6) when using a password on the command line, telling users that it is a risk. Also in this case, we have a usability issue: while users care about their production deployments and use option files to avoid using passwords on the command line, there are, nonetheless, a lot of testing scripts, used in safe environment or with non-valuable data, where a password in the command line was not an issue, and the new warning simply screws up the result of those carefully crafted tests. This change, which can't be overcome without modifying the MySQL clients code, needs users to change their existing tests to adapt to the new behavior.
MySQL 8 introduces roles, which simplify the accounts management. There are some minor usability issues, although in general the feature meets expectations.

This is the scenario of the main enhancements in MySQL since 5.6. Each one of them has some usability problems, some minor, some really bad.
We will first have a look at the problems mentioned above, and then examine the root cause for why they have arisen.

Usability issues

I start by noticing that some developers in the MySQL team have been working there for many years, starting with the time when MySQL was a different database and was used really differently.
In those times, managing the database meant that a human (the DBA) would run operations manually, take a look at the result, and adjust when needed. And then, when things went wrong, the same human explored the database system to find out what happened, took action, and went back to sleep.

Human-centered management leads to human problems: lazy DBA left their databases without password, using the root account, and exposing the server to uninspired attacks; they used passwords on the command line, without caring for options files (or without knowing about them.) Careless DBAs did not deal with anonymous users, leaving a dangerous backdoor in their server.

Some of the new functionalities introduced in the latest MySQL versions are aimed at this type of users: when you install MySQL, you get a message saying: your root password is ************, and the lazy DBAs have no option but to take note and use it. When they use the password on the command line, the annoying warning forces them to start using an options file or the mysql_config_editor.

This is all good, but the main problem here is that the DBAs of 10 years ago are on the verge of extinction. They are replaced by a new breed of DBAs who are not lazy, because they can't afford to be, and need to use dozens, hundreds, thousands of databases at once, using configuration management tools that don't require manual intervention, and actually abhor it. In the land of automation, some of the MySQL security enhancements are not seen as a solution, but as new problems.

Let's see an interesting example: docker containers.

Using Docker, MySQL images are deployed using a password on the command line. This is done for compatibility with the first implementation of the image maintained by the Docker team, where you deploy with this syntax:

docker run -e MYSQL_ROOT_PASSWORD=secret -d mysql

The MYSQL_ROOT_PASSWORD is a directive that becomes an environment variable inside the container, and the server uses it during initialization. As you can imagine, this is not recommended for a secure environment. Then, what's the MySQL team recommendation? They suggest the same strategy used for manual installation: set a directive MYSQL_RANDOM_ROOT_PASSWORD that results in a random password being generated, then collected by the DBA and used. Alternatively, the directive MYSQL_ONETIME_PASSWORD will force the root user to change the password on first connection.

The above suggestions were designed with the ancient DBA still in mind, while container deployment is even more automated than VMs, and it is based on the principle of immutable objects, i.e. containers that spring up from the cloud ready to run, with no configuration needed, and especially no configuration that requires someone (or some tool) to extract a new password from a log. I proposed a different solution, that would never show passwords on the command line and while it was implemented, but it still feels like a hack to circumvent an inadequate design.

As a result, the implementation inside the MySQL recommended Docker image uses "--initialize-insecure" to start the server. This is an implicit recognition of the bad design of the initialization feature. What was designed to overcome DBA's laziness becomes an obstacle towards automation.

We have a similar problem with mysql_config_editor: the tool will create a safe configuration file with credentials for multiple instances, but the password must be inserted manually. Consequently, this potentially useful feature doesn't get adopted, because it would be too difficult or impossible to automate properly.

We have seen that, of the security features that were introduced lately, only a few can be used safely in an automated environment, and all of them have at least one small usability quirk. I have talked about a confusing issue related to the removal of anonymous users where in their eagerness of removing the vulnerability the MySQL team removed also the "test" database, which was a consequence, not the cause of the problem. And I have recently talked about roles usability where there are still open problems, like the ability of telling roles from users which are apparently not considered a bug by the MySQL team.

All the above considerations led me to ask: how did we get to this point? There is an active community, and feedback is offered often with plenty of detail. How come we have such an abundance of usability issues? Don't the developers spend time with users at conferences to learn what they expect? Don't they read articles and blog posts about how a new feature meets expectations? Don't they talk to customers who have adopted new features? They certainly do. Then, why the usability problems persist?

What follows is my observation and speculation on this matter.

Disconnection between MySQL developers and users community

My experience working with system providers has put me in contact with many users. I have seen that in most cases users are very much protective of their current deployment, because it took them long time to get it right, and they don't upgrade unless they don't have another choice. I've seen users test the newer versions, realize that they would break some of their procedures, and defer the upgrade to better times that never come. I remember last year a user with a not so large set of servers was considering an upgrade to MySQL 5.6, while 5.7 had been GA for two years. The reason was a set of incompatibilities that made the upgrade too difficult.

For companies that deal with thousands of servers, the problem is similar, but exacerbated by the quantity of servers to upgrade and the need to do it without stopping operations. This latest requirement has made some users decide not to use GTID, because it required offline time for a master, and they hadn't had time enough to test the upgrade to MySQL 5.7 that would solve that problem.

For one reason or the other, many companies upgrade only two or three years after a given version became GA. And this is the main problem: until they use it in production, or at least test the version for a projected upgrade, users can't give valuable feedback, the one that is related to usage in production, and when they do, the version for which they provide feedback has been GA for long time, and can't be changed, while the next one is already close to GA, and as such will be untouchable.

The MySQL team gets feedback on a release from a handful of curious users who don't delay testing until the new version is GA, but don't provide the kind of important feedback that get the development team attention, such as deployment in production by large customers. In many cases, large customers are the ones that upgrade several years after GA, and by then their input is difficult to act upon.

We have then a curious situation, where the warnings given by the early software explorers are confirmed years later by the users to which the MySQL team listens more closely, but by then the next version of the server is already locked in a release schedule that nobody wants to alter to fix usability issues.

How can we solve this problem? Simple: listen to early software explorers and try to fix problems before GA.

Thursday, March 01, 2018

Using MySQL 8.0: what to expect

MySQL 8.0 will be GA soon (just my assumption: Oracle doesn't tell me anything about its release plans) and it's time to think about having a look at it.
If this is your first try of MySQL 8, get prepared for several impacting differences from previous versions.

In this article I won't tell you what you can do with MySQL 8: there is plenty of material about this, including in this very blog. I will instead concentrate on differences from previous versions that users need to know if they want to avoid surprises.

Data Directory

Let's start with an observation of the data directory.
After a standard installation, without any additional options, I see the following:

Files that I expected to see

auto.cnf
ib_buffer_pool
ib_logfile0
ib_logfile1
ibdata1
ibtmp1
(dir) mysql
(dir) performance_schema
(dir) sys

These files are also present in 5.7.

Files that are new in 8.0

binlog.000001
binlog.index

log-bin is ON by default. You need to remember this if you are using a MySQL server for a benchmark test that used to run without binary logs.

ca-key.pem
ca.pem
client-cert.pem
client-key.pem
private_key.pem
public_key.pem
server-cert.pem
server-key.pem

Now the MySQL generates all the certificates needed to run connections securely. This will greatly simplify your task when setting up a new instance.

mysql.ibd

This was completely unexpected! The mysql database has now its own tablespace. This is probably due to the new Data Dictionary, which is implemented in InnoDB. You will notice that all the InnoDB tables in MySQL use this tablespace, not only dictionary tables. This will help keeping administrative data separate from operational data in the rest of the server.

undo_001
undo_002

The undo logs have now their own tablespace by default.

Global variables

There are a lot of changes in global variables. Here's the list of what will impact your work when you use MySQL 8.0 for the first time:

character_set_client        utf8mb4
character_set_connection    utf8mb4
character_set_database      utf8mb4
character_set_results       utf8mb4
character_set_server        utf8mb4

All character sets are now utf8mb4. In MySQL 5.7, the default values are a mix of utf8 and latin1.

default_authentication_plugin   caching_sha2_password

This is huge. Using this plugin, passwords are stored in a different way, which guarantees more security, but will probably break several workflows among the users. The bad thing about this change implementation is that this password format contains characters that don't display well on screen, and you can see garbled output when inspecting the "user" table.

local_infile    OFF

Loading local files is now prevented by default. If you have a workflow that requires such operations, you need to enable it.

log_bin ON
log_slave_updates   ON

We've seen from an inspection of the local directory that binary logging is enabled by default. But also very important is that log_slave_update is enabled. This is important to have slaves ready to replace a master, but will severely affect performance in those scenarios where some slaves were supposed to run without that feature.

master_info_repository  TABLE
relay_log_info_repository   TABLE

Also impacting performance is the setting for replication repositories, which are now on TABLE by default. This is something that should have happened already in MySQL 5.6 and was long overdue.

Surprisingly, something that DOES NOT get enabled by default is Global Transaction Identifiers (GTID). This is also a legacy from decisions taken in MySQL 5.6. Due to the GTID implementation, enabling them by default is not possible when upgrading from a previous version. With new data in a fresh installation, it is safe to enable GTID from the start.

Users

There are two new users when the server is created:

mysql.infoschema
mysql.session

Theoretically, mysql.session also exists in 5.7, but it was introduced long after GA, so it still qualifies as a novelty.

Then, when the server starts, you get a grand total of 4 users (root and mysql.sys are inherited from MySQL 5.7.)

Mixed oddities

When MySQL initializes, i.e. when the server starts for the first time and creates the database, you will notice some slowness, compared to previous versions. This is in part due to the data dictionary, which needs to create and fill 30 tables, but it is not a big deal in terms of performance. In some systems, though, the slowness is so acute that you start worrying about the server being stuck.

I noticed this problem in my Intel NUC running with SSD storage. In this box, the initialization time took a serious hit:

Version	time
5.0.96	1.231s
5.1.72	1.346s
5.5.52	2.441s
5.6.39	5.540s
5.7.21	6.080s
8.0.3	7.826s
8.0.4	38.547s

There is no mistype. The initialization for 8.0.4 lasts 6 times more than 5.7.
This doesn't happen everywhere. On a Mac laptop running on SSD the same operation takes almost 9 seconds, while 5.7 deploys in less than 5. It is still a substantial difference, one that has totally disrupted my regular operations in the NUC. I investigated the matter, and I found the reason. In 8.0, we have a new (hidden) table in the data dictionary, called st_spatial_reference_systems. Up to MySQL 8.0.3, this table was filled using a single transaction containing roughly 5,000 REPLACE INTO statements. It is a lot of data, but it happens quickly. For comparison, in MySQL 8.0.3 the initialization is only 2 seconds slower than 5.7.
The reason for the slowness in 8.0.4 is that there was a new command added to the syntax: CREATE SPATIAL REFERENCE SYSTEM, which is now used 5,000 times to fill the table that was previously filled with a single transaction. I don't know why someone in the MySQL team thought that changing this operation that is hidden from users was a good idea. The data is contained in the server itself and it goes into a data dictionary table, also not visible to users. I am sure I can find at least two methods to load the data faster. I was told that this glitch will be fixed in the next release. I'm waiting.

Speaking of initialization, the mysql_install_db script has been removed for good in 8.0. If you are still using it instead of the recommended mysqld --initialize, you should adapt asap.

This list is far from being exhaustive. I recommend reading What's new in MySQL 8 before upgrading.
If you are impatient, dbdeployer can help you test MySQL 8 quickly and safely.

Wednesday, February 28, 2018

The confusing strategy for MySQL shell

Where the hell is it?

The MySQL shell is a potentially useful tool that has been intentionally made difficult to use properly.

It was introduced, with much fanfare, with the MySQL Document Store, as THE tool to bridge the SQL and no-SQL worlds. The release was less than satisfactory, though: MySQL 5.7.12 introduced a new feature (the X-protocol plugin) bundled with the server. The maturity of the plugin was unclear, as it popped out of the unknown into a GA release, without any public testing. It was allegedly GA quality, although the quantity of bug reports that were filed soon after the release proved otherwise. The maturity of the shell was known as "development preview", and so we had a supposedly GA feature that could only be used with an alpha quality tool.

The situation with the MySQL shell got worse in a few months. A new product was brewing (MySQL Group Replication) and went rapidly from something released in the Labs without docs to being part of the regular server distribution, and it was evolving into a more complex and ambitious project (the InnoDB Cluster) which used the MySQL shell as its main tool.

Since the announcement of InnoDB Cluster, using the MySQL shell has been a nightmare. You saw examples in blog posts and presentations, and when you tried them at home, they did not work. There were different releases of MySQL shell with the same version number but different capabilities, depending on whether they were released through the main downloads site or through the labs.

When I asked why the shell wasn't distributed with the server, like the other tools, I was told that a non-GA product could not be released with a GA server. Considering that the Document Store is still walking around with a Pre-Production status legal notice, this was an odd excuse.

Still, I kept waiting, trying to figure out how to pair a given version of MySQL shell with a given version of the server. Unlike the server, there are no release notes for the shell, so every release was a surprising experience.

Eventually, the MySQL shell reached the GA state, with which merit I can't tell. Given the obstacles in the path to its usage, I doubt it has had any serious testing from the community. Despite the state being GA, it keeps being released separately, leaving the puzzled users with the ungrateful task of determining with which server version that shell could be used safely.

With the upcoming release of MySQL 8.0, a new version of MySQL shell appeared, with a colorful prompt and new features that the GA shell doesn't have. The public perception of the tool keeps getting more confused. In the presentations given by the MySQL team we see the new shell doing wonders, while the GA shell keeps its monochromatic features. Shall I use the 8.0.x shell with a 5.7 server or should I stick with the 1.0 version?

In MySQL 8.0, the situation is still divided. Both products (the server and the shell) are, as of today, not GA yet. It would make sense to finally end the craziness and put the two things together, so that users don't have to hunt around for the right shell version. But the two products are still released separately.

How can I do stuff with MySQL shell?

So far, we have only seen the availability of the shell. What about the functionality?

I have heard that Oracle wants to convert the shell into the only tool to deal with MySQL. I can't prove it, as Oracle doesn't release its development plans to the public, but I can see the emphasis on the shell in talks and articles authored by MySQL team engineers. If this is the plan, I think it needs a lot more work.

If you try to use MySQL shell the same way as the regular "mysql" client, you get in trouble soon.

mysqlsh --user root --password=msandbox --port=5721 --host 127.0.0.1
mysqlx: [Warning] Using a password on the command line interface can be insecure.
Creating a Session to 'root@127.0.0.1:5721'
Your MySQL connection id is 38
Server version: 5.7.21 MySQL Community Server (GPL)
No default schema selected; type \use  to set one.
MySQL Shell 1.0.11

Copyright (c) 2016, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.

Currently in JavaScript mode. Use \sql to switch to SQL mode and execute queries.

I see two problems here:

The warning about the password on the command line is legitimate. The trouble is that there is no alternative. mysqlsh does not support --defaults-file, and there is no way of giving a password other than directly at invocation. There is an option "--passwords-from-stdin" which does not seem to work, and even if it did, I can't see the advantage of using the password from a pipe.
The default mode is Javascript. I can see that this makes operations simpler when you want to perform setup tasks for InnoDB Cluster, but certainly doesn't help me to use this tool as the primary drive for database management. There is a "--sql" option that does what I expect, but if this is not the default, I can't see this replacement being very successful.
Due to the previous items, using the tool in batch mode (with -e "SQL commands") is impossible, as every invocation will start with the freaking password warning.

I'm afraid that it's too late to take action for MySQL 8.0. The MySQL team is probably packaging the GA release while I write these notes. But I offer some suggestions nonetheless.

Wish list

Package MySQL shell with the server. Past experience shows that the MySQL team keeps adding features into a GA release, thus exposing users to the risk of getting the wrong tool for the job. Having the shell and the server in the same tarball will help users pick the right version for the task. This is similar to what happens with mysqldump: using the tool from 5.5 with a 5.7+ server will not work properly. There is no reason for mysqlsh to be treated differently.
Make sure that all the features of the mysql client work seamlessly in mysqlsh. Perhaps run the test suite replacing mysql with mysqlsh and pick up from there.
Make the MySQL shell compatible with other tools. Specifically, it should support option files (--defaults-file, --defaults-extra-file, --defaults-group-suffix, --no-defaults)

In short, if the plan is to replace mysql with mysqlsh, put the thing in the open, and please make sure it can do what users can reasonably expect.

Wednesday, February 21, 2018

Meet dbdeployer: the new sandbox maker

How it happened

A few years ago I started thinking about refactoring MySQL-Sandbox. I got lots of ideas and a name for the project (dbdeployer) but went no further. The initial idea (this was 2013!) was to rewrite the project in Ruby: I had been using Ruby at work and it looked like a decent replacement for Perl. My main problem was the difficulty of installation in an uncontrolled environment. If you have control over your environment (it's your laptop or you are in charge of the server configuration via Puppet or similar) then the task is easy. But if you ever need to deploy somewhere with little or no notice, it becomes a problem: there are servers where Perl is not installed, and is common that the server also have a policy forbidding all scripting languages from being deployed. Soon I found out that Ruby has the same problem as Perl. In the meantime, my work also required heavy involvement with Python, and I started thinking that maybe it would be a better choice than Ruby.
My adventures with deployment continued. In some places, I would find old versions of Perl, Ruby, Python, and no way of replacing them easily. I also realized that, if I bit the bullet and wrote my tools in C or C++, my distribution problems would not end, as I had to deal with library dependencies and conflict with existing ones.
At the end of 2017 I finally did what I had postponed for so long: I took a serious look at Go, and I decided that it was the best candidate for solving the distribution problem. I had a few adjustment problems, as the Go philosophy is different from my previously used languages, but the advantages were so immediate that I was hooked. Here's what I found compelling:

Shift in responsibility: with all the other languages I have used, the user is responsible for providing the working environment, such as installing libraries, the language itself, solve conflicts, and so on, until the program can work. With Go, the responsibility is on the developers only: they are supposed to know how to collect the necessary packages and produce a sound executable. Users only need to download the executable and run it.
Ease of deployment. A Go executable doesn't have dependencies. Binaries can be compiled for several platforms from a single origin (I can build Linux executables in my Mac and vice versa) and they just work.
Ease of development. Go is a strongly typed language, and has a different approach at code structure than Perl or Python. But this doesn't slow down my coding: it forces me to write better code, resulting in something that is at the same time more robust and easy to extend.
Wealth of packages. Go has an amazingly active community, and there is an enormous amount of packages ready for anything.

What is dbdeployer?

UPDATE 28-Feb-2018: The commands "single", "replication", and "multiple" are now subcommand of "deploy". Also, "templates" is now a subcommand of "defaults".

The first goal of dbdeployer is to replace MySQL-Sandbox completely. As such, it has all the main features of MySQL Sandbox, and many more (See the full list of features at the end of this text.)

You can deploy a single sandbox, or multiple unrelated sandboxes, or several servers in replication. That you could do also with MySQL-Sandbox. The first difference is in the command structure:

$ dbdeployer
dbdeployer makes MySQL server installation an easy task.
Runs single, multiple, and replicated sandboxes.

Usage:
  dbdeployer [command]

Available Commands:
  admin       administrative tasks
  delete      delete an installed sandbox
  global      Runs a given command in every sandbox
  help        Help about any command
  multiple    create multiple sandbox
  replication create replication sandbox
  sandboxes   List installed sandboxes
  single      deploys a single sandbox
  templates   Admin operations on templates
  unpack      unpack a tarball into the binary directory
  usage       Shows usage of installed sandboxes
  versions    List available versions

Flags:
      --base-port int                 Overrides default base-port (for multiple sandboxes)
      --bind-address string           defines the database bind-address  (default "127.0.0.1")
      --config string                 configuration file (default "$HOME/.dbdeployer/config.json")
      --custom-mysqld string          Uses an alternative mysqld (must be in the same directory as regular mysqld)
  -p, --db-password string            database password (default "msandbox")
  -u, --db-user string                database user (default "msandbox")
      --expose-dd-tables              In MySQL 8.0+ shows data dictionary tables
      --force                         If a destination sandbox already exists, it will be overwritten
      --gtid                          enables GTID
  -h, --help                          help for dbdeployer
  -i, --init-options strings          mysqld options to run during initialization
      --keep-auth-plugin              in 8.0.4+, does not change the auth plugin
      --keep-server-uuid              Does not change the server UUID
      --my-cnf-file string            Alternative source file for my.sandbox.cnf
  -c, --my-cnf-options strings        mysqld options to add to my.sandbox.cnf
      --port int                      Overrides default port
      --post-grants-sql strings       SQL queries to run after loading grants
      --post-grants-sql-file string   SQL file to run after loading grants
      --pre-grants-sql strings        SQL queries to run before loading grants
      --pre-grants-sql-file string    SQL file to run before loading grants
      --remote-access string          defines the database access  (default "127.%")
      --rpl-password string           replication password (default "rsandbox")
      --rpl-user string               replication user (default "rsandbox")
      --sandbox-binary string         Binary repository (default "$HOME/opt/mysql")
      --sandbox-directory string      Changes the default sandbox directory
      --sandbox-home string           Sandbox deployment direcory (default "$HOME/sandboxes")
      --skip-load-grants              Does not load the grants
      --use-template strings          [template_name:file_name] Replace existing template with one from file
      --version                       version for dbdeployer

Use "dbdeployer [command] --help" for more information about a command.

MySQL-Sandbox was created in 2006, and its structure changed as needed, without a real plan. dbdeployer, instead, was designed to have a hierarchical command structure, similar to git or docker, to give users a better feeling. As a result, it has a leaner set of commands, a non-awkward way of using options, and offers a better control of the operations out of the box.

For example, here's how we would start to run sandboxes:

$ dbdeployer --unpack-version=8.0.4 unpack mysql-8.0.4-rc-linux-glibc2.12-x86_64.tar.gz
Unpacking tarball mysql-8.0.4-rc-linux-glibc2.12-x86_64.tar.gz to $HOME/opt/mysql/8.0.4
.........100.........200.........292

The first (mandatory) operation is to expand binaries from a tarball. By default, the files will be expanded to $HOME/opt/mysql. Once this is done, we can create sandboxes at will, with simple commands:

$ dbdeployer single 8.0.4
Database installed in $HOME/sandboxes/msb_8_0_4
run 'dbdeployer usage single' for basic instructions'
. sandbox server started

$ dbdeployer replication 8.0.4
[...]
Replication directory installed in /$HOME/sandboxes/rsandbox_8_0_4
run 'dbdeployer usage multiple' for basic instructions'

$ dbdeployer multiple 8.0.4
[...]
Multiple directory installed in $HOME/sandboxes/multi_msb_8_0_4
run 'dbdeployer usage multiple' for basic instructions'

$ dbdeployer sandboxes
msb_8_0_4            : single                    8.0.4 [8004]
multi_msb_8_0_4      : multiple                  8.0.4 [24406 24407 24408]
rsandbox_8_0_4       : master-slave              8.0.4 [19405 19406 19407]

Three differences between dbdeployer and MySQL-Sandbox:

There is only one executable, with different commands;
After each deployment, there is a suggestion on how to get help about the sandbox usage.
There is a command that displays which sandboxes were installed, the kind of deployment, and the ports in use. This will be useful when the ports increase, as in group replication.

Here's another take, after deploying group replication:

$ dbdeployer sandboxes
group_msb_8_0_4      : group-multi-primary   8.0.4 [20405 20530 20406 20531 20407 20532]
group_sp_msb_8_0_4   : group-single-primary  8.0.4 [21405 21530 21406 21531 21407 21532]
msb_8_0_4            : single                8.0.4 [8004]
multi_msb_8_0_4      : multiple              8.0.4 [24406 24407 24408]
rsandbox_8_0_4       : master-slave          8.0.4 [19405 19406 19407]

A few more differences from MySQL-Sandbox are the "global" and "delete" commands.
The "global" command can broadcast a command to all the sandboxes. You can start, stop, restart all sandboxes at once, or run a query everywhere.

$ dbdeployer global use "select @@server_id, @@port, @@server_uuid"
# Running "use_all" on group_msb_8_0_4
# server: 1
@@server_id @@port  @@server_uuid
100 20405   00020405-1111-1111-1111-111111111111
# server: 2
@@server_id @@port  @@server_uuid
200 20406   00020406-2222-2222-2222-222222222222
# server: 3
@@server_id @@port  @@server_uuid
300 20407   00020407-3333-3333-3333-333333333333

# Running "use_all" on group_sp_msb_8_0_4
# server: 1
@@server_id @@port  @@server_uuid
100 21405   00021405-1111-1111-1111-111111111111
# server: 2
@@server_id @@port  @@server_uuid
200 21406   00021406-2222-2222-2222-222222222222
# server: 3
@@server_id @@port  @@server_uuid
300 21407   00021407-3333-3333-3333-333333333333

# Running "use" on msb_8_0_4
@@server_id @@port  @@server_uuid
1   8004    00008004-0000-0000-0000-000000008004
[...]

You can run the commands manually. dbdeployer usage will show which commands are available for every sandbox.

$ dbdeployer usage single

    USING A SANDBOX

Change directory to the newly created one (default: $SANDBOX_HOME/msb_VERSION
for single sandboxes)
[ $SANDBOX_HOME = $HOME/sandboxes unless modified with flag --sandbox-home ]

The sandbox directory of the instance you just created contains some handy
scripts to manage your server easily and in isolation.

"./start", "./status", "./restart", and "./stop" do what their name suggests.
start and restart accept parameters that are eventually passed to the server.
e.g.:

  ./start --server-id=1001

  ./restart --event-scheduler=disabled

"./use" calls the command line client with the appropriate parameters,
Example:

    ./use -BN -e "select @@server_id"
    ./use -u root

"./clear" stops the server and removes everything from the data directory,
letting you ready to start from scratch. (Warning! It's irreversible!)

When you don't need the sandboxes anymore, you can dismiss them with a single command:

$ dbdeployer delete ALL
Deleting the following sandboxes
$HOME/sandboxes/group_msb_8_0_4
$HOME/sandboxes/group_sp_msb_8_0_4
$HOME/sandboxes/msb_8_0_4
$HOME/sandboxes/multi_msb_8_0_4
$HOME/sandboxes/rsandbox_8_0_4
Do you confirm? y/[N]

There is an option to skip the confirmation, which is useful for scripting unattended tests.

Customization

One of the biggest problems with MySQL-Sandbox was that most of the functioning is hard-coded, and the scripts needed to run the sandboxes are generated in different places, so that extending or modifying features became more and more difficult. When I designed dbdeployer, I gave myself the goal of making the tool easy to change, and the code easy to understand and extend.

For this reason, I organized everything related to code generation (the scripts that initialize and run the sandboxes) in a collection of templates and default variables that are publicly visible and modifiable.

$ dbdeployer templates -h
The commands in this section show the templates used
to create and manipulate sandboxes.

Usage:
  dbdeployer templates [command]

Aliases:
  templates, template, tmpl, templ

Available Commands:
  describe    Describe a given template
  export      Exports all templates to a directory
  import      imports all templates from a directory
  list        list available templates
  reset       Removes all template files
  show        Show a given template

You can list the templates on the screen.

$ dbdeployer templates list single
    [single]      replication_options       : Replication options for my.cnf
    [single]      load_grants_template      : Loads the grants defined for the sandbox
    [single]      grants_template57         : Grants for sandboxes from 5.7+
    [single]      grants_template5x         : Grants for sandboxes up to 5.6
    [single]      my_template               : Prefix script to run every my* command line tool
    [single]      show_binlog_template      : Shows a binlog for a single sandbox
    [single]      use_template              : Invokes the MySQL client with the appropriate options
    [single]      clear_template            : Remove all data from a single sandbox
    [single]      restart_template          : Restarts the database (with optional mysqld arguments)
    [single]      start_template            : starts the database in a single sandbox (with optional mysqld arguments)
    [single]      stop_template             : Stops a database in a single sandbox
    [single]      send_kill_template        : Sends a kill signal to the database
    [single]      show_relaylog_template    : Show the relaylog for a single sandbox
    [single]      Copyright                 : Copyright for every sandbox script
    [single]      expose_dd_tables          : Commands needed to enable data dictionary table usage
    [single]      init_db_template          : Initialization template for the database
    [single]      grants_template8x         : Grants for sandboxes from 8.0+
    [single]      add_option_template       : Adds options to the my.sandbox.cnf file and restarts
    [single]      test_sb_template          : Tests basic sandbox functionality
    [single]      sb_include_template       : TBD
    [single]      gtid_options              : GTID options for my.cnf
    [single]      my_cnf_template           : Default options file for a sandbox
    [single]      status_template           : Shows the status of a single sandbox

Then it's possible to examine template contents:

$ dbdeployer templates describe --with-contents init_db_template
# Collection    : single
# Name          : init_db_template
# Description   : Initialization template for the database
# Notes         : This should normally run only once
# Length        : 656
##START init_db_template
#!/bin/bash
        {{.Copyright}}
        # Generated by dbdeployer {{.AppVersion}} using {{.TemplateName}} on {{.DateTime}}
        BASEDIR={{.Basedir}}
        export LD_LIBRARY_PATH=$BASEDIR/lib:$BASEDIR/lib/mysql:$LD_LIBRARY_PATH
        export DYLD_LIBRARY_PATH=$BASEDIR/lib:$BASEDIR/lib/mysql:$DYLD_LIBRARY_PATH
        SBDIR={{.SandboxDir}}
        DATADIR=$SBDIR/data
        cd $SBDIR
        if [ -d $DATADIR/mysql ]
        then
            echo "Initialization already done."
            echo "This script should run only once."
            exit 0
        fi

        {{.InitScript}} \
            {{.InitDefaults}} \
            --user={{.OsUser}} \
            --basedir=$BASEDIR \
            --datadir=$DATADIR \
            --tmpdir={{.Tmpdir}} {{.ExtraInitFlags}}

##END init_db_template

The one above is the template that generates the initialization script. In MySQL-Sandbox, this was handled in the code, and it was difficult to figure out what went wrong when the initialization failed. The Go language has an excellent support for code generation using templates, and with just a fraction of its features I implemented a few dozen scripts which I am able to modify with ease. Here's what the deployed script looks like

#!/bin/bash

#    DBDeployer - The MySQL Sandbox
#    Copyright (C) 2006-2018 Giuseppe Maxia
#
#    Licensed under the Apache License, Version 2.0 (the "License");
#    you may not use this file except in compliance with the License.
#    You may obtain a copy of the License at
#
#        http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS,
#    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#    See the License for the specific language governing permissions and
#    limitations under the License.

# Generated by dbdeployer 0.1.24 using init_db_template on Tue Feb 20 14:45:29 CET 2018
BASEDIR=$HOME/opt/mysql/8.0.4
export LD_LIBRARY_PATH=$BASEDIR/lib:$BASEDIR/lib/mysql:$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$BASEDIR/lib:$BASEDIR/lib/mysql:$DYLD_LIBRARY_PATH
SBDIR=$HOME/sandboxes/msb_8_0_4
DATADIR=$SBDIR/data
cd $SBDIR
if [ -d $DATADIR/mysql ]
then
 echo "Initialization already done."
 echo "This script should run only once."
 exit 0
fi

$HOME/opt/mysql/8.0.4/bin/mysqld \
    --no-defaults \
    --user=$USER \
    --basedir=$BASEDIR \
    --datadir=$DATADIR \
    --tmpdir=$HOME/sandboxes/msb_8_0_4/tmp \
    --initialize-insecure --default_authentication_plugin=mysql_native_password

Let's see the quick-and-dirty usage. If you want to change a template and use it just once, do the following:

$ dbdeployer templates show init_db_template
Save it to a file init_db.txt and edit it. Be careful, though: removing or altering essential labels may block the sandbox initialization.
Use the template file in the next command:

$ dbdeployer single 8.0.4 --use-template=init_db_template:init_db.txt

For more permanent results, when you'd like to change a template, or several ones, permanently, you can use the export/import commands

List the templates related to replication (dbdeployer templates list replication)
Export the templates to the directory "mydir" $ dbdeployer templates export replication mydir
edit the templates you want to change inside "mydir/replication"
Import the templates dbdeployer templates import replication mydir

The templates will end inside $HOME/.dbdeployer/templates_$DBDEPLOYER_VERSION and dbdeployer will load then instead of using the ones stored internally. The next time that one of those templates will be needed, it will be collected from the file. If you run dbdeployer templates list or describe, the ones saved to file will be marked with {F}.
To go back to the built-in behavior, simply run dbdeployer templates reset

In addition to templates, dbdeployer uses a set of values when creating sandboxes. Like templates, this set is used from internal store, but it can be exported to a configuration file.

$ dbdeployer admin show
# Internal values:
{
  "version": "0.1.24",
  "sandbox-home": "$HOME/sandboxes",
  "sandbox-binary": "$HOME/opt/mysql",
  "master-slave-base-port": 11000,
  "group-replication-base-port": 12000,
  "group-replication-sp-base-port": 13000,
  "multiple-base-port": 16000,
  "group-port-delta": 125,
  "sandbox-prefix": "msb_",
  "master-slave-prefix": "rsandbox_",
  "group-prefix": "group_msb_",
  "group-sp-prefix": "group_sp_msb_",
  "multiple-prefix": "multi_msb_"
}

The values named *-base-port are used to calculate the port for each node in a multiple deployment. The calculation goes:

sandbox_port + base_port + (revision_number * 100)

So, for example, when deploying replication for 5.7.21, the sandbox port would be 5721, and the final base port will be calculated as follows:

5721 + 11000 + 21 * 100 = 18821

This number will be incremented for each node in the cluster, so that the master will get 18822, and the first slave 18823.

Using the commands dbdeployer admin export and import you can customize the default values in a way similar to what we saw for the templates.

Thanks

I'd like to thank:

Ronald Bradford and René Cannaò, for priceless advice on the usability when the tool was in its early stage of development;
Shlomi Noach, for adopting dbdeployer even before it was feature complete;
Frédéric "lefred" Descamps, for allowing me to present dbdeployer at the pre-FOSDEM MySQL event;
The Go community, for the exciting environment offered to newcomers.

A note about unpacking MySQL tarball

When using MySQL tarballs, we may have some problems due to the enormous size that the tarballs have reached. Look at this:

690M    5.5.52
1.2G    5.6.39
2.5G    5.7.21
3.6G    8.0.0
1.3G    8.0.1
1.5G    8.0.2
1.9G    8.0.3
1.9G    8.0.4

This becomes a serious problem when you want to unpack the tarball inside a low-resource virtual machine or a Docker container. I have asked the MySQL team to provide reduced tarballs, possibly in a fixed location, so that sandboxes creation could be fully automated. I was told that something will be done soon. In the meantime, I provide such reduced tarballs, which have a more reasonable size:

 49M    5.5.52
 61M    5.6.39
346M    5.7.21
447M    8.0.0
462M    8.0.1
254M    8.0.2
270M    8.0.3
244M    8.0.4

Using these reduced tarballs, which are conveniently packed in a docker container (datacharmer/mysql-sb-full contains all major MySQL versions), I have automated dbdeployer tests with minimal storage involvement, and that improves the test speed as well.

Detailed list of features

Feature	MySQL-Sandbox	dbdeployer	dbdeployer planned
Single sandbox deployment	yes	yes
unpack command	sort of ¹	yes
multiple sandboxes	yes	yes
master-slave replication	yes	yes
"force" flag	yes	yes
pre-post grants SQL action	yes	yes
initialization options	yes	yes
my.cnf options	yes	yes
custom my.cnf	yes	yes
friendly UUID generation	yes	yes
global commands	yes	yes
test replication flow	yes	yes
delete command	yes ²	yes
group replication SP	no	yes
group replication MP	no	yes
prevent port collision	no	yes ³
visible initialization	no	yes ⁴
visible script templates	no	yes ⁵
replaceable templates	no	yes ⁶
configurable defaults	no	yes ⁷
list of source binaries	no	yes ⁸
list of installed sandboxes	no	yes ⁹
test script per sandbox	no	yes ¹⁰
integrated usage help	no	yes ¹¹
custom abbreviations	no	yes ¹²
version flag	no	yes ¹³
fan-in	no	no	yes ¹⁴
all-masters	no	no	yes ¹⁵
Galera/PXC/NDB	no	no	yes ¹⁸
finding free ports	yes	no	yes
pre-post grants shell action	yes	no	maybe
getting remote tarballs	yes	no	yes
circular replication	yes	no	no ¹⁶
master-master (circular)	yes	no	no
Windows support	no	no	no ¹⁷

It's achieved using --export_binaries and then abandoning the operation. ↩
Uses the sbtool command ↩
dbdeployer sandboxes store their ports in a description JSON file, which allows the tool to get a list of used ports and act before a conflict happens. ↩
The initialization happens with a script that is generated and stored in the sandbox itself. Users can inspect the init_db script and see what was executed. ↩
All sandbox scripts are generated using templates, which can be examined and eventually changed and re-imported. ↩
See also note 5. Using the flag --use-template you can replace an existing template on-the-fly. Group of templates can be exported and imported after editing. ↩
Defaults can be exported to file, and eventually re-imported after editing. ↩
This is little more than using an O.S. file listing, with the added awareness of the source directory. ↩
Using the description files, this command lists the sandboxes with their topology and used ports. ↩
It's a basic test that checks whether the sandbox is running and is using the expected port. ↩
The "usage" command will show basic commands for single and multiple sandboxes. ↩
The abbreviations file allows user to define custom shortcuts for frequently used commands. ↩
Strangely enough, this simple feature was never implemented for MySQL-Sandbox, while it was one of the first additions to dbdeployer. ↩
Will use the multi source technology introduced in MySQL 5.7. ↩
Same as n. 13. ↩
Circular replication should not be used anymore. There are enough good alternatives (multi-source, group replication) to avoid this old technology. ↩
I don't do Windows, but you can fork the project if you do. ↩
For Galera/PXC and MySQL Cluster I have ideas, but I may need help to implement. ↩