Query Optimization with MySQL 8.0 and MariaDB 10.3: The Basics
dbahire.com/pleu18
Agenda
1. Introduction
5. Break
6. Joins
Application-Level profiling
pt-query-digest
• It is a 3rd party tool written in Perl, originally
created by Baron Schwartz
• It requires activation of the slow log (be careful
with the extra IO and latency!):
– SET GLOBAL slow_query_log = 1;
– SET long_query_time = 0;
• In Percona Server and MariaDB it can provide extra
information:
– SHOW GLOBAL VARIABLES like 'log_slow_verbosity';
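A minimal sketch of a short capture window for the slow log, assuming an account that may set global variables and the default file-based log output. The user variables @old_slow_query_log and @old_long_query_time are illustrative names, not part of the original slides.

```sql
-- Remember the current settings so they can be restored afterwards.
SET @old_slow_query_log  = @@GLOBAL.slow_query_log;
SET @old_long_query_time = @@GLOBAL.long_query_time;

-- Log every statement. A new global long_query_time only applies to
-- connections opened after the change.
SET GLOBAL long_query_time = 0;
SET GLOBAL slow_query_log  = 1;

-- File that would later be passed to pt-query-digest.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log_file';

-- ... let the application run for a representative period ...

-- Restore the previous settings to stop the extra IO.
SET GLOBAL slow_query_log  = @old_slow_query_log;
SET GLOBAL long_query_time = @old_long_query_time;
```

Keeping the window short limits the extra IO and latency the slide warns about.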
PERFORMANCE_SCHEMA
• Monitoring schema (engine) enabled by
default since MySQL 5.6
– performance_schema = 1 (it is not dynamic)
• Deprecates the old query profiling
• It is way more user-friendly when combined
with the SYS schema/ps_helper (a set of
views and stored procedures created by Mark
Leith)
– Included by default since 5.7.7
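As a rough illustration of what these schemas offer, the queries below list the statements that consumed the most time. This is a sketch assuming MySQL 5.7 or later with performance_schema enabled; the sys view is not assumed to exist on MariaDB 10.3.

```sql
-- Raw PERFORMANCE_SCHEMA: top 10 normalized statements by total time.
-- Timer columns are in picoseconds, hence the division by 1e12.
SELECT DIGEST_TEXT,
       COUNT_STAR              AS executions,
       SUM_TIMER_WAIT / 1e12   AS total_seconds,
       SUM_ROWS_EXAMINED       AS rows_examined,
       SUM_ROWS_SENT           AS rows_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

-- The same information through the friendlier SYS schema view.
SELECT query, exec_count, total_latency, rows_examined_avg
FROM sys.statement_analysis
LIMIT 10;
```

A large gap between rows examined and rows sent is usually the first hint that a statement lacks a useful index.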
© 2018 Jaime Crespo. http://jynus.com. License: CC-BY-SA-4.0 17
mysql-5.7.8 (osm) > EXPLAIN SELECT 'node' as type, node_id as id FROM node_tags WHERE k='amenity'
and v='cafe' UNION SELECT 'way' as type, way_id as id FROM way_tags WHERE k='amenity' and v='cafe'
UNION SELECT 'relation' as type, relation_id as id FROM relation_tags WHERE k='amenity' and
v='cafe';
+----+--------------+---------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+---------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
| 1 | PRIMARY | node_tags | NULL | ALL | NULL | NULL | NULL | NULL | 851339 | 0.00 | Using where |
| 2 | UNION | way_tags | NULL | ALL | NULL | NULL | NULL | NULL | 1331016 | 0.00 | Using where |
| 3 | UNION | relation_tags | NULL | ALL | NULL | NULL | NULL | NULL | 63201 | 0.00 | Using where |
| NULL | UNION RESULT | <union1,2,3> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+---------------+------------+------+---------------+------+---------+------+---------+----------+-----------------+
4 rows in set, 1 warning (0.01 sec)
EXPLAIN
●
Essential to understand the execution plan of our
queries
– Works on SELECTs, INSERTs, UPDATEs,
REPLACEs, DELETEs and connections
– Fully documented on:
https://dev.mysql.com/doc/refman/8.0/en/explain-output.html
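The variants listed above can be sketched as follows, reusing the page table from the examples in these slides. The connection id 1234 is a placeholder that would come from SHOW PROCESSLIST.

```sql
-- Machine-readable plan with extra detail (MySQL 5.6+, MariaDB 10.1+).
EXPLAIN FORMAT=JSON
SELECT * FROM page WHERE page_title = 'German';

-- EXPLAIN on a write statement shows the plan without changing any row.
EXPLAIN
DELETE FROM page WHERE page_title = 'German';

-- Plan of a statement currently running in another connection.
EXPLAIN FOR CONNECTION 1234;   -- MySQL 5.7+ syntax
SHOW EXPLAIN FOR 1234;         -- MariaDB 10.0+ syntax
```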
EXPLAIN Example
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM page
WHERE page_title = 'German';
+------+-------------+-------+------+---------------
+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------
+------+---------+------+--------+-------------+
| 1 | SIMPLE | page | ALL | NULL |
NULL | NULL | NULL | 778885 | Using where |
+------+-------------+-------+------+---------------
+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
Difficult to see something: the wide table wraps in the terminal
●
Let's add an index on page.page_title:
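A sketch of the statement this slide leads to. The index name page_title is taken from the EXPLAIN output shown on later slides; the exact statement used by the author is not in the extracted text.

```sql
-- Add a secondary (BTREE) index on the column used in the WHERE clause.
ALTER TABLE page ADD INDEX page_title (page_title);

-- Re-check the plan; \G prints one column per line, which is easier to read.
EXPLAIN SELECT * FROM page WHERE page_title = 'German'\G
```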
Types of indexes
●
BTREE
– B-TREE in MyISAM, B+TREE in InnoDB
●
HASH
– Only available for MEMORY and NDB
●
FULLTEXT
– Inverted indexes in MyISAM and InnoDB
●
SPATIAL
– RTREEs in MyISAM and InnoDB
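A minimal sketch of how each index type is declared. The tables demo_places and demo_lookup and all their columns are hypothetical, invented only for this illustration.

```sql
-- BTREE, FULLTEXT and SPATIAL indexes on an InnoDB table.
CREATE TABLE demo_places (
  id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(100) NOT NULL,
  description TEXT,
  coord       POINT NOT NULL,            -- SPATIAL indexes need NOT NULL
  INDEX name_idx (name),                 -- BTREE is the default type
  FULLTEXT INDEX description_ft (description),
  SPATIAL INDEX coord_sp (coord)
) ENGINE=InnoDB;

-- HASH indexes are only honoured by engines such as MEMORY.
CREATE TABLE demo_lookup (
  code  CHAR(8) NOT NULL,
  value INT NOT NULL,
  INDEX code_hash USING HASH (code)
) ENGINE=MEMORY;
```

On MySQL 8.0 the geometry column may additionally need an SRID attribute for the optimizer to consider the SPATIAL index; that detail is left out here to keep the sketch portable.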
[Figure: diagram labelled TABLE with the entries Iceberg, Landscape, Leek, Maelstrom, Nasty, Rover, School, Walrus]
It is a range
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM page WHERE
page_title like 'Spa%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: page
type: range
possible_keys: page_title
key: page_title
key_len: 257
ref: NULL
rows: 563
Extra: Using index condition
1 row in set (0.00 sec)
Despite not being an equality, we can use the index to find the values quickly
[Figure: diagram labelled "BTREE Index" (Avast, Boss, German, Goal, Golf, Etch, Harlem) next to the TABLE entries (Iceberg, Landscape, Leek, Maelstrom, Nasty, Rover, School, Walrus); a "?" marks the lookup]
type: const
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision WHERE
rev_id = 2\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: const
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: const
rows: 1
Extra:
1 row in set (0.00 sec)
'const' is a special case of 'ref', when the index can assure that only 1 result can be returned (equality + primary key or unique key). It is faster.
type: NULL
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision WHERE
rev_id = -1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: NULL
type: NULL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra: Impossible WHERE noticed after reading const tables
1 row in set (0.00 sec)
'NULL' is not really a plan, just an optimization that allows immediately discarding impossible conditions
type: ref_or_null
MariaDB [osm]> EXPLAIN SELECT * FROM nodes WHERE tile = 1 or
tile is null\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: nodes
type: ref_or_null
possible_keys: nodes_tile_idx
key: nodes_tile_idx
key_len: 5
ref: const
rows: 2
Extra: Using index condition; Using where
1 row in set (0.00 sec)
Equivalent to 'ref', but also takes into account NULL values
[Figure: index entries (London, Manor Place, Northolt Road, recycling, survey, tree, uncontrolled, ...) pointing to tag rows such as (234234344549, 6, access, uncontrolled), (234234344550, 1, editor, JOSM), (234234344551, 9, name, Big Ben), (234234344552, 1, source, survey) and (234234344557, 1, name, London Plane)]
MariaDB [dewiktionary]> EXPLAIN SELECT count(DISTINCT rev_user) FROM revision WHERE rev_page =
31579\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: ref
possible_keys: revision_rev_page
key: revision_rev_page
key_len: 4
ref: const
rows: 4038
Extra:
1 row in set (0.00 sec)
Adding an index on rev_page increases the speed due to improved filtering
MariaDB [dewiktionary]> SELECT count(DISTINCT rev_user) FROM revision WHERE rev_page = 31579\G
*************************** 1. row ***************************
count(DISTINCT rev_user): 1
1 row in set (0.04 sec)
[Figure: the two conditions of the query: revision.rev_page = 31579 (int constant) and revision.rev_timestamp < '2008' (string constant)]
Index on (rev_page)
MariaDB [dewiktionary]> ALTER TABLE revision ADD INDEX rev_page (rev_page);
Query OK, 0 rows affected (11.87 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision WHERE rev_page = 31579 and
rev_timestamp < '2008'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: ref
possible_keys: rev_page
key: rev_page
key_len: 4
ref: const
rows: 4038
Extra: Using where
1 row in set (0.00 sec)
Query time improves significantly with this index
Fewer rows are scanned
Adding (rev_timestamp)
MariaDB [dewiktionary]> ALTER TABLE revision ADD INDEX rev_timestamp
(rev_timestamp);
Query OK, 0 rows affected (17.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Adding (rev_page, rev_timestamp)
MariaDB [dewiktionary]> ALTER TABLE revision ADD INDEX
rev_page_rev_timestamp(rev_page, rev_timestamp);
Query OK, 0 rows affected (14.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Is (rev_timestamp, rev_page) a better option?
MariaDB [dewiktionary]> ALTER TABLE revision ADD INDEX rev_timestamp_rev_page
(rev_timestamp, rev_page);
Query OK, 0 rows affected (16.80 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision WHERE rev_page = 31579 and
rev_timestamp < '2008'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: range
possible_keys: rev_page,rev_timestamp,rev_page_rev_timestamp,rev_timestamp_rev_page
key: rev_page_rev_timestamp
key_len: 18
ref: NULL
rows: 530
Extra: Using index condition
1 row in set (0.00 sec)
Previous index is still preferred, why?
Forcing (rev_timestamp, rev_page)
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision FORCE
INDEX(rev_timestamp_rev_page) WHERE rev_page = 31579 and
rev_timestamp < '2008'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: range
possible_keys: rev_timestamp_rev_page
key: rev_timestamp_rev_page
key_len: 18
ref: NULL
rows: 688325
Extra: Using index condition
1 row in set (0.00 sec)
Only the 1st column* is being used effectively for filtering
[Figure: index on (rev_page, rev_timestamp) pointing to the TABLE; entries in index order: (23305, '2006'), (23443, '2016'), (30024, '2003'), (31579, '2003'), (31579, '2004'), (31579, '2005'), (31579, '2008'), (31579, '2017'), (428105, '2015')]
[Figure: index on (rev_timestamp, rev_page) pointing to the TABLE; entries in index order: ('2009', 201), ('2011', 13605), ('2012', 791), ('2014', 999), ('2015', 31579), ('2016', 24), ('2018', 1784), ('2018', 31579), ('2018', 79405)]
Adding an index on (rev_comment, rev_timestamp)
MariaDB [dewiktionary]> ALTER TABLE revision ADD INDEX
rev_comment_rev_timestamp (rev_comment, rev_timestamp);
Query OK, 0 rows affected (16.19 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [dewiktionary]> EXPLAIN SELECT * FROM revision WHERE rev_comment=''
ORDER BY rev_timestamp ASC\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: revision
type: ref
possible_keys: rev_comment_rev_timestamp
key: rev_comment_rev_timestamp
key_len: 769
ref: const
rows: 266462
Extra: Using index condition; Using where
1 row in set (0.00 sec)
Both 'type: ALL' and 'filesort' have disappeared
ICP optimizations
●
Differences in execution time are more significant
when the extra column condition is very selective
(getting 5x the original performance)
●
ICP is ignored when using covering Index,
potentially making the performance worse
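One way to observe the effect of ICP is to toggle it for the current session and compare the plans. This is a sketch that reuses the revision table and the rev_timestamp_rev_page index from the earlier slides; the expectations in the comments are what one would normally see, not output captured from the author's server.

```sql
-- With ICP disabled, the server filters rows after fetching them:
-- Extra would be expected to show "Using where".
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
EXPLAIN SELECT * FROM revision FORCE INDEX (rev_timestamp_rev_page)
WHERE rev_page = 31579 AND rev_timestamp < '2008'\G

-- With ICP enabled (the default), the rev_page condition can be checked
-- inside the storage engine: Extra would be expected to show
-- "Using index condition".
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
EXPLAIN SELECT * FROM revision FORCE INDEX (rev_timestamp_rev_page)
WHERE rev_page = 31579 AND rev_timestamp < '2008'\G
```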
Index on (rev_page)
MariaDB [dewiktionary]> SHOW STATUS like 'Hand%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
…
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 4038 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
…
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
27 rows in set (0.01 sec)
Using the index, request the first row with rev_page=31579
Then, scan them one by one in index order
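The handler counters on these slides can be reproduced with a sequence like the one below. It is a sketch assuming the rev_page index created earlier and a session allowed to run FLUSH STATUS.

```sql
-- Reset the session counters so only the next statement is measured.
FLUSH STATUS;

-- Run the query under test, forcing the index being evaluated.
SELECT * FROM revision FORCE INDEX (rev_page)
WHERE rev_page = 31579 AND rev_timestamp < '2008';

-- Read the storage engine API call counters for this session.
SHOW SESSION STATUS LIKE 'Hand%';
```

Repeating the three steps with a different FORCE INDEX hint gives directly comparable numbers for each candidate index.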
Index on (rev_timestamp)
MariaDB [dewiktionary]> SHOW STATUS like 'Hand%';
+----------------------------+---------+
| Variable_name | Value |
+----------------------------+---------+
| Handler_commit | 1 |
| Handler_delete | 0 |
…
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 199155 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
…
+----------------------------+---------+
27 rows in set (0.00 sec)
ICP will be explained later, let's ignore it for now
Using the index, request the first row where rev_timestamp<2008
Then, scan them one by one in index order (more are matched)
Index on (rev_page, rev_timestamp)
MariaDB [dewiktionary]> SHOW STATUS like 'Hand%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_delete | 0 |
…
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 530 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
…
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+-------+
25 rows in set (0.00 sec)
With both conditions covered, we can find the actual first row that matches the condition using the index
Rows scanned == Rows returned
Index on (rev_timestamp, rev_page), no ICP
MariaDB [dewiktionary]> SHOW STATUS like 'Hand%';
+----------------------------+---------+
| Variable_name | Value |
+----------------------------+---------+
| Handler_commit | 1 |
| Handler_delete | 0 |
…
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 452539 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
…
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+---------+
27 rows in set (0.00 sec)
Assuming no ICP, exact same results as with (rev_timestamp). The extra column does not help.
Also, EXPLAIN's row count was very off.
Index on (rev_timestamp, rev_page), with ICP
MariaDB [dewiktionary]> SHOW STATUS like 'Hand%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| Handler_commit | 1 |
| Handler_delete | 0 |
…
| Handler_icp_attempts | 452539 |
| Handler_icp_match | 530 |
…
| Handler_read_first | 0 |
| Handler_read_key | 1 |
| Handler_read_last | 0 |
| Handler_read_next | 530 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 0 |
| Handler_rollback | 0 |
…
| Handler_update | 0 |
| Handler_write | 0 |
+----------------------------+--------+
27 rows in set (0.00 sec)
ICP reduces the number of 'ENGINE API calls' significantly, although making it work more internally
Redundant Indexes
●
Creating all 4 previous indexes in production is
not a great idea
– The "left-most index prefix" rule means that, for example,
(rev_page, rev_timestamp) can do everything
(rev_page) can do
– If two indexes have equal selectivity, MySQL
chooses the shortest one
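A sketch of how the redundancy could be spotted and removed by hand, assuming the four indexes created on the revision table in the previous slides. Dropping an index in production should of course be checked against the real workload first.

```sql
-- List every index of the table with its columns in index order.
-- An index whose column list is a left-most prefix of another one
-- is a candidate for removal.
SELECT INDEX_NAME,
       GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX SEPARATOR ', ')
         AS columns_in_order
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = 'dewiktionary'
  AND TABLE_NAME   = 'revision'
GROUP BY INDEX_NAME;

-- (rev_page) is a left-most prefix of (rev_page, rev_timestamp),
-- so the shorter index can be dropped.
ALTER TABLE revision DROP INDEX rev_page;
```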
Duplicate Indexes
●
It is very easy to create indexes with the same
exact definition (same columns and ordering)
– Set a convention for index naming (e.g.
tablename_column1_column2_idx); MySQL does
not allow 2 indexes with the same identifier
– Since MySQL 5.6, a warning is thrown if a
duplicate index is created (still a warning on 8.0):
Duplicate index 'page_random2' defined on the table
'dewiktionary.page'. This is deprecated and will be disallowed
in a future release.
pt-duplicate-key-checker
Simple tool to check redundant and duplicate indexes
$ pt-duplicate-key-checker h=localhost,u=root,D=dewiktionary
[…]
# rev_timestamp is a left-prefix of rev_timestamp_rev_page
# Key definitions:
# KEY `rev_timestamp` (`rev_timestamp`),
# KEY `rev_timestamp_rev_page` (`rev_timestamp`,`rev_page`)
# Column types:
# `rev_timestamp` binary(14) not null default '\0\0\0\0\0\0\0\0\0\0\0\0\0\0'
# `rev_page` int(10) unsigned not null
# To remove this duplicate index, execute:
ALTER TABLE `dewiktionary`.`revision` DROP INDEX `rev_timestamp`;
INDEX_MERGE Issues
●
Sometimes it is faster to execute the
statement using UNION:
– This is especially true with UNION ALL
since MySQL 5.7, if you do not care about or
do not expect duplicates
●
There are also intersection merges, but
multi-column indexes are preferred
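The rewrite can be sketched as follows. The query itself is illustrative: it assumes separate single-column indexes on rev_page and on rev_user, and the rev_user index is a hypothetical addition not created in these slides.

```sql
-- Original form: an OR over two indexed columns, a typical candidate
-- for an index_merge (union) plan.
SELECT rev_id FROM revision
WHERE rev_page = 31579 OR rev_user = 1;

-- Rewritten with UNION: each branch uses its own index and
-- duplicates are removed.
SELECT rev_id FROM revision WHERE rev_page = 31579
UNION
SELECT rev_id FROM revision WHERE rev_user = 1;

-- UNION ALL variant: no deduplication step. The second branch
-- excludes rows already returned by the first one (rev_page is
-- declared NOT NULL, so <> is safe here).
SELECT rev_id FROM revision WHERE rev_page = 31579
UNION ALL
SELECT rev_id FROM revision WHERE rev_user = 1 AND rev_page <> 31579;
```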
MyISAM Internals
[Figure: Index (part of revision.MYI) on the left and Data (revision.MYD) on the right; secondary index values point to rows (node_id, version, k, v) in the data file, e.g. (605, 6, access, uncontrolled), (476, 1, editor, JOSM), (105, 7, amenity, pub)]
Consequences of using InnoDB (II)
●
Inserting in primary key order is much faster
– Less fragmentation/page-split
– Usage of "batch" mode, improving insert speed
●
Using auto-increment keys as primary keys can be
a good idea for InnoDB
Consequences of using InnoDB (III)
• A very long primary key may substantially increase
the size of secondary keys, because every secondary
index entry stores the primary key value
– Int or bigint types are recommended instead of UUIDs
or other long strings
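A sketch of how the effect could be measured. The table pk_uuid and its index a_idx are hypothetical; pk_int refers to the integer-keyed table created in the size comparison slide, and the same secondary index would have to exist on both tables for a fair comparison.

```sql
-- Same columns as pk_int, but with a 36-character string primary key.
CREATE TABLE pk_uuid (
  id CHAR(36) PRIMARY KEY,
  a INT,
  b INT,
  c INT,
  d INT,
  INDEX a_idx (a)      -- each entry also stores the 36-character id
);

-- After loading the same number of rows into both tables,
-- refresh the statistics and compare the sizes in bytes.
ANALYZE TABLE pk_int, pk_uuid;

SELECT TABLE_NAME, DATA_LENGTH, INDEX_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('pk_int', 'pk_uuid');
```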
Differences in size
mysql (osm) > CREATE TABLE
pk_int (id int PRIMARY KEY
auto_increment,
a int,
b int,
c int,
d int);
Query OK, 0 rows affected
(0.16 sec)
[Chart: .ibd size in bytes, with series ".ibd size (no secondary indexes)" and ".ibd size (with secondary indexes)"; one labelled value: 32,505,856 bytes]
JOIN Optimization
●
Two main goals:
– Perform an effective filtering on each table
access, if possible using indexes
– Perform the access in the most efficient table
order
●
When joining 3 or more tables in a star schema,
the "covering index" strategy can have a huge
impact
Creating an index on way_tags.k
Multi-range read
• This optimization orders results obtained
from a secondary key in primary key/physical
order before accessing the rows
– It may help execution time of queries when disk-bound
– It requires tuning of the read_rnd_buffer_size (size of
the buffer used for ordering the results)
• BKA JOINs are based on the mrr optimization
Hash Joins
• Only work for equi-joins
[Figure: hash table for faster access to the rows (node_id, version, k, v)]
MySQL Configuration
• BKA requires changes of default optimizer
configuration:
mysql-8.0 (osm) > SET optimizer_switch= 'mrr=on';
mysql-8.0 (osm) > SET optimizer_switch= 'mrr_cost_based=off';
mysql-8.0 (osm) > SET optimizer_switch= 'batch_key_access=on';
– Additionally, configuring the join_buffer_size adequately
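The three switches can also be set in one statement, and EXPLAIN shows whether BKA is actually chosen. This is a sketch for MySQL 8.0: the 4 MB buffer size and the join itself are illustrative values, not recommendations from the slides.

```sql
-- Enable MRR and BKA for the current session only.
SET SESSION optimizer_switch =
  'mrr=on,mrr_cost_based=off,batch_key_access=on';

-- BKA batches keys in the join buffer; size chosen for illustration.
SET SESSION join_buffer_size = 4 * 1024 * 1024;

-- If BKA is used, the Extra column of the joined table would show
-- "Using join buffer (Batched Key Access)".
EXPLAIN
SELECT n.node_id, nt.v
FROM node_tags nt
JOIN nodes n ON n.node_id = nt.node_id
WHERE nt.k = 'amenity'\G
```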
MariaDB configuration
mariadb-10.3 (osm) > SET optimizer_switch = 'join_cache_incremental=on';
mariadb-10.3 (osm) > SET optimizer_switch = 'join_cache_hashed=on';
mariadb-10.3 (osm) > SET optimizer_switch = 'join_cache_bka=on';
- Enabled by default
Access types: unique_subquery/index_subquery
mysql-8.0 (osm) > EXPLAIN SELECT *
FROM node_tags WHERE v = 'Big Ben'
and node_id NOT IN (SELECT node_id
FROM nodes WHERE tile < 100000000)\G
************** 1. row **************
id: 1
select_type: PRIMARY
table: node_tags
type: ref
possible_keys: v_idx
key: v_idx
key_len: 767
ref: const
rows: 1
Extra: Using where; Using index
************** 2. row **************
id: 2
select_type: DEPENDENT SUBQUERY
table: nodes
type: index_subquery
possible_keys: PRIMARY,nodes_tile_idx
key: PRIMARY
key_len: 8
ref: func
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
Unique subquery is similar, but using a unique or primary key
Subqueries in MySQL
●
Older MySQL versions had very bad press
regarding subqueries
– It was common to recommend rewriting them
(when possible) into JOINs
●
Since MySQL 5.6, its query execution plans have
improved significantly
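As an example of the kind of rewrite that used to be recommended, the NOT IN subquery from the access-type slide can be expressed as an anti-join. The sketch assumes node_id is never NULL in either table; otherwise NOT IN and the LEFT JOIN form are not equivalent.

```sql
-- Subquery form.
SELECT nt.*
FROM node_tags nt
WHERE nt.v = 'Big Ben'
  AND nt.node_id NOT IN (SELECT node_id
                         FROM nodes
                         WHERE tile < 100000000);

-- Equivalent anti-join: keep only the rows for which no matching
-- row exists in nodes.
SELECT nt.*
FROM node_tags nt
LEFT JOIN nodes n
       ON n.node_id = nt.node_id
      AND n.tile < 100000000
WHERE nt.v = 'Big Ben'
  AND n.node_id IS NULL;
```

Running EXPLAIN on both forms shows whether the rewrite still makes a difference on a given server version.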
| 2 | DERIVED | nodes | ALL | NULL | NULL | NULL | NULL | 2853846 | Using where |
+----+-------------+------------+-------+---------------+-------------+---------+-------------------+---------+-------------+
3 rows in set (0.00 sec)
[Slide annotations, partially lost in extraction: "generated", "not executed", "index"]
Semijoin Optimization
●
Previously, certain IN subqueries could only be
executed with a poor strategy
– This forced rewriting certain queries into JOINs or scalar subqueries,
when possible
●
There are now several additional options (many
automatic):
– Convert to a JOIN
– Materialization (including index creation)
– FirstMatch
– LooseScan
– Duplicate Weedout
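Which strategy the optimizer picks can be seen in EXPLAIN. The sketch below uses the osm tables from these slides with an illustrative IN subquery; the hints in the comments describe what such plans typically look like rather than output from the author's server.

```sql
-- Show the current optimizer flags (semijoin, materialization,
-- firstmatch, loosescan, ...).
SELECT @@optimizer_switch\G

-- With semijoin optimizations enabled, Extra may show FirstMatch(...),
-- LooseScan or Start/End temporary (duplicate weedout), or the
-- subquery may appear with select_type MATERIALIZED.
EXPLAIN
SELECT * FROM nodes
WHERE node_id IN (SELECT node_id
                  FROM node_tags
                  WHERE k = 'amenity' AND v = 'cafe')\G

-- Compare with the plan obtained when semijoin is disabled.
SET SESSION optimizer_switch = 'semijoin=off';
EXPLAIN
SELECT * FROM nodes
WHERE node_id IN (SELECT node_id
                  FROM node_tags
                  WHERE k = 'amenity' AND v = 'cafe')\G
SET SESSION optimizer_switch = 'semijoin=on';
```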
Fulltext Index
mysql (osm) > ALTER TABLE way_tags add FULLTEXT index(v);
Query OK, 0 rows affected (53.53 sec)
Records: 0 Duplicates: 0 Warnings: 0
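Once the FULLTEXT index exists, it is used through MATCH ... AGAINST. A minimal sketch with illustrative search terms:

```sql
-- Natural language search, ranked by relevance.
SELECT way_id, v
FROM way_tags
WHERE MATCH(v) AGAINST('Starbucks' IN NATURAL LANGUAGE MODE)
LIMIT 10;

-- Boolean mode: must contain "coffee", must not contain "Starbucks".
SELECT way_id, v
FROM way_tags
WHERE MATCH(v) AGAINST('+coffee -Starbucks' IN BOOLEAN MODE)
LIMIT 10;
```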
Alternatives
●
Apache Lucene
– Solr
– Elasticsearch
●
Sphinx
– SphinxSE
mysql-8.0 (osm) > EXPLAIN SELECT ... FROM nodes n FORCE INDEX(latitude) ...;
+----+-------------+-------+------------+-------+----------+----------+------+---------------+--------+----------+------------------+
| id | select_type | table | partitions | type | possible | key | key_ | ref | rows | filtered | Extra |
| | | | | | _keys | | len | | | | |
+----+-------------+-------+------------+-------+----------+----------+------+---------------+--------+----------+------------------+
| 1 | SIMPLE | n | NULL | range | latitude | latitude | 8 | NULL | 949370 | 11.11 | Using where; |
| | | | | | | | | | | | Using index; |
| | | | | | | | | | | | Using filesort |
| 1 | SIMPLE | n_t1 | NULL | ref | PRIMARY | PRIMARY | 8 | osm.n.node_id | 1 | 1.41 | Using where |
| 1 | SIMPLE | n_t2 | NULL | ref | PRIMARY | PRIMARY | 8 | osm.n.node_id | 1 | 1.41 | Using where |
+----+-------------+-------+------------+-------+----------+----------+------+---------------+--------+----------+------------------+
3 rows in set, 1 warning (0.00 sec)
mysql-8.0 (osm) > SELECT ... FROM nodes n FORCE INDEX(latitude) ...;
0 rows in set (0.26 sec)
Performance improvement is not great
Still many rows are examined
Most of the gain comes from the covering index, not the filtering
New Query
mysql> SET @area := ST_Envelope(linestring(POINT(@lon - 500/71520.91, @lat - 500/111231.29), POINT(@lon +
500/71520.91, @lat + 500/111231.29)));
mysql> SELECT n.node_id,
x(n.coord) as longitude,
y(n.coord) as latitude,
st_distance(POINT(@lon, @lat), coord) as distance
FROM nodes n
JOIN node_tags n_t1
ON n.node_id = n_t1.node_id
JOIN node_tags n_t2
ON n.node_id = n_t2.node_id
WHERE
n_t1.k = 'amenity' and
n_t1.v = 'cafe' and
n_t2.k = 'name' and
n_t2.v like 'Starbucks%' and
st_within(coord, @area)
ORDER BY st_distance(POINT(@lon, @lat), coord) ASC
LIMIT 1;
We can use any shape we want thanks to 5.6 improvements
Also substitute functions like Envelope with ST_Envelope since 5.7
Better Performance
mysql (osm) > SELECT ...;
+-----------+------------+------------+-----------------------+
| node_id | longitude | latitude | distance |
+-----------+------------+------------+-----------------------+
| XXXXXXXXX | XXXXXXXXXX | XXXXXXXXXX | 0.0014631428672541478 |
+-----------+------------+------------+-----------------------+
Empty set (0.11 sec)
This field used to be almost useless (wait for it)
mysql (osm) > EXPLAIN SELECT ...;
+----+--------+-------+-------+-------+----------+---------+------+---------+------+----------+----------------+
| id | select | table | parti | type | possible | key | key | ref | rows | filtered | Extra |
| | _type | | tions | | _keys | | _len | | | | |
+----+--------+-------+-------+-------+----------+---------+------+---------+------+----------+----------------+
| 1 | SIMPLE | n | NULL | range | PRIMARY | coord | 34 | NULL | 2 | 100.00 | Using where; |
| | | | | | ,coord | | | | | | Using filesort |
| 1 | SIMPLE | n_t1 | NULL | ref | PRIMARY | PRIMARY | 8 | osm.n. | 3 | 1.41 | Using where |
| | | | | | | | | node_id | | | |
| 1 | SIMPLE | n_t2 | NULL | ref | PRIMARY | PRIMARY | 8 | osm.n. | 3 | 1.41 | Using where |
| | | | | | | | | node_id | | | |
+----+--------+-------+-------+-------+----------+---------+------+---------+------+----------+----------------+
3 rows in set (0.00 sec)
Better Filtering
mysql-8.0 (osm) > SHOW STATUS LIKE 'Hand%';
(three separate runs, shown side by side)
+----------------------------+---------------------+-----------------------+-------------------------+
| Variable_name              | Not using the index | Using the BTREE index | Using the SPATIAL index |
+----------------------------+---------------------+-----------------------+-------------------------+
| Handler_commit             |                   1 |                     1 |                       1 |
| Handler_delete             |                   0 |                     0 |                       0 |
| Handler_discover           |                   0 |                     0 |                       0 |
| Handler_external_lock      |                   6 |                     6 |                       6 |
| Handler_mrr_init           |                   0 |                     0 |                       0 |
| Handler_prepare            |                   0 |                     0 |                       0 |
| Handler_read_first         |                   1 |                     0 |                       0 |
| Handler_read_key           |                1914 |                   274 |                     522 |
| Handler_read_last          |                   0 |                     0 |                       0 |
| Handler_read_next          |                1954 |                246540 |                    5254 |
| Handler_read_prev          |                   0 |                     0 |                       0 |
| Handler_read_rnd           |                   1 |                     0 |                     259 |
| Handler_read_rnd_next      |              833426 |                     0 |                       0 |
| Handler_rollback           |                   0 |                     0 |                       0 |
| Handler_savepoint          |                   0 |                     0 |                       0 |
| Handler_savepoint_rollback |                   0 |                     0 |                       0 |
| Handler_update             |                   0 |                     0 |                       0 |
| Handler_write              |                   1 |                     0 |                       0 |
+----------------------------+---------------------+-----------------------+-------------------------+
18 rows in set (0.00 sec)
Geohash Functions
mysql (osm) > SELECT ST_GeoHash(@lon, @lat, 10);
+----------------------------+
| ST_GeoHash(@lon, @lat, 10) |
+----------------------------+
| u0yjh79gr9 |
+----------------------------+
1 row in set (0.00 sec)
• Useful to index coordinates with a BTREE
– It could be especially useful combined with indexed STORED
columns (emulating quadtrees)
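A sketch of the STORED column idea, assuming MySQL 5.7 or later (ST_GeoHash is not assumed to exist on MariaDB 10.3) and that nodes.coord holds valid longitude/latitude points. The column name geohash and index name geohash_idx are hypothetical.

```sql
-- Materialize the geohash of each point and index it with a BTREE.
ALTER TABLE nodes
  ADD COLUMN geohash CHAR(10)
    GENERATED ALWAYS AS (ST_GeoHash(coord, 10)) STORED,
  ADD INDEX geohash_idx (geohash);

-- Points sharing a geohash prefix fall in the same cell, so a
-- left-anchored LIKE becomes an index range scan.
SELECT node_id
FROM nodes
WHERE geohash LIKE 'u0yjh7%';
```

A prefix search only covers one cell, so points just across a cell boundary are missed unless the neighbouring cells are queried as well.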
GeoJSON Functions
mysql (osm) > SELECT nm.v, ST_AsGeoJson(n.coord)
FROM node_tags n_t
JOIN nodes n USING (node_id, version)
JOIN node_tags nm USING (node_id, version)
WHERE n_t.k='tourism' AND
n_t.v='attraction' AND
nm.k='name';
+-------------------------+------------------------------------------------------------+
| v | ST_AsGeoJson(n.coord) |
+-------------------------+------------------------------------------------------------+
| Schießbuckel | {"type": "Point", "coordinates": [8.4938556, 49.6351754]} |
| AKW Informationszentrum | {"type": "Point", "coordinates": [8.4175585, 49.7060227]} |
| Burg Gleiberg | {"type": "Point", "coordinates": [8.6344314, 50.6150329]} |
| Brüderkirche | {"type": "Point", "coordinates": [9.5042975, 51.3149351]} |
| Römer | {"type": "Point", "coordinates": [8.6816587, 50.1104684]} |
.
.
http://geojsonlint.com/
Older Issues
• Before MySQL 8.0, SRID could be set and retrieved, but all
operations were done in flat Euclidean (Cartesian) coordinates:
mysql-5.7.5 (osm) > SET @p1 := GeomFromText('POINT(8 50)', 4326);
Query OK, 0 rows affected (0.00 sec)
mysql-5.7.5 (osm) > SET @p2 := GeomFromText('POINT(7 50)', 4326);
Query OK, 0 rows affected (0.00 sec)
mysql-5.7.5 (osm) > SET @p3 := GeomFromText('POINT(8 51)', 4326);
Query OK, 0 rows affected (0.00 sec)
mysql-5.7.5 (osm) > SELECT srid(@p1);
+-----------+
| srid(@p1) |
+-----------+
| 4326 |
+-----------+
1 row in set (0.00 sec)
mysql-5.7.5 (osm) > SELECT st_distance(@p1, @p2);
+-----------------------+
| st_distance(@p1, @p2) |
+-----------------------+
| 1 |
+-----------------------+
1 row in set (0.00 sec)
mysql-5.7.5 (osm) > SELECT st_distance(@p1, @p3);
+-----------------------+
| st_distance(@p1, @p3) |
+-----------------------+
| 1 |
+-----------------------+
1 row in set (0.00 sec)
JSON functions
●
MySQL includes almost all functions to
manipulate JSON that you may think of:
– Validation test: JSON_TYPE
– Object creation: JSON_ARRAY, JSON_MERGE, ...
– Searching: JSON_EXTRACT
– Modifying: JSON_SET, JSON_INSERT, ...
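A few of these functions in action, as a sketch with made-up JSON literals. JSON_VALID is added here as the usual well-formedness check; JSON_TYPE reports the type of a valid document.

```sql
-- 1 when the string is well-formed JSON, 0 otherwise.
SELECT JSON_VALID('{"k": "amenity"}');

-- Type of the top-level value (OBJECT for this document).
SELECT JSON_TYPE('{"k": "amenity"}');

-- Build an array from scalar values.
SELECT JSON_ARRAY('amenity', 'cafe', 42);

-- Extract the value stored under the key "v".
SELECT JSON_EXTRACT('{"k": "amenity", "v": "cafe"}', '$.v');

-- JSON_SET adds or overwrites a key.
SELECT JSON_SET('{"k": "amenity"}', '$.v', 'cafe');

-- JSON_INSERT only adds keys; the existing "k" is left untouched.
SELECT JSON_INSERT('{"k": "amenity"}', '$.k', 'shop');
```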
Indexing JSON
●
JSON Columns cannot be indexed:
mysql [localhost] {msandbox} (test) > ALTER TABLE
json_test ADD INDEX(content);
ERROR 3152 (42000): JSON column 'content' cannot be
used in key specification.
●
However, they can be compared with regular
fields and use indexes thanks to virtual columns
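The virtual column approach can be sketched as follows for MySQL 5.7 or later. The JSON key "name", the column content_name and the index content_name_idx are hypothetical names chosen for the illustration.

```sql
-- Expose one JSON attribute as a virtual column and index it.
ALTER TABLE json_test
  ADD COLUMN content_name VARCHAR(100)
    GENERATED ALWAYS AS
      (JSON_UNQUOTE(JSON_EXTRACT(content, '$.name'))) VIRTUAL,
  ADD INDEX content_name_idx (content_name);

-- Filtering on the virtual column can now use the index.
EXPLAIN
SELECT * FROM json_test WHERE content_name = 'Big Ben'\G
```

The virtual column takes no row storage; only the index entries are materialized.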
Benchmarks
●
Do not trust first party
benchmarks
– In fact, do not trust 3rd
party benchmarks either
●
Only care about the
performance of your
application running on your hardware
Q&A
Not to Miss
●
Operations track:
TLS for MySQL at Large Scale:
How we do relational data on-
the-wire encryption at the
Wikimedia Foundation
●
Do you want to do query
optimization for a website with
20 Billion views per month?
https://wikimediafoundation.org/about/jobs/