Geir Høydalsvik, MySQL Engineering
Pre-FOSDEM Days, Brussels,
January 30, 2020
Simplifying MySQL
The State of the Source
Copyright © 2020 Oracle and/or its affiliates
Safe harbor statement
The following is intended to outline our general product direction. It is intended for information
purposes only, and may not be incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon in making purchasing
decisions.
The development, release, timing, and pricing of any features or functionality described for
Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
Prologue
2008 @ Sun Microsystems
• Emerging consensus about the overall MySQL direction:
• Modularize, interfaces
• Remove functionality from core, e.g. Query cache
• Add functionality as plugins
• Transactional storage engines, i.e. InnoDB
• UTF8 only
• Modernizing the tool chain
But two active projects…
• MySQL
• Slow, organic change
• Compatibility
• Respect the user base
• Drizzle
• Radical change
• Break compatibility
• Force user base to adapt
Drizzle database server
“Drizzle is a re-designed version of the MySQL v6.0 codebase and is designed around a central
concept of having a microkernel architecture. Features such as the
query cache and authentication system are now plugins to the database, which follow the
general theme of "pluggable storage engines" that were introduced in MySQL 5.1. It
supports PAM, LDAP, and HTTP AUTH for authentication via plugins it ships. Via its plugin
system it currently supports logging to files, syslog, and remote services such
as RabbitMQ and Gearman. Drizzle is an ACID-compliant relational database that supports
transactions via an MVCC design.”
[2008-2011, Brian Aker et al., from https://en.wikipedia.org/wiki/Drizzle_(database_server)]
Let's rewrite everything from scratch (Drizzle eulogy)
Earlier this year I performed my last act as Drizzle liaison to SPI by requesting that Drizzle be
removed from the list of active SPI member projects and that about 6000 USD of donated funds
be moved to the SPI general fund.
...
I usually have a strong preference for fixing the production code and rejecting the lure of
rewriting everything from scratch. Forcing developers to work on code that is actually running
in somebody's production, helps keep everyone honest.
[Henrik Ingo: https://openlife.cc/blogs/2019/december/lets-rewrite-everything-scratch-drizzle-eulogy]
The last 10 years of MySQL development
• Refactoring
• Modularizing, interfaces
• Remove functionality from core, e.g. Query cache
• Add functionality as plugins
• Transactional storage engines, i.e. InnoDB
• UTF8
• Modernizing the tool chain
• ...and much more...
Optimizer
Character Sets and Collations
• UTF8MB4 is the default in 8.0
• Unicode 9.0 collations
• 4 byte comparisons
• A major rewrite and optimization effort
• Moving towards UTF8 only
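As a small illustration of why a 4-byte character set matters, the Python sketch below (not MySQL code) shows that UTF-8 needs between one and four bytes per character. Code points outside the Basic Multilingual Plane, such as emoji, need four bytes and so cannot be stored in an encoding limited to three bytes per character. The helper name `utf8_len` and the sample characters are assumptions chosen for the example.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


# 'a', e-acute, euro sign, grinning-face emoji (written as escapes for clarity)
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} needs {utf8_len(ch)} byte(s) in UTF-8")

# Characters that would not fit in a 3-byte-per-character encoding
needs_four = [ch for ch in samples if utf8_len(ch) == 4]
```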
From SQL (input) to results (output)
Parse phase
• Parse step
• Contextualization step
• Abstract Syntax Tree
Prepare phase
• Resolve step
• Transform step
• Logical Plan
Optimize phase
• Range optimizer
• Join optimizer
• Physical plan
Execute phase
• Produce iterator tree
• Volcano iterator model
• Result set
MySQL refactoring: Separating phases
• Started ~10 years ago
• Considered finished now
• A clear separation between query processing phases
• Fixed a large number of bugs
• Improved stability
• Faster feature development
• Fewer surprises and complications during development
[Diagram: SQL → Parse → Abstract syntax tree → Prepare (Resolve, Transform) → Logical plan → Optimize → Physical plan → Execute]
MySQL refactoring: Parsing and preparing
• Still ongoing
• Implemented piece by piece
• Separating parsing and resolving phases
• Eliminate semantic actions that do too much
• Get a true bottom-up parser
• Makes it easier to extend with new SQL syntax
• Parsing doesn't have unintended side effects
• Consistent name and type resolving
• Names resolved top-down
• Types resolved bottom-up
• Transformations done in the prepare phase
• Bottom-up
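The resolving order above can be sketched on a toy expression tree. This is a minimal Python illustration, not MySQL's resolver: the classes `Column`, `Literal`, and `Add`, the `scope` mapping, and the two-type promotion rule are all assumptions made for the example. The scope (name information) is handed down from the enclosing context, while each node's type is derived from its children after they have been resolved.

```python
class Column:
    def __init__(self, name):
        self.name = name
        self.type = None

    def resolve(self, scope):
        # Names: looked up in the scope passed down from the parent (top-down).
        if self.name not in scope:
            raise LookupError(f"unknown column {self.name!r}")
        self.type = scope[self.name]
        return self.type


class Literal:
    def __init__(self, value):
        self.value = value
        self.type = None

    def resolve(self, scope):
        # Leaf node: the type comes from the literal itself.
        self.type = "DOUBLE" if isinstance(self.value, float) else "INT"
        return self.type


class Add:
    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.type = None

    def resolve(self, scope):
        # Hand the same scope down to both children first...
        left_type = self.left.resolve(scope)
        right_type = self.right.resolve(scope)
        # ...then derive this node's type from theirs (bottom-up).
        # Toy promotion rule: DOUBLE wins over INT.
        self.type = "DOUBLE" if "DOUBLE" in (left_type, right_type) else "INT"
        return self.type


# Hypothetical scope: column names and types visible in the query block
scope = {"price": "DOUBLE", "qty": "INT"}
expr = Add(Column("price"), Literal(2))
print(expr.resolve(scope))
```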
MySQL refactoring: Iterator executor
• Volcano iterator model
• Possible because phases were separated
• Ongoing for ~1.5 years
• Much more modular executor
• Common iterator interface for all operations
• Each operation is contained within an iterator
• Able to put together plans in new ways
• Immediate benefit: Removes temporary tables in some cases
• Join is just an iterator
• Nested loop join is just an iterator
• Hash join is just an iterator
• Your favorite join method is just an iterator
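The idea that every operation, including each join method, sits behind one common iterator interface can be sketched in a few lines of Python. This is an illustration of the Volcano-style model only, not MySQL's executor: the class names, the `rows()` method, and the sample tables are assumptions, and Python generators stand in for the open/next/close calls of the classic model.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]


class RowIterator:
    """Common interface: every operation produces rows one at a time."""

    def rows(self) -> Iterator[Row]:
        raise NotImplementedError


class TableScan(RowIterator):
    def __init__(self, table: List[Row]):
        self.table = table

    def rows(self):
        yield from self.table


class Filter(RowIterator):
    def __init__(self, child: RowIterator, predicate):
        self.child = child
        self.predicate = predicate

    def rows(self):
        return (r for r in self.child.rows() if self.predicate(r))


class NestedLoopJoin(RowIterator):
    def __init__(self, outer: RowIterator, inner: RowIterator, condition):
        self.outer, self.inner, self.condition = outer, inner, condition

    def rows(self):
        for o in self.outer.rows():
            for i in self.inner.rows():  # inner side is rescanned per outer row
                if self.condition(o, i):
                    yield {**o, **i}


class HashJoin(RowIterator):
    def __init__(self, build: RowIterator, probe: RowIterator, key: str):
        self.build, self.probe, self.key = build, probe, key

    def rows(self):
        table = {}
        for b in self.build.rows():  # build phase: hash one input on the key
            table.setdefault(b[self.key], []).append(b)
        for p in self.probe.rows():  # probe phase: look up matching rows
            for b in table.get(p[self.key], []):
                yield {**b, **p}


customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"order_id": 10, "cust_id": 1}, {"order_id": 11, "cust_id": 2},
          {"order_id": 12, "cust_id": 1}]

nested = NestedLoopJoin(TableScan(customers), TableScan(orders),
                        lambda c, o: c["cust_id"] == o["cust_id"])
hashed = HashJoin(TableScan(customers), TableScan(orders), "cust_id")
# Any iterator can sit on top of any other one.
ann_only = Filter(hashed, lambda r: r["name"] == "Ann")
```

Because both joins expose the same interface, the `Filter` above works unchanged whichever join is placed beneath it, which is the composability the slide points to.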
Old MySQL executor vs. iterator executor
Old executor
• Nested loop focused
• Hard to extend
• Code for one operation spread out
• Different interfaces for each operation
• Combination of operations hard coded
Iterator executor
• Modular
• Easy to extend
• Each iterator encapsulates one operation
• Same interface for all iterators
• All operations can be connected
MySQL 8.0 features based on the iterator executor
• EXPLAIN FORMAT=TREE
• Print the iterator tree
• EXPLAIN ANALYZE
• Insert instrumentation nodes in the tree
• Execute the query
• Print the iterator tree
• Hash join
• Just another iterator type
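The two EXPLAIN variants above can be mimicked on a toy iterator tree. The Python sketch below is an assumption-laden illustration, not MySQL code: `Node`, `Scan`, `Filter`, `Instrumented`, `instrument`, and `format_tree` are hypothetical names, and the printed layout only loosely resembles real EXPLAIN output. Printing the tree needs no execution, while the ANALYZE-style variant wraps every node in an instrumentation node, runs the plan, and then prints the same tree with the collected row counts.

```python
import time


class Node:
    """Minimal iterator node: a label, child nodes, and a row generator."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def rows(self):
        raise NotImplementedError


class Scan(Node):
    def __init__(self, name, data):
        super().__init__(f"Table scan on {name}")
        self.data = data

    def rows(self):
        yield from self.data


class Filter(Node):
    def __init__(self, child, predicate, text):
        super().__init__(f"Filter: {text}", [child])
        self.predicate = predicate

    def rows(self):
        return (r for r in self.children[0].rows() if self.predicate(r))


class Instrumented(Node):
    """Wraps a node and records rows produced and time spent producing them."""

    def __init__(self, inner):
        super().__init__(inner.label, inner.children)
        self.inner = inner
        self.row_count = 0
        self.seconds = 0.0

    def rows(self):
        start = time.perf_counter()
        for row in self.inner.rows():
            self.seconds += time.perf_counter() - start
            self.row_count += 1
            yield row
            start = time.perf_counter()
        self.seconds += time.perf_counter() - start


def instrument(node):
    """Insert an instrumentation node above every node in the tree."""
    node.children = [instrument(child) for child in node.children]
    return Instrumented(node)


def format_tree(node, depth=0):
    stats = f" (rows={node.row_count})" if isinstance(node, Instrumented) else ""
    lines = ["    " * depth + "-> " + node.label + stats]
    for child in node.children:
        lines.extend(format_tree(child, depth + 1))
    return lines


plan = Filter(Scan("t", [{"a": 1}, {"a": 5}, {"a": 9}]), lambda r: r["a"] > 3, "a > 3")
tree_only = format_tree(plan)      # like FORMAT=TREE: print, do not execute

root = instrument(plan)            # like ANALYZE: instrument, execute, print
result = list(root.rows())
analyzed = format_tree(root)
print("\n".join(tree_only + analyzed))
```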
InnoDB
Scaling: Fixing the InnoDB RW-lock
"I enjoyed reading the source code -- it is well written with useful
comments. [...] the code has aged well and been significantly improved
by the InnoDB team."
[Mark Callaghan, http://smalldatum.blogspot.com/2019/12/fixing-innodb-rw-lock.html]
InnoDB Refactoring
• The IO layer has been rewritten
• The B-Tree has been rewritten
• The redo log has been redesigned and rewritten
• The lock manager is undergoing a redesign and rewrite
• BLOB handling has been rewritten
Other Storage Engines
• Added TempTable SE
• Replaced Memory SE
• Added Performance Schema SE
• MySQL no longer depends upon MyISAM (becomes optional)
• MySQL becomes transactional all the way
• MySQL will assume transactional behavior, and thus transactional
storage engines, in the future
Data Dictionary
MySQL Data Dictionary before MySQL 8.0
[Diagram: the SQL layer reads the data dictionary from three places: files (FRM, TRG, OPT) in the file system, system tables (mysql.: user, proc, events) in MyISAM, and InnoDB system tables in InnoDB]
Transactional Data Dictionary in MySQL 8.0
[Diagram: the SQL layer reads the data dictionary from DD tables stored in InnoDB]
Atomic DDL in MySQL 8.0
• Ensure we have a consistent state after a DDL operation
• Prevent slave drift
• Prevent internal inconsistencies
• Possible due to transactional storage of metadata
• Enables implementation of crash-safe DDL
• Enables addressing longstanding issues
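The all-or-nothing behavior described above can be pictured with a toy in-memory dictionary. This Python sketch is purely illustrative and makes no claim about how MySQL implements atomic DDL: `ToyDictionary`, its two methods, and the table names are assumptions. It contrasts a multi-table drop that applies each step immediately with one that stages its changes and publishes them only if every step succeeds.

```python
class DictionaryError(Exception):
    pass


class ToyDictionary:
    """Hypothetical in-memory stand-in for a data dictionary."""

    def __init__(self, tables):
        self.tables = set(tables)

    def drop_tables_non_atomic(self, names):
        # Each table is removed as soon as it is processed, so a failure
        # halfway through leaves the dictionary partially changed.
        for name in names:
            if name not in self.tables:
                raise DictionaryError(f"unknown table {name!r}")
            self.tables.remove(name)

    def drop_tables_atomic(self, names):
        # Changes go to a private copy that is published only when
        # every step has succeeded; on failure nothing is changed.
        staged = set(self.tables)
        for name in names:
            if name not in staged:
                raise DictionaryError(f"unknown table {name!r}")
            staged.remove(name)
        self.tables = staged


old_style = ToyDictionary({"t1", "t3"})
new_style = ToyDictionary({"t1", "t3"})

for dictionary, drop in ((old_style, old_style.drop_tables_non_atomic),
                         (new_style, new_style.drop_tables_atomic)):
    try:
        drop(["t1", "t2"])  # t2 does not exist, so the statement fails
    except DictionaryError as exc:
        print(f"failed: {exc}; tables left: {sorted(dictionary.tables)}")
```

In the non-atomic variant `t1` is gone even though the statement failed, which is the kind of inconsistent state the slide describes.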
NEW INFORMATION_SCHEMA in MySQL 8.0
5.x: MySQL Client sends an I_S query to the MySQL Server and receives the results
• Create temporary table.
• Heuristic optimization.
• Read metadata from file system or from MyISAM/InnoDB engine.
• Return rows to user.
8.0: MySQL Client sends an I_S query to the MySQL Server and receives the results
• Optimizer prepares execution plan.
• Executor reads metadata from data dictionary tables.
• Return rows to user.
Services and Components
Service Infrastructure
We have decided to abandon the old plugin infrastructure and replace it
with the new service infrastructure.
A service is a named interface. A component can implement one or
many services and will announce its services in the registry.
Each component can use all services available in the registry, including
the ones provided by other components.
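The relationship between services, components, and the registry can be sketched as follows. This is a minimal Python illustration, not the MySQL component API: `ServiceRegistry`, `PasswordPolicyComponent`, `AccountComponent`, the service name `password_policy`, and the length-based policy are all hypothetical. One component announces a service under a name, and another component looks that service up by name without knowing which component provides it.

```python
class ServiceRegistry:
    """Maps service names to the objects that implement them."""

    def __init__(self):
        self._services = {}

    def register(self, name, implementation):
        if name in self._services:
            raise ValueError(f"service {name!r} is already registered")
        self._services[name] = implementation

    def unregister(self, name):
        self._services.pop(name, None)

    def acquire(self, name):
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no service named {name!r}") from None


class PasswordPolicyComponent:
    """Hypothetical component that implements one service."""

    def load(self, registry):
        registry.register("password_policy", self)

    def is_strong(self, password):
        return len(password) >= 8  # toy policy, for illustration only


class AccountComponent:
    """Hypothetical component that uses a service provided by another one."""

    def load(self, registry):
        self.registry = registry

    def create_user(self, user, password):
        policy = self.registry.acquire("password_policy")
        if not policy.is_strong(password):
            raise ValueError("password too weak")
        return f"created {user}"


registry = ServiceRegistry()
PasswordPolicyComponent().load(registry)
accounts = AccountComponent()
accounts.load(registry)
print(accounts.create_user("ann", "correct horse"))
```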
Minimal Chassis
• Linked library
• Independent of the Server
• Service Registry
• Load and Unload Components
Services provided by the Server
• Existing:
• UDF registration services
• System variable services
• Performance Schema services
• Future:
• Keyring service
• Authentication service
• SQL service (session, transaction, statement)
Components
• Existing:
• Validate password component
• Error logging components
• Soon:
• Keyring
• Future:
• Make all plugins components
• E.g. X Plugin and Replication components
Replication
Replication Refactoring
• Binary log decoder as a library (binlogevents)
• Binary log commit pipeline (group commit)
• New replication metadata infrastructure (store data in
InnoDB tables)
• Binary log storage layer
• Replication dump thread refactored and optimized for
performance
• Reduced replication receiver/applier contention to
improve performance
Replication Refactoring
• Micro-kernel, Interfaces:
• Interfaces for server, transaction and statement lifecycles
• Message Delivery Service, Membership Service
• libmysqlgcs: standalone library for group communication based on
Paxos
• Binary log storage interfaces
Replication Refactoring
• Remove functionality from core:
• Removed old binary log events support
• Removed old binary log conversion procedures
• Add functionality as plugins and modules/services:
• Group Replication
• GCS based services: Membership and Message Delivery
Multi-server Environments
MySQL Shell & Router
• Move functionality from the server to other components, such
as moving the Query Cache out of the server and into
the Router
• MySQL Shell for developer ease of use, flexibility, and
power
Tool Chain
MySQL Modernized the tool chain
• C++14, use of standard constructs, e.g. std::atomic
• Cleaning up header file dependencies, e.g. my_global.h is
gone
• Warning-free with GCC 8 and Clang 6
• ASan and UBSan clean
• Google C++ Style Guide
• MySQL Source Code Documentation
https://mysqlserverteam.com/mysql-8-0-source-code-improvements/
Summary
Summary
• MySQL and Drizzle had similar goals, but chose different
approaches
• MySQL has changed in a way and at a pace that is
acceptable to the user base
• Each and every step is regression tested and validated in
production environments
• (Can it be challenging? Yes. Is there an alternative? No)
MySQL on Social Media
https://www.facebook.com/mysql
https://twitter.com/mysql
https://www.linkedin.com/company/mysql
MySQL Community on Slack
https://lefred.be/mysql-community-on-slack/
Simplifying MySQL, Pre-FOSDEM MySQL Days, Brussels, January 30, 2020.