Practical Distributed Processing Using MySQL Built-In Functionality Presentation
Who Am I
Database Analyst, radian6
Several Years Experience:
Scope
MySQL concepts used in this system
How I solved our particular problem
Practical set-up of our distributed system
Scope NOT
Complete course on distributed processing
Complete coverage of related MySQL functionality
The Problem
Influencer calculation: complex and getting more complex
More and more data
More and more customers and profiles
Not scalable to run the stored procedure on just the main database
Design Goals
Move calculation off of main database
Keep existing stored procedure
No single controller
Horizontally scalable!
Influencer calculation
Who's most influential? (For this topic.)
sphinx index
Replication
Master Database: binlogs (INSERT... UPDATE... INSERT... DELETE...)
Replica Database: relay logs (INSERT... UPDATE... INSERT... DELETE...)
Replication
Master Database
server_id = 1
log_bin = /store/log/mysql-bin
grant replication slave on *.* to 'repl'@'%' identified by 'secret';
/store/log/mysql-bin.000001
binlogs
INSERT... UPDATE... INSERT... DELETE...
Replication
server_id = 2
relay_log = /store/relaylog/relaylog
change master to
  master_host="masterserver",
  master_port=3306,
  master_user="repl",
  master_password="secret",
  master_log_file="mysql-bin.000001",
  master_log_pos=0;
start slave;
Replica Database
relay logs
INSERT... UPDATE... INSERT... DELETE...
/store/relaylog/relaylog.000001
> create table t (a int) engine=blackhole;
Query OK, 0 rows affected (0.02 sec)
> insert into t values (1);
Query OK, 1 row affected (0.00 sec)
> select * from t;
Empty set (0.00 sec)
Before-Insert trigger
create trigger t_bi before insert on t for each row insert into real_table values (new.a);
real table
Federated Tables
table `t`
Public-Key Authentication
worker
main db
# vi /home/mysql/.ssh/authorized_keys
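As an alternative to editing the file by hand, the key can be appended with a small helper. This is only a sketch: the function name `install_public_key` and both path arguments are hypothetical, and it assumes the goal is to let one server's `mysql` user log in to the other without a password.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file
# unless that exact line is already present, and set the restrictive
# permissions that sshd normally requires.
install_public_key() {
    pubkey_file=$1    # e.g. the id_*.pub file copied from the other server
    auth_file=$2      # e.g. /home/mysql/.ssh/authorized_keys
    auth_dir=$(dirname "$auth_file")
    key=$(cat "$pubkey_file") || return 1

    mkdir -p "$auth_dir" && chmod 700 "$auth_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -F fixed string, -x whole-line match, -q quiet
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}
```

Running it twice with the same key leaves a single entry, so it is safe to call from a set-up script.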
worker db
worker 1
worker db worker 1
               Recalc-Reqd  Recalc-Date  Priority  Running
               Full         10:00am      high      101
               No           8:00am                 0
103  Grapes    No           7:30am                 0
104  Bananas   Partial      10:10am      high      302
105  Pears     Full         10:05am      low       0
Processing Cycle
Get work... Get next Topic to do
no
sleep
Processing Cycle
Is data current? Let replication catch up!
replication stopped?
yes
sleep 10 min.
replication behind?
yes
sleep 1 min.
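The replication check above can be sketched as a function that reads the text of `show slave status\G` and prints how long to sleep. The field names are the ones MySQL reports; the function name and the `MAX_LAG_SECONDS` threshold are assumptions, since the slides do not say how far behind counts as "behind".

```shell
#!/bin/sh
# Assumed threshold (seconds) above which replication counts as "behind".
MAX_LAG_SECONDS=${MAX_LAG_SECONDS:-30}

# Reads SHOW SLAVE STATUS\G output on stdin and prints the number of
# seconds to sleep before checking again:
#   600  replication stopped  (sleep 10 min.)
#   60   replication behind   (sleep 1 min.)
#   0    data is current, carry on
replication_sleep_seconds() {
    awk -v max_lag="$MAX_LAG_SECONDS" '
        $1 == "Slave_IO_Running:"      { io  = $2 }
        $1 == "Slave_SQL_Running:"     { sql = $2 }
        $1 == "Seconds_Behind_Master:" { lag = $2 }
        END {
            if (io != "Yes" || sql != "Yes" || lag == "NULL") print 600
            else if (lag + 0 > max_lag)                        print 60
            else                                               print 0
        }'
}
```

A worker might call it as `$MYSQL -e 'show slave status\G' | replication_sleep_seconds` and pass the result to `sleep`.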
Processing Cycle
Replace stale Topic Profile someone else likely has it...
yes
Processing Cycle
Playing Nice with Others
mark Topic with my ID
no
sleep 1 min.
begin again
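One possible way to mark a Topic without two workers grabbing it at once is a conditional update followed by a row-count check. This is a sketch of that idea, not necessarily what the worker script does: the table name `topic_profile` and column names `topic_id` and `running` are hypothetical, with `running = 0` meaning unclaimed as in the Running column shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that claims a Topic only if nobody else
# has it. ROW_COUNT() then reports 1 if this worker got it, 0 if not.
claim_topic_sql() {
    topic_id=$1
    worker_id=$2
    # Accept only plain integers, since they are interpolated into SQL.
    for arg in "$topic_id" "$worker_id"; do
        case "$arg" in
            ''|*[!0-9]*)
                echo "claim_topic_sql: numeric arguments required" >&2
                return 1 ;;
        esac
    done
    printf 'UPDATE topic_profile SET running = %s WHERE topic_id = %s AND running = 0; SELECT ROW_COUNT();\n' \
        "$worker_id" "$topic_id"
}

claim_topic_sql 104 302
```

Both statements need to run in the same connection, for example in a single `$MYSQL -e "..."` call. A result of 0 corresponds to the "no" branch: sleep and begin again.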
Processing Cycle
run the stored proc
success? yes
no
sleep 1 min.
begin again
Processing Cycle
flush the data table for this Profile (locally)
flush the data table for this Profile (on main db)
Processing Cycle
done
Code: who am I?
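# Assumes $MYSQL runs the mysql client with flags such as -N -B,
# so that only the bare value is printed (no column header).
NODEID_QUERY="select @@server_id-1000;"
INSTANCE=$2   # instance number, passed as the script's second argument
NODEID=$($MYSQL -e"$NODEID_QUERY")
UNIQUE_INSTANCEID=$((NODEID*100+INSTANCE))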
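A worked example of the arithmetic above, with the MySQL call replaced by a plain argument so it runs anywhere. The function name is hypothetical, and the sample `server_id` of 1003 is an assumption based on the `-1000` offset.

```shell
#!/bin/sh
# Hypothetical helper mirroring the ID calculation:
# node number = server_id - 1000, unique ID = node * 100 + instance.
unique_instance_id() {
    server_id=$1    # value of @@server_id on the worker database
    instance=$2     # which worker process on that node (1, 2, ...)
    node_id=$((server_id - 1000))
    echo $((node_id * 100 + instance))
}

unique_instance_id 1003 2   # prints 302
```

With this scheme, instance 2 on node 3 gets ID 302, and each node has room for up to 99 instances before IDs collide.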
done
Launch Script
Shell script to launch workers
On
Off
sends a signal to each worker; each worker stops after its current stored proc call
Kill
abort and clean up database
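The "Off" behaviour (signal each worker, let the current stored proc call finish) can be sketched with a trap that only sets a flag. The choice of `USR1`, the function names, and the `max_cycles` bound are assumptions made for illustration.

```shell
#!/bin/sh
# The signal handler only records the request; the loop below checks the
# flag between cycles, so a cycle in progress is never interrupted.
STOP_REQUESTED=0
CYCLES_DONE=0
trap 'STOP_REQUESTED=1' USR1

# Placeholder for one full processing cycle
# (get work, run the stored proc, flush the data tables).
process_one_topic() {
    :
}

run_worker() {
    max_cycles=$1    # upper bound so this sketch always terminates
    CYCLES_DONE=0
    while [ "$STOP_REQUESTED" -eq 0 ] && [ "$CYCLES_DONE" -lt "$max_cycles" ]; do
        process_one_topic
        CYCLES_DONE=$((CYCLES_DONE + 1))
    done
}
```

The launch script's "Off" action would then amount to `kill -s USR1` on each worker's process ID, while "Kill" would need a harder signal plus database clean-up.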
Monitor Scripts
Replication
Disk space
Errors from the worker script
Design Decisions
SSD Storage
Development Directions
Migrate stored proc to Java
More workers
Problems in Code not atomic:
Take Away...
Replication and Federated tables aren't hard.
Thousands of MyISAM tables isn't crazy.
Distributed
Thank you!
Bob Burgess bob.burgess@radian6.com