
Lessons Learned Building A Web 2.0 Application Using Mysql

This document was automatically uploaded to Scribd as part of the email thread "MySQL Memcached Grazr PDF".

Uploaded by

Oleksiy Kovyrin
Copyright
© Attribution Non-Commercial (BY-NC)

Lessons learned building a Web 2.0 Application
using MySQL
Who are we?
• We’re missing a vowel (must be web 2.0!)
• 7 full-time employees
• Several ex-Andover.net (Slashdot) guys
including our CEO
• Boston based but distributed dev
• Been in existence about 2 years
What do we do?
• We’re all about
• Create and share reading lists
• Merge and filter feeds
• Publish widgets
• Future: Do more advanced stuff with
feeds
Why “Grazr”
“I’m actually coming to the conclusion that the
whole subscriptions mindset is a problem and that
in future we’ll ‘graze’ for the most part instead of
subscribing. As Zigbee sensors, RFID chips and
GPS trackers proliferate we’ll be drowning in an
RSS-everywhere world if we don’t change our
approach.”
James Corbett “Eirepreneur”
http://eirepreneur.blogs.com
January 2006
http://grazr.com
http://vibemetrix.com
The heart of our
system:
Lots of Mistakes
• Warning: Some of these lessons will
seem obvious
• Hindsight is 20/20
• We *did* have reasons for many of
these at the time (good ones?)
• Tried a lot of experimental
configurations
Lesson 1:
Beware arch. momentum
• Early system decisions affected later
architecture, even when abandoned
• Careful about exotic w/out good reason
• db read on our feed processing web
boxes
• The traditional setup, splitting MySQL
from Apache = system much happier
Lesson 2:
Scaling
• “Don’t worry about scaling” - 37 signals
• “You must build for scaling or die!” -
Friendster
• The truth? Somewhere in the middle
• Understand growth and scaling patterns
but don’t build up front
• Your scaling plan: wrong in some way
Overemphasis on scaling
• Two hosted data centers + one
test/backup data center
• Geographically separated
• Over provisioned
• 18 servers, architecture mirrored
• Traffic could have been served from
two or three machines
“Skynet Jr.”
Lesson 3:
Limits of Testing
• Startup reality: not enough time for
thorough testing
• Replication testing and simulation
• Speed good enough, even cross
country
• Problem: real world system behaved
differently
Lesson 4:
Replication is fast, until it isn’t
• We knew better: but empirical testing
seemed OK
• Asynchronous nature of repl.
sometimes hard to code, careful with
state
• Some of our code treated it as
synchronous (Fail!)
• Smarter code was slow (retries, polls)
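A minimal sketch of the "smarter code" pattern from this lesson: poll a replica until the write shows up, then give up after a few tries. Everything here is an assumption for illustration: the `LaggyReplica` class, the `version` field, and the retry counts are hypothetical and not taken from Grazr's code.

```python
import time


class LaggyReplica:
    """Hypothetical stand-in replica: a write becomes visible only after `lag` reads."""

    def __init__(self, lag):
        self.lag = lag
        self.visible = {}
        self.pending = {}

    def apply_later(self, key, row):
        # Simulates a write that has reached the master but not this replica yet.
        self.pending[key] = [self.lag, row]

    def read(self, key):
        if key in self.pending:
            self.pending[key][0] -= 1
            if self.pending[key][0] <= 0:
                self.visible[key] = self.pending.pop(key)[1]
        return self.visible.get(key)


def read_with_retry(read, key, expected_version, attempts=5, delay=0.05):
    """Poll until the replica has caught up to expected_version, or return None."""
    for _ in range(attempts):
        row = read(key)
        if row is not None and row["version"] >= expected_version:
            return row
        time.sleep(delay)
    return None  # caller can fall back to the master
```

A single read right after the write is the "treated it as synchronous" failure; the retry loop is correct but costs up to `attempts * delay` of added latency, which matches the slide's note that the smarter code was slow.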
Lesson 5:
Memcached is your friend
• Excellent tool in the scaling toolbox
• Classic cache: limiting touching your
database is good!
• Added benefit: on top of repl.
synchronizer
• Good temporary storage for async.
proc.
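One way to read "on top of repl. synchronizer": write to the master and to memcached at the same time, so a read that lands before replication catches up still sees fresh data. The sketch below uses plain dicts as stand-ins for the master, the replica, and memcached; the function names are hypothetical.

```python
class DictCache:
    """Stand-in for a memcached client with get/set (assumption: no expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def save_item(master, cache, key, value):
    # Write to the master database and to the cache together.
    master[key] = value
    cache.set(key, value)


def load_item(replica, cache, key):
    # Prefer the cache: it holds the latest write even if the replica lags.
    value = cache.get(key)
    if value is not None:
        return value
    return replica.get(key)
```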
Lesson 6:
Sphinx is your other friend
• FULLTEXT was too slow for amount of
data
• Sphinx works together with MySQL for
complete search solution
• Use Sphinx to obtain your primary keys
of what you are searching for!
• Use Sphinx for ordering
• Made it possible to switch to InnoDB!
• Joins with Sphinx storage engine
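The "use Sphinx to obtain your primary keys" and "use Sphinx for ordering" bullets can be sketched as a two-step lookup: the search engine returns ranked ids, and the database only does a primary-key fetch. This is a sketch under stated assumptions: `fake_sphinx_search` is a hypothetical stand-in for a real Sphinx query, sqlite3 stands in for MySQL, and the `items` table layout is invented.

```python
import sqlite3


def fake_sphinx_search(query):
    """Hypothetical stand-in for Sphinx: returns matching ids, best match first."""
    index = {"mysql": [3, 1], "memcached": [2]}
    return index.get(query, [])


def fetch_items(conn, ids):
    """Fetch rows by primary key and return them in the order the search gave."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM items WHERE id IN ({placeholders})", ids
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    # IN (...) does not preserve order, so re-apply the search engine's ranking.
    return [by_id[i] for i in ids if i in by_id]
```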
Lesson 7:
Bulk insert / lazy write
• Obvious: If you don’t need it now, do it
later
• Disconnected / async good in these
cases
• If you can do it later, glom many
together (bulk)
• Much better write perf.
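A minimal sketch of the "glom many together" idea: buffer rows in memory and write them in one batch. sqlite3 stands in for MySQL here, and the `BulkWriter` class, table name, and batch size are all hypothetical.

```python
import sqlite3


class BulkWriter:
    """Collects rows and writes them in one batch once batch_size is reached."""

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = []

    def add(self, feed_id, title):
        self.pending.append((feed_id, title))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One transaction and one statement for the whole batch.
        with self.conn:
            self.conn.executemany(
                "INSERT INTO items (feed_id, title) VALUES (?, ?)", self.pending
            )
        self.pending = []
```

The trade-off is that buffered rows are lost if the process dies before `flush()`, so this fits only writes that are safe to do later.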
Lesson 8:
User experience vs. Scaling
• Emphasis on scaling hurt user
experience
• Characterize lazy vs. user affecting
transactions
• Fast, data correct transactions = single
data store (no read/write io split) or a
sync buffer (memcached)
Lesson 9:
Instrumentation
• Visibility into system good
• More data = better
• Non live testing != reality
• Scaling is iterative process, requires
feedback loop
• Nagios, Cacti, SHOW GLOBAL
STATUS
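Most SHOW GLOBAL STATUS counters (for example Questions and Com_select) are cumulative since server start, so a single reading says little; the useful number is the change between two samples. A small sketch, assuming the two snapshots have already been read into dicts (how they are collected is left out, and the function name is hypothetical):

```python
def status_rates(before, after, seconds):
    """Per-second rate for each counter present in both snapshots."""
    return {
        name: (after[name] - before[name]) / seconds
        for name in after
        if name in before
    }
```

Graphing these rates over time (the job Cacti does) gives the feedback loop the slide asks for.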
Lesson 10:
Try new things
• Best practices are good, but new ideas
sometimes better!
• Memcached as write/bulk buffer, good
results
• UDF, clever replication uses, triggers,
virtual servers, background async.
daemons
• MogileFS (?)
Lesson 11:
Everyone has the same problems
• If you can, re-use!
• Obvious: MySQL, Apache (not rewriting
databases or web servers)
• Search Engines
• Less obvious: building batch job
processor, others have done this better!
(Gearman)
Lesson 12:
Accept change

• Architecture constantly in flux
• Think ahead, but don’t overthink, design
for now + 1, not now + 100
• Accept change, react, constantly
re-evaluate
Lesson 13:
Listen
• Even amongst team, easy to be
dogmatic
• Listen to everyone’s ideas even when it
challenges your expertise
• Some of our best ideas: the
combination of several approaches
Vibemetrix

• Uses Grazr’s Feed Engine
• Uses Sphinx
• Two months from idea to soft release
• All previous lessons have become
overriding principles!
Vibemetrix
[Diagram: Vibemetrix architecture. Labels: Vibemetrix, Sphinx (items_dist, two instances), item_id-s, items results, Feed Database (slave, two instances), Feed Engine, Sphinx]
Feed Engine
1. Widget accesses read server with URL request (rss/atom/rdf feed or outline/OPML)

2. Check if feed/outline in memcached

If feed/outline is in memcached…

3. URL of feed/outline is accessed with LWP to check the last modified date of the feed/outline
Feed Engine
4. If feed/outline has not been modified, serve cached JSON

If feed has been modified…

5. Content of feed fetched, processed with XML parser into perl DOM representation

6. perl DOM representation converted to JSON, JSON served
Feed Engine
If feed/outline is in memcached, but modified (cont)…

7. JSON portion (replaced) stored in memcached

8. perl DOM object stored in memcached, key to object inserted into table with trigger
Feed Engine
9. trigger calls grazrd daemon with key to perl DOM object

10. grazrd in turn calls feedprocessor.pl to fetch perl DOM ref from memcached, deletes from memcached,

11. then stores feed and its components (items, enclosures, etc) into MySQL
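The eleven steps above can be sketched as one request handler. This is an illustration in Python, not the actual Perl implementation: the function and parameter names are hypothetical, a dict stands in for memcached, and the slides do not say what happens on a cache miss, so the sketch assumes a miss is handled like a modified feed.

```python
import json


def serve_feed(url, cache, last_modified_of, fetch_and_parse, enqueue):
    """Serve JSON for a feed, reusing the cached copy when the feed is unchanged."""
    entry = cache.get(url)                     # step 2: look in the cache
    current = last_modified_of(url)            # step 3: check last modified date
    if entry is not None and entry["last_modified"] == current:
        return entry["json"]                   # step 4: unchanged, serve cached JSON
    parsed = fetch_and_parse(url)              # step 5: fetch and parse the feed
    body = json.dumps(parsed)                  # step 6: convert to JSON
    cache[url] = {"last_modified": current, "json": body}  # step 7: replace cached JSON
    enqueue(url, parsed)                       # steps 8-11: hand off for async storage
    return body
```

The point of the design is that the user-facing path ends at `return body`; writing the feed and its items into MySQL happens later in the background worker.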
Grazr: Lessons Learned Using
MySQL and Memcached in
Web 2.0 Applications

Patrick Galbraith
Senior Programmer - Grazr, Inc.

Mike Kowalchik
CTO - Grazr, Inc.

Jimmy Guerrero
Sr Product Marketing Manager - Sun Microsystems, Database Group

Copyright 2008 MySQL AB The World’s Most Popular Open Source Database 1
• Sun – MySQL Overview
• Brief Introduction to memcached
• “Lessons Learned” - Grazr
• Memcached Solutions from Sun
• Next Steps plus Q & A

Established & Emerging Companies

Web 2.0

Enterprise 2.0

craigslist
SaaS

Telecom

OEM & ISV

Introduction to memcached

What is Memcached?

“A high-performance, distributed memory object caching system, generic in nature, but
intended for use in speeding up dynamic web applications by alleviating database load.” *

* http://www.socialtext.net/memcached/index.cgi?faq

“Cache is King”

• Browser Cache

• Web Server Cache

• Memcached

• MySQL Database Cache

• Disk Storage
What is Memcached?

• Created by Danga Interactive to speed up LiveJournal’s 20 million+
dynamic page views per day for 1 million+ users
– Lowered database load
– Faster page loads
– Better resource utilization
– Faster access to databases

• Perfect for dynamic sites that generate high database load

Why Use Memcached With MySQL?
• Enables massive scale-out of dynamic web-sites
• Faster page loads
• Allows for more efficient use of existing database resources
• Can easily utilize idle computing resources
• Dozens to hundreds of nodes can be supported in a
memcached cluster
• No interconnect or proprietary networking required
• Extensible and customizable

Who Uses Memcached?

Memcached Basics
• Community Driven
• Open Source
• Memcached Server released under BSD license
• http://www.danga.com/memcached/download.bml

• Multiple operating systems and architectures supported


• Various Client APIs and libraries available
– Perl, Python, PHP, Ruby, Java, C#, C, Lua, MySQL, more….
– http://www.danga.com/memcached/apis.bml

Memcached Basics
• Runs wherever RAM is available
– Application, Web, Database or dedicated memcached servers

• Low CPU utilization


• Designed to be massively scalable
• Production support available under MySQL Enterprise
– Gold, Platinum and Unlimited

What Memcached Isn’t
• Not Reliable/Durable storage
• Not Highly Available
• Not Secure
• Not a database
• Not a database cache

How Does Memcached Work?
Memcached
• Two-stage hash
• Similar to giant hash table looking up key = value pairs
• Client hashes the key against a list of servers
• When the server is identified, the client sends its request
• Server performs a hash key lookup for the actual data

Hash Function
• A hash is a procedure for turning data into a small integer
that serves as an index into an array
• Speeds up table lookup or data comparison tasks
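The two-stage hash can be sketched as follows: stage one runs in the client and picks a server from the list; stage two is an ordinary hash-table lookup inside that server. This is a sketch under stated assumptions: CRC32 modulo the server count is used only for illustration (real client libraries differ in the hash they use), and Python dicts stand in for the servers' internal hash tables.

```python
import zlib


def pick_server(key, servers):
    """Stage 1 (client side): hash the key against the list of servers."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


class Cluster:
    """Stand-in for a set of independent memcached servers."""

    def __init__(self, names):
        self.names = list(names)
        # Each server has its own table and knows nothing about the others.
        self.tables = {name: {} for name in self.names}

    def set(self, key, value):
        # Stage 2 (server side): plain key lookup in the chosen server's table.
        self.tables[pick_server(key, self.names)][key] = value

    def get(self, key):
        return self.tables[pick_server(key, self.names)].get(key)
```

Because every client hashes the same key against the same server list, any client finds the value another client stored, without the servers ever talking to each other.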

Memcached Server
• Written in C
• libevent based
– Asynchronous event notification library

• Server has internal hash table


• Servers know nothing about each other

Typical Use Case: Read/Pass-Through
• Modify the application so information is read from
memcached
• In the event of a cache miss…
– data is loaded from the database
– written into memcached
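A minimal sketch of the read-through pattern described above, with a dict standing in for memcached and a caller-supplied `load_from_db` function standing in for the MySQL query. The key format and function names are hypothetical.

```python
def get_user(user_id, cache, load_from_db):
    """Read from the cache first; on a miss, load from the database and cache it."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:                  # cache miss
        value = load_from_db(user_id)  # data is loaded from the database
        if value is not None:
            cache[key] = value         # and written into the cache
    return value
```

Only the first request for a given user reaches the database; later requests are served from memory until the entry is evicted or invalidated.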

Basic Memcached Example
[Diagram: clients X, Y, Z (each labeled mc) above servers A, B, C (each labeled ms). Client flow labels: hash server list, select server, connect, set key value / get key, connect, get value]

Client X
1) set key “1” with value “abc”
2) hashes the key against server list
3) Server B is selected
4) connects to Server B and sets key

Client Z
1) get key “1”
2) connects to Server B
3) requests “1” and gets value “abc”

Server B
key = value
1 = abc
Solutions

Jimmy Guerrero
Sr Product Marketing Manager
Sun Microsystems - Database Group
jimmy@mysql.com

Memcached for MySQL
• Support is built into your MySQL Enterprise subscription

http://www.mysql.com/products/enterprise/memcached.html

• MySQL Enterprise
– 24x7 Production Support
– Enterprise Monitor
– MySQL Enterprise Server
– Additional Add-ons Available

• MySQL Professional Services


– MySQL Scale, High-Availability and Replication Jumpstart

Why Memcached with MySQL?
• Enables massive scale-out of dynamic web-sites
• Faster page loads
• Allows for more efficient use of existing database resources
• Can easily utilize idle computing resources
• Dozens to hundreds of nodes can be supported in a
memcached cluster
• No interconnect or proprietary networking required
• Extensible and customizable

Next Steps
Memcached for MySQL -
http://www.mysql.com/products/enterprise/memcached.html

On-Demand Webinars – Memcached for MySQL: Advanced Use Cases

http://mysql.com/news-and-events/on-demand-webinars/display-od-158.html

Whitepapers - http://www.mysql.com/why-mysql/white-papers/

• “Designing & Implementing Scalable Applications with Memcached and MySQL”


• “How MySQL Powers Web 2.0”
• “Enabling Enterprise 2.0 with MySQL”

Documentation -
http://dev.mysql.com/doc/refman/6.0/en/ha-memcached.html

Discussion Forum - http://forums.mysql.com/list.php?150

Questions?

Patrick Galbraith
Senior Programmer - Grazr, Inc.

Mike Kowalchik
CTO - Grazr, Inc.

Jimmy Guerrero
Sr Product Marketing Manager - Sun Microsystems, Database Group

