An Overview of How MySQL Helps Power Web 2.0 Technologies and Companies
Because these applications predominantly “live” online, a strong collaborative and collective nature is
being harnessed. Where the web was once a static and passively consumed experience, it is now
dynamic, transactional and interactive, and participation is not optional but expected.
The companies delivering these applications and services are taking advantage of lowered market entry
points by making full use of the benefits of open source software running on commodity off-the-shelf
hardware. This has allowed Web 2.0 companies to meet their capacity and performance requirements
incrementally over time. It is no surprise that a common characteristic of many Web 2.0 websites,
applications and companies is their use of the LAMP (Linux, Apache, MySQL, PHP/Perl/Python) open
source stack. This allows fast-growing sites to deliver performance, scalability and reliability to millions of
users at a fraction of the cost of proprietary databases. MySQL enables up-and-coming Web 2.0 sites like
Wikipedia, FeedBurner and digg, as well as established web properties like Craigslist, Google and
Yahoo!, to scale out and meet the ever-increasing volume of users, transactions and data.
The information presented here will be valuable not only to entrepreneurs about to create their own Web
2.0 business and to existing web properties wishing to take their applications to the next level, but also to
the large number of enterprises interested in leveraging Web 2.0 technologies. You will also gain an
understanding of how MySQL can be used in conjunction with other open source components to deliver
low-cost, reliable, scalable, high performance Web 2.0 applications.
The term “Web 2.0” was first coined back in 2004 during a brainstorming session between Tim O’Reilly of
O’Reilly Media and MediaLive International, a company that puts on technology tradeshows. The term
was originally intended as the name of an upcoming conference showcasing new web-based companies
and technologies that had emerged after the dot-com bubble. The term “Web 2.0” has
since been dismissed as a marketing buzzword, co-opted and validated several times over by various
individuals and companies. It has typically been used as a way to describe the new technologies and
companies that are revolutionizing the way we use and think about the World Wide Web.
Tim O’Reilly expands further on the definition of the term in his article “What is Web 2.0”, September 30,
2005:
“Web 2.0 applications are those that make the most of the intrinsic advantages of that
platform: delivering software as a continually updated service that gets better the more
people use it, consuming and remixing data from multiple sources, including individual users …”
In the following sections we delve deeper into four ideas central to the discussion of Web 2.0, which
O’Reilly and others have elaborated on since the initial emergence of the term. They include:
“They should be in the business of providing services not packaged software, while enabling cost
effective scalability.” A key component here is the frequency in which Web 2.0 companies leverage
MySQL Replication in a “scale-out” configuration. Scale-out enables the use of low-cost commodity
servers to increase database performance and scalability, incrementally, at a fraction of the price of
traditional “fork-lift” or “scale-up” methods. This capability is critical for companies that experience
explosive growth and adoption in very short time frames.
“They should also exercise control over unique, difficult to replicate data sources which get richer
the more individuals use and contribute to them.” A site might include a database of customer
reviews and recommendations on products. You could imagine the difficulty of attempting to recreate all
the unique, varied, and unbiased opinions you might find on a particular product. Amazon’s “Customer
Reviews” feature is an everyday example: Amazon “controls” the data of customer reviews, which would
be difficult to replicate without the same customer traffic and level of participation that Amazon’s
customers engage in. Another similar use can be found in eBay’s “Seller Ratings”. For many companies,
this competency revolves around controlling data that for competitors is prohibitive to replicate due to the
licensing costs from private data providers or an inability to engage users to “create” the data.
“Trusting users as co-developers.” This concept revolves around the idea that the users are actively
assisting in various capacities in the development process of the application. To some degree, this is not
a new concept. The open source community has relied on this model of active contribution since its
inception. More specifically, the users and development community are active participants in the
development, testing, requesting of enhancements and reporting of bugs. Even companies that offer
applications which are not “open source” have employed this methodology. This can manifest itself by the
introduction of new functionality in an accelerated manner which mimics the open source community’s
“release early, release often” development model. It can also be accomplished by monitoring the usage
patterns of users in order to gather intelligence on which functionality is being used and is creating value,
and which functionality is not.
Adopting a development state of “perpetual beta” allows a company to be more responsive in adopting
new technologies and usage patterns. Such companies also become more adaptable to changing
business conditions. An example of an application still in “beta”, yet with a large user base contributing
both actively and passively, is Google’s Gmail application.
“Harnessing collective intelligence.” “Collective intelligence” refers to the level of participation a website
reaches when the users themselves are actively deciding what is important and provides value to them.
Websites which offer product reviews, for example, allow users to rate both products and each other’s
reviews, so that the community itself decides which content is most valuable.
“Leveraging the long tail through customer self service.” The “long tail” was first coined by Chris
Anderson in a 2004 Wired Magazine article to describe certain business and economic models such as
Amazon.com or Netflix. The point being made was that customer self-service can be leveraged with
effective data management in order to offer goods and services which appeal to users outside the
mainstream. The belief is that the aggregate of all these non-mainstream users is much larger than the
mainstream audience. For example, online booksellers and DVD rental sites draw a significant portion of
their revenue from titles that have long disappeared from the general public’s radar. Another example can
be seen in the growing popularity of music trading sites like LaLa.com, whose business revolves around
bringing together “like-minded” individuals to trade music amongst themselves, regardless of whether
their tastes fall well outside the mainstream.
It could be argued that this isn’t necessarily a new concept, as many traditional booksellers, video rental
and music stores can attest to the fact that a significant portion of their business stems from music
albums which have long disappeared from the charts, movies which have not screened in a theatre for
years or books by authors who have long since fallen out of favor with the general public. Web 2.0
companies have realized that their applications must be designed to serve not only popular tastes but
also the interests of those on the fringes.
“Software above the level of a single device.” The portability of data, and access to it, is something
Web 2.0 applications must provide. Users are coming to expect that their data can be accessed and
synchronized across many devices, such as MP3 players, PDAs, cell phones, kiosks and more traditional
computing mediums like workstations and laptops. This data must be indifferent to the hardware or
operating system platform on which it is accessed.
“Lightweight user interfaces, development models, AND business models.” On this point, the
interfaces through which users access data must be “lightweight” and highly portable, yet still capable of
delivering a rich end user experience. Programming techniques and methodologies like Ajax and Ruby on
Rails can be thought of in this respect. Using commodity off-the-shelf hardware and open source
software, and leveraging development communities and users for testing, allows Web 2.0 companies to
enter established markets or create new markets at lower cost. It also allows their applications to remain
in “perpetual beta”, constantly adapting to the changing conditions of the marketplace and the needs of
end users.
Not many years ago it would have been hard to imagine the Web as a strategically important platform for
many of the things that are now commonplace, like trading stocks, booking travel, conducting commerce,
bartering for goods and services, finding new/old friends or even a potential life mate. This perception
was often due to the fact that the applications using the web as a platform were often sluggish, had few
security controls, were graphically uninteresting, or were held captive by the speed of the end user’s
internet connection. When compared against the desktop applications of the day, these early web
applications frequently came up short.
Fast forward a few years and web applications are now beginning to provide end-user experiences that
are close to, if not better than, those of the desktop. Email is a good example of an application originally
relegated to the desktop if you wanted any advanced features. It is now an application that can be
accessed over the web, from essentially anywhere in the world with an internet connection, with minimal
loss in functionality compared to a desktop client. In some cases there is even increased functionality,
like portability, if we include accessing email over a PDA or cell phone, plus advanced search capabilities,
contact sharing between devices, no need for local backups and almost unlimited “theoretical” storage
capacity. A similar trend is well under way for spreadsheets, word processing and calendaring.
The evolution we are witnessing is that of the web quickly becoming the next “desktop”, or more
specifically, the next operating platform on which applications are designed to run exclusively.
An Architecture of Participation
The concept of “an architecture of participation” is typically used to describe companies, technologies and
projects intentionally designed for contribution from developer communities and individual users, with an
emphasis on empowerment and openness. This concept is often closely linked to open source projects
and companies.
It may be worth noting that just because a technology or company is open source does not mean it
automatically exhibits “an architecture of participation”. However, it is often much easier for open source
companies and projects, as they will likely have a devoted and often vibrant developer community.
Proprietary products often find it difficult to cultivate a participatory quality without heavy subsidization.
This can be further complicated if the source code is closed, or if the exposed APIs are complex, making
even peripheral contributions difficult.
A “release early and release often” development cycle, characteristic of open source software, is an
excellent way to include a community of volunteers and parties with vested interests in the software, to
test and help debug code. Often the introduction of new features is done in strategic locations on a
website or within an application to help ascertain their popularity or usability. This helps developers
understand whether a feature should be more widely employed and enhanced, or abandoned altogether.
“An architecture of participation” also relates to the idea of users creating meaningful and valuable data
for themselves. Often the application simply provides the framework and tools to empower users in this
capacity. Practical manifestations include seller ratings, user recommendations and restaurant reviews.
Other common participatory mechanisms include:
Feeds: Users and applications allow their content to be picked up for distribution to subscribers
Blogs: Users create site content and drive traffic
Social Networking: Users create site content and use their social channels to build a network
Wikis: Users contribute articles and manage the content for accuracy and relevance
“Level 2: Could exist offline, but has unique advantages by being online.” An example in this case
can be found among photo sharing applications. Unlike desktop photo management applications like
Adobe Photoshop Album or Google’s Picasa, online applications like Flickr gain unique advantages by
being online: specifically, the ability to share images publicly with other users, and the ability for those
images to be indexed and searched for online via tags and other metadata.
“Level 1: Can and does exist offline, but gains additional functionality by being online.” This level
can usually be assigned to productivity applications, which sometimes benefit from collaboration. O’Reilly
uses the example of Writely in this case. He points out that this word processing application can be used
offline when remarks or comments are not required from others (as is true in the vast majority of cases).
But when collaborative editing and review are required, its online attributes make it much more efficient
and effective than trying to reconcile the markups from multiple reviewers on the same document
individually. The same idea can be extended to calendaring software: unless the calendar needs to be
viewed, edited or shared by others, there is little advantage beyond the ability to access it online.
“Level 0: The application has primarily taken hold online, but it would work just as well offline if
you had all the data in a local cache.” This of course is prohibitive in many cases, based on the amount
of data that may be required or whether the data is licensed or proprietary. O’Reilly’s examples include
MapQuest, Yahoo! Local, and Google Maps.
Web Syndication
Web syndication can be used to describe the function of making an information source, such as a blog,
available for feed distribution. It is very similar to other syndicated media like television and radio
programs or news stories distributed over “the wire”. Likewise the contents of a web feed may be shared
and posted by other web sites.
Feeds are typically subscribed to directly by users using aggregators or feed readers, which combine the
contents of multiple web feeds for presentation. Subscription to a feed is typically done by manually
entering the URL of a feed or by clicking a link on the page.
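To make the idea of an aggregator concrete, the short PHP sketch below fetches an RSS 2.0 feed and prints each entry’s title and link using PHP’s SimpleXML extension. The feed URL is a placeholder and error handling is omitted; it is an illustrative sketch rather than production code.

<?php
// Minimal feed-reading sketch. The feed URL is a placeholder and
// error handling is omitted for brevity.
$feedUrl = 'http://blog.example.com/rss.xml';

// SimpleXML parses the RSS document into an object tree
// (fetching a remote URL requires allow_url_fopen to be enabled).
$rss = simplexml_load_file($feedUrl);

// An RSS 2.0 feed nests its entries as <item> elements under <channel>.
foreach ($rss->channel->item as $item) {
    echo $item->title . "\n";
    echo '  ' . $item->link . "\n";
}
?>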
Blogs
Weblogs, or blogs as they are more commonly known, are personal websites in a journal or diary
format. Text, images, videos and files make up the majority of the content on blogs. They typically allow
visitors to post comments and other messages in response to the blogger’s posts. “Pingbacks” and
“trackbacks” can be leveraged so that conversations spanning several blogs can be easily traversed or
navigated by readers attempting to follow an exchange.
It is vital for blog building applications and blog hosting sites that the database(s) they leverage are:
• Easy to Use: For administrators, and for end-users if they must interact directly with the database.
• Reliable: Many users may depend on the service to be available round the clock.
Social Networking
Social networking websites enable users to socialize online based on common interests or causes. These
sites normally offer an interactive, user-submitted network of blogs, profiles, groups, photos, MP3s,
videos and even internal e-mail or messaging systems. It is estimated that there are currently well over
300 hosted social networking websites on the internet.
Additional Characteristics
“MySQL delivered the right balance of features, reliability, performance and scalability, making MySQL a
perfect fit for scaling-out our system.”
Batara Kesuma, CTO, mixi, inc.
• More than 100 MySQL Servers in production
• About 10 additional servers are being added every month
• Explosive growth: 3.7 million users signed up within the first 3 years
• Dynamic: 70% are active users (less than 72 hours since last login)
• Technology Stack: Linux, Apache, MySQL, Perl and memcached
• MySQL Replication for Scale-Out
• MySQL leveraged to store meta-data about stored images
For more information about Mixi.jp and MySQL please see:
Mixi Delivers Massive Scale-Out with MySQL
http://www.mysql.com/why-mysql/case-studies/mysql-cs-mixi
Wikis
A wiki is a type of site that permits users to easily add, remove and edit content on wiki pages within the
web browser without having to know HTML. Wikis also have built-in tools for discussing, tracking and
implementing changes to content. This is critical when erroneous or incorrect information is posted and
needs to be resolved and then corrected. The high level of interaction and empowerment it gives users
makes it an excellent tool for projects which require high degrees of collaboration. Another interesting
characteristic about wikis is that they generally do not maintain any sort of access restrictions, or if they
do, they tend to be quite minimal. Wikis are commonly used for both public and private project
communication, intranets, documentation and knowledge bases.
Additional Characteristics
• Millions of users leverage these search engines either explicitly or through services
• Hundreds of millions of new posts are created every day
• This creates billions of hyperlinks
• In turn, there is a constant expansion of data, meta-data and relationships created every minute
Flickr also allows users to categorize their photos into "sets", or groups of photos that fall under the same
heading. However, sets are more flexible than the traditional folder-based method of organizing files, as a
photo can belong to more than one set, or to none at all.
Additional Characteristics
• Technology Stack: RedHat Linux, Apache, MySQL, PHP, Perl, and Java
• Over 25,000 transactions per second at peak times
• MySQL Replication for Scale Out
• Full Text search
Online Gaming
Online gaming can be broken down into roughly two main categories: those which involve wagering and
those which do not. Non-wagering games are typically known as massively multiplayer online games
(MMOGs). This year, DFC Intelligence sized the online gaming market at about $3.4 billion, with growth
expected to exceed $13 billion by 2011. Curiously, MMOGs are expected to remain the leading game
category, despite appealing to a smaller segment of players. One of the Web 2.0 competencies that
MMOG companies employ is the reliance on online channels to deliver their gaming services. This allows
them to adopt a business model that relies on subscriptions rather than shrink-wrapped products on retail
shelves. Their business models also exist 100% online and often make use of the “long tail” of customer
self-service, creating many games and offering almost limitless customization options.
MMOGs are incredibly popular and are experiencing unprecedented growth, with several of the top
games having millions of subscribers.
Popular examples of online gaming sites include:
Sim Dynasty
PokerRoom.com Powers High Transaction Online Poker System with MySQL and HP
http://www.mysql.com/why-mysql/case-studies/mysql-hp-ongame-casestudy.pdf
• LAMP: Fortunately, there is a proven open source stack which has been consistently leveraged by
companies big and small to deliver scalable, cost-effective, and interoperable applications. This has been
achieved by leveraging the tight integration of Linux (operating system), Apache (web server), MySQL
(database) and PHP/Perl/Python (programming and scripting).
Linux
Linux is a Unix-like operating system which is free and open source. All of the source code is available for
anyone to use, modify and redistribute. This is in contrast to proprietary operating systems like Windows.
Linux has been around since the early 1990s and has at this point become the fastest growing operating
system in the world. Much of this success can be attributed to the fact that it is a low-cost, secure, scalable
and highly interoperable alternative to proprietary operating systems. Linux can be found running on
everything from hand-held devices to hardware components, desktop computers to massive computing
clusters. All of these characteristics make it an excellent choice for Web 2.0 applications.
Linux shares a long tradition of compatibility with MySQL and other open source components by serving
as the operating system component of the LAMP (Linux, Apache, MySQL, Perl, Python & PHP)
technology stack.
MySQL offers binaries and support on many popular Linux distributions like RedHat, Debian, SUSE and
Ubuntu. Generic RPMs & TAR packages are also available for other, more specialized distributions.
Apache
The Apache HTTP Server Project is an open-source Web server. It is known for being secure, highly
portable across many operating systems, efficient in utilizing resources and extensible. According to a
Netcraft Web Survey in February 2006, the Apache Web Server continues to be the world’s most popular
web server, with over 70% of websites leveraging it within their technology stacks. Apache has been
extended with compiled modules for interfacing with Perl, Python and PHP.
The Apache Web Server also shares a long tradition of compatibility with MySQL and other open source
components by serving as the web server component of the LAMP stack.
We should also note the emerging popularity of another web server, lighttpd, which is also compatible
with MySQL and is especially popular with developers using Ruby on Rails.
MySQL
Within the LAMP stack, MySQL comprises the database component. The database component serves as
the critical piece of software which manages the data leveraged by the applications and web servers.
MySQL is a multithreaded, multi-user, SQL Database Management System (DBMS) with over six million
installations. MySQL is the database of choice for consistently delivering lower TCO, reliability,
performance and ease of use. Many of the largest and fastest growing Web 2.0 companies are designing,
developing and deploying their applications with MySQL.
PHP
PHP Hypertext Preprocessor, or simply PHP, is an open-source language for producing dynamic web
content, mainly in server-side applications. PHP typically runs on a web server, using PHP code as its
input and rendering Web pages as output. PHP is a very popular server-side alternative to Microsoft’s
ASP.NET and Adobe’s ColdFusion. PHP works extremely well with all the components within the LAMP
stack. According to php.net, it is estimated that over 20 million domains on the internet make use of the
language.
PHP includes many free and open source libraries. PHP actually provides two different MySQL API
extensions:
• mysql: which is available for PHP versions 4 and 5, is intended for use with MySQL versions
prior to MySQL 4.1
• mysqli: which stands for “MySQL, Improved”; is available only in PHP 5. It is intended for use
with MySQL 4.1.1 and later. This extension fully supports the authentication protocol used in
MySQL 5.0, as well as the Prepared Statements and Multiple Statements APIs. In addition, this
extension provides an advanced, object-oriented programming interface.
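As a brief illustration of the mysqli extension’s object-oriented interface, the hedged sketch below runs a prepared statement against a hypothetical reviews table; the connection parameters, database and table names are placeholders rather than part of any particular application.

<?php
// mysqli sketch: hostname, credentials, database and the "reviews"
// table are hypothetical placeholders.
$db = new mysqli('localhost', 'app_user', 'app_password', 'web20_demo');
if ($db->connect_errno) {
    die('Connection failed: ' . $db->connect_error);
}

// A prepared statement binds the value to the "?" placeholder, so user
// input never has to be spliced directly into the SQL string.
$productId = 42;
$stmt = $db->prepare('SELECT title, rating FROM reviews WHERE product_id = ?');
$stmt->bind_param('i', $productId);
$stmt->execute();
$stmt->bind_result($title, $rating);

while ($stmt->fetch()) {
    echo "$title: $rating\n";
}

$stmt->close();
$db->close();
?>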
Perl
Perl is a dynamic procedural language often used for CGI scripts or as “glue” tying together systems and
interfaces not specifically designed to be interoperable. CGI or Common Gateway Interface is a standard
protocol for interfacing external application software with an information server, commonly a web server.
This allows the server to pass requests from a client web browser to the external application. The web
server can then return the output from the application to the web browser.
Python
Python is an open source scripting language similar to Perl and Ruby. It has been used to develop many
large software projects such as the Zope application server and BitTorrent file sharing system. According
to wiki.python.org, it is also used extensively by websites like Google, Yahoo Groups and Yahoo Maps.
Ruby on Rails
Ruby on Rails is a free and open source framework written in Ruby optimized for rapidly developing
database driven web-based applications. This easy to use framework requires less code and minimal
configuration. Although Ruby on Rails ships with a default database and web server components,
production environments typically rely on the addition of MySQL and Apache. It is also characterized as
being highly interoperable across a variety of platforms and components.
Web 2.0 application development typically requires that features and enhancements be developed
quickly. An emphasis is also placed on reusability and availability, so that other applications and web
services can build upon them.
For Ruby on Rails developers working with MySQL, there is the MySQL-Ruby module. It is a Ruby API for
accessing the MySQL server. It has the same functions as the very popular MySQL C API. It is available
at: http://www.tmtm.org/en/mysql/ruby/ .
Ajax
Asynchronous JavaScript and XML, or Ajax as it is more commonly known, is a development technique for
creating rich, visually appealing and interactive web applications. Web pages which leverage Ajax are more
responsive because they exchange smaller amounts of data with the web server. This means that the
entire web page does not have to be “refreshed” or completely reloaded in the user’s browser after each
interaction. The result is web pages with increased interactivity, speed and usability. These are all key
components that help Web 2.0 companies deliver applications which are highly interactive and provide
rich end user experiences rivaling those of desktop applications. For these reasons, Ajax is already widely
leveraged in both consumer and business applications. Ajax typically incorporates:
• XHTML (or HTML) and CSS leveraged for mark up and style information
• The Document Object Model (DOM), the tree-structured representation of an HTML or XML document,
which is accessed and manipulated with a client-side scripting language, usually JavaScript
• The XMLHttpRequest object is used to exchange data asynchronously with the web server
• XML as the format to transfer data between the server and client
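The client-side half of an Ajax exchange is written in JavaScript using XMLHttpRequest; as a hedged sketch of the server-side half within a LAMP stack, the hypothetical PHP endpoint below returns a small XML document that such a request could fetch asynchronously and splice into the page through the DOM. The endpoint name, parameter and suggestion list are illustrative assumptions.

<?php
// search_suggest.php: hypothetical server-side endpoint for an Ajax call.
// A client-side XMLHttpRequest might fetch search_suggest.php?term=my
// and receive this small XML document instead of a full page reload.
header('Content-Type: text/xml; charset=utf-8');

$term = isset($_GET['term']) ? $_GET['term'] : '';

// In a real application the suggestions would come from MySQL;
// a static list keeps this sketch self-contained.
$all = array('mysql', 'mysql cluster', 'mysql replication', 'memcached');

$matches = array();
foreach ($all as $candidate) {
    if ($term !== '' && strpos($candidate, $term) === 0) {
        $matches[] = $candidate;
    }
}

echo "<?xml version=\"1.0\"?>\n";
echo "<suggestions>\n";
foreach ($matches as $match) {
    echo '  <item>' . htmlspecialchars($match) . "</item>\n";
}
echo "</suggestions>\n";
?>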
memcached
memcached is a popular open source distributed memory caching system originally developed by
Danga Interactive for the blogging website LiveJournal. It is traditionally leveraged to enhance the
performance and responsiveness of dynamic content websites backed by databases. This is achieved by
caching data and objects in memory, thereby reducing the amount of data that needs to be read from the
database. The performance characteristics it can deliver are faster page loading for users, more efficient
resource utilization, and faster database access times in the event of a memcached miss.
In more technical detail, memcached acts as a large hash table, caching data as it is being requested by
clients. Although it was originally designed to improve the performance of database queries, it has been
extended to cache server-side objects as well. In essence, any operation which is resource or time
intensive can benefit from the use of memcached. It goes without saying that this technology is of great
advantage to Web 2.0 applications, which by definition are very dynamic and data driven. This is in
contrast to the static, non-interactive web sites characteristic of the early years of the Web.
Many websites that make use of memcached, such as LiveJournal, Slashdot and Wikipedia, also make
use of MySQL.
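To make the read-through caching pattern concrete, the hedged PHP sketch below uses the pecl Memcache extension alongside mysqli: it checks memcached first, falls back to MySQL on a miss, and then primes the cache for subsequent requests. The server addresses, credentials, key format and articles table are illustrative assumptions.

<?php
// Read-through cache sketch. Hostnames, credentials, the "articles"
// table and the key format are hypothetical.
$cache = new Memcache();
$cache->connect('127.0.0.1', 11211);

$articleId = 123;
$key = 'article:' . $articleId;

// Check the cache first; Memcache::get() returns false on a miss.
$article = $cache->get($key);
if ($article === false) {
    // Cache miss: read the row from MySQL.
    $db = new mysqli('localhost', 'app_user', 'app_password', 'web20_demo');
    $result = $db->query('SELECT title, body FROM articles WHERE id = ' . (int) $articleId);
    $article = $result->fetch_assoc();
    $db->close();

    // Store the row in memcached for 5 minutes so later page views
    // can skip the database entirely.
    $cache->set($key, $article, 0, 300);
}

echo $article['title'];
?>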
There are many reasons why MySQL is being leveraged time and again by established web companies
and by new, emerging Web 2.0 companies. The most obvious is cost. By scaling out with MySQL on open
source software and commodity hardware, Web 2.0 companies can:
• Easily and cost-effectively add capacity to their database infrastructure.
• Reduce hardware costs by incrementally adding low-cost commodity systems instead of upgrading
high-cost mainframe-class systems.
• Reduce software costs and eliminate up-front licensing.
• Improve response time and availability, so users experience fewer interruptions.
• Improve scalability by using MySQL Replication to distribute large workloads to individual server
nodes for execution.
• Increase flexibility by right-sizing the initial purchase of commodity hardware and software and
adding capacity incrementally.
Virtually any application that has a rapidly growing volume of users, transactions or data may be a
candidate for more cost-effective deployment using open source technology combined with a scale-out
architecture. MySQL is widely used for the following Scale-Out architectures:
• Web Scale-Out to improve the performance, scalability, and availability of web applications such as
e-commerce, content management, session management, search, and security.
• Data Warehousing Scale-Out to improve the performance and availability of traditional data
warehousing (e.g. centralized data warehouses and data marts) as well as real-time Operational
Data Stores.
The two approaches to adding database capacity compare as follows:
• Scale-Up (vertical): expensive SMP hardware, proprietary software, platform lock-in, and “fork-lift”
upgrades to increase capacity and performance.
• Scale-Out (horizontal): commodity Intel/AMD hardware, open source software, platform
independence, and additional servers to increase capacity and performance.
For more information on how to scale out with MySQL, consult the white paper titled “Guide to Cost-
effective Database Scale-Out using MySQL”, available at: http://www.mysql.com/why-mysql/white-papers/
MySQL Replication
MySQL Replication is the key enabler of “Scale Out” discussed in the previous section. Scale Out is
leveraged extensively by Web 2.0 sites and applications. MySQL natively supports one-way,
asynchronous replication. Replication works by simply having one server act as a master, while one or
more servers act as slaves. This is in contrast to the synchronous replication which is a characteristic of
MySQL Cluster.
Asynchronous data replication means that data is copied from one machine to another, with a resultant
delay. Often this delay is determined by networking bandwidth, resource availability or a predetermined
time interval set by the administrator. However, with the correct components and tuning, replication itself
can appear to be almost instantaneous to most applications. Synchronous data replication implies that
data is committed to one or more machines at the same time, usually via what is commonly known as a
“two-phase commit”.
In standard MySQL Replication, the master server writes updates to its binary log files and maintains an
index of those files in order to keep track of the log rotation. The binary log files serve as a record of
updates to be sent to slave servers. When a slave connects to its master, it determines the last position it
has read in the logs on its last successful update. The slave then receives any updates which have taken
place since that time. The slave subsequently blocks and waits for the master to notify it of new updates.
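One way to observe this mechanism from the outside is to compare log coordinates on both servers. The hedged PHP sketch below (hostnames and credentials are hypothetical, and the monitoring account needs the REPLICATION CLIENT privilege) reads SHOW MASTER STATUS on the master and SHOW SLAVE STATUS on the slave, the latter of which also exposes the Seconds_Behind_Master estimate of replication delay.

<?php
// Replication monitoring sketch; hostnames and credentials are hypothetical.
$master = new mysqli('master.db.example.com', 'repl_monitor', 'secret');
$slave  = new mysqli('slave1.db.example.com', 'repl_monitor', 'secret');

// Current binary log file and write position on the master.
$m = $master->query('SHOW MASTER STATUS')->fetch_assoc();

// How far the slave has read, plus an estimated lag in seconds.
$s = $slave->query('SHOW SLAVE STATUS')->fetch_assoc();

printf("Master writing to %s at position %s\n", $m['File'], $m['Position']);
printf("Slave reading %s at position %s (about %s seconds behind)\n",
       $s['Master_Log_File'],
       $s['Read_Master_Log_Pos'],
       $s['Seconds_Behind_Master']);
?>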
Below in Figure 1 is an illustration of a basic Scale Out implementation using MySQL Replication.
Load Balancer
Replication
Figure 1
• In the event the master fails, the application can be designed to switch to the slave.
• Better response time for clients can be achieved by splitting the load for processing client queries
between the master and slave servers. Queries which simply “read” data, such as SELECTs, may
be sent to the slave in order to reduce the query processing load on the master. Statements that
modify data should be sent to the master so that the data on the master and slave do not get out
of sync. This load-balancing strategy is effective if non-updating queries dominate, which is
normally the case (see the sketch following this list).
• Another benefit of using replication is that database backups can be performed using a slave
server without impacting the resources on the master. The master continues to process updates
while the backup is being made.
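The read/write splitting described in the list above can be approximated at the application level. The PHP sketch below (hostnames, credentials, database and query are hypothetical) routes SELECT statements to a slave and everything else to the master; real deployments often use connection pooling or a proxy instead, so this is only a minimal illustration.

<?php
// Application-level read/write splitting sketch.
// Hostnames, credentials and the example query are hypothetical.

// Statements that modify data go to the master.
$master = new mysqli('master.db.example.com', 'app_user', 'app_password', 'web20_demo');

// Read-only statements can be served by a replicated slave.
$slave = new mysqli('slave1.db.example.com', 'app_user', 'app_password', 'web20_demo');

// Pick a connection based on the statement type.
function pick_connection($sql, $master, $slave)
{
    // SELECTs can be answered by the slave; anything that changes data
    // must go to the master so the two servers do not drift apart.
    if (preg_match('/^\s*SELECT\b/i', $sql)) {
        return $slave;
    }
    return $master;
}

$sql  = 'SELECT title FROM articles ORDER BY created DESC LIMIT 10';
$conn = pick_connection($sql, $master, $slave);
$result = $conn->query($sql);

while ($row = $result->fetch_assoc()) {
    echo $row['title'] . "\n";
}
?>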
MySQL Cluster
As of version 5.1, MySQL Cluster supports data storage not only in main memory (RAM), but also on
disk. Applications can still leverage the benefits of in-memory data storage, which increases performance
and limits I/O bottlenecks by asynchronously writing transaction logs to disk, while the introduction of
disk-data support allows larger data sets that do not require the performance characteristics of in-memory
data to be managed within the cluster.
MySQL Cluster delivers an extremely fast fail over time with sub-second responses so your applications
can recover quickly in the event of a software, network or hardware failure. MySQL Cluster uses
synchronous replication to propagate transaction information to all the appropriate data nodes. This also
eliminates the time consuming operation of recreating and replaying log files as is typically required by
clusters employing shared-disk architectures. MySQL Cluster data nodes are also able to automatically
restart, recover, and dynamically reconfigure themselves in the event of failures, without developers
having to program any fail over logic into their applications.
MySQL Cluster implements an automatic node recovery that ensures any fail over to another data node
will contain a consistent set of data. Should all the data nodes fail due to hardware faults, MySQL Cluster
ensures an entire system can be safely recovered in a consistent state by using a combination of
checkpoints and log execution. Furthermore, as of version 5.1, MySQL Cluster ensures systems are
available and consistent across geographies by enabling entire clusters to be replicated across regions.
We have illustrated an example MySQL Cluster architecture below in Figure 2. A brief description of the
main components of a MySQL Cluster follows as well.
• Data Nodes are the main nodes of the system. All data is stored on these nodes. Data is
replicated between data nodes to ensure it is continuously available in case one or more of the
data nodes fail. These data nodes in turn, handle all database transactions.
• Management Nodes handle the cluster configuration and are used to change the setup of the
system. Only one management server node is required, but there is also the option of running
additional management nodes in order to increase the level of fault tolerance required. The
management node is only used at startup or during a system re-configuration, which means the
cluster is operable without the management node being online.
• MySQL Nodes are the MySQL Servers accessing the clustered data nodes. By incorporating this
design, the MySQL Server provides developers a standard SQL interface to program their
applications against. This eliminates the need for any special application programming in order to
interact with the cluster.
Figure 2 (not reproduced here) shows an example MySQL Cluster architecture: MySQL Server nodes
access the data nodes through the NDB API and the NDB storage engine, the data nodes store the data
in memory and on disk, and one or more management server nodes oversee the cluster.
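As a rough illustration of how the three node types are declared, the hedged config.ini sketch below defines one management node, two data nodes and one MySQL Server node; the hostnames, paths and memory sizes are placeholder assumptions rather than recommendations.

# Hypothetical MySQL Cluster config.ini sketch: one management node,
# two data nodes and one MySQL Server (SQL) node.

[ndbd default]
# Each fragment of data is stored on two data nodes.
NoOfReplicas=2
# Memory reserved for row data and for indexes (placeholder sizes).
DataMemory=512M
IndexMemory=64M

[ndb_mgmd]
# Management node.
HostName=192.168.0.10
DataDir=/var/lib/mysql-cluster

[ndbd]
# First data node.
HostName=192.168.0.11
DataDir=/usr/local/mysql/data

[ndbd]
# Second data node.
HostName=192.168.0.12
DataDir=/usr/local/mysql/data

[mysqld]
# MySQL Server node providing the standard SQL interface.
HostName=192.168.0.13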
For more information concerning MySQL Cluster and how MySQL can be part of your session
management architecture, please visit:
http://www.mysql.com/products/database/cluster/
http://www.mysql.com/why-mysql/white-papers/
MySQL Query Cache
The MySQL Query Cache, introduced in MySQL 4.0.1, can deliver excellent gains in the response times
of both basic and resource-intensive SQL statements. The query cache stores the text of SELECT
statements issued by clients together with their results. If an identical statement is received later, the
results are returned from the query cache rather than parsing and executing the statement again. Some
important characteristics of the query cache include:
• No stale data is ever returned to clients. Cached results are flushed whenever an UPDATE is issued
which invalidates the cached data set.
• The query cache is not applicable to server-side prepared statements.
• The expected overhead of enabling the query cache is about 10-15%.
• When used correctly, however, performance gains of 200-250% can be achieved.
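A quick way to see the cache working is to issue the same statement twice and watch the server’s Qcache_hits counter. The hedged PHP sketch below does exactly that, assuming the query cache has been enabled and using hypothetical connection details and an illustrative articles table.

<?php
// Query cache observation sketch; hostname, credentials and the
// example query are hypothetical, and the query cache must be enabled.
$db = new mysqli('localhost', 'app_user', 'app_password', 'web20_demo');

function qcache_hits($db)
{
    // Qcache_hits counts SELECTs answered directly from the query cache.
    $row = $db->query("SHOW STATUS LIKE 'Qcache_hits'")->fetch_row();
    return (int) $row[1];
}

$before = qcache_hits($db);

// Two byte-for-byte identical SELECT statements: the first is parsed,
// executed and cached; the second should be served from the cache.
$db->query('SELECT COUNT(*) FROM articles');
$db->query('SELECT COUNT(*) FROM articles');

$after = qcache_hits($db);
echo 'Query cache hits during this run: ' . ($after - $before) . "\n";
?>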
MySQL Pluggable Storage Engine Architecture
The MySQL Pluggable Storage Engine Architecture (PSEA) also provides a standard set of server, driver,
tool, management, and support services that are leveraged across all of the underlying storage engines.
Figure 3 illustrates how the PSEA fits into the overall design of the MySQL Server.
Figure 3 (not reproduced here) illustrates the MySQL Server architecture: connectors at the top (native C
API, JDBC, ODBC, .NET, PHP, Python, Perl, Ruby, VB); the MySQL Server layer in the middle, comprising
the connection pool (authentication, thread reuse, connection limits, memory checks, caches), the SQL
interface (DML, DDL, stored procedures, views, triggers), the parser (query translation, object privileges),
the optimizer (access paths, statistics), global and engine-specific caches and buffers, and enterprise
management services and utilities (backup and recovery, security, replication, cluster, partitioning,
Instance Manager, INFORMATION_SCHEMA, and the graphical Administrator, Workbench, Query
Browser and Migration Toolkit tools); and, at the bottom, the pluggable storage engines (MyISAM, InnoDB,
Cluster, Falcon, Archive, Federated, Merge, Memory, plus partner, community and custom engines).
For more information about MySQL’s Pluggable Storage Engine Architecture visit:
• High Availability: Rock-solid reliability and constant availability are hallmarks of MySQL, with
customers relying on MySQL to guarantee around-the-clock uptime. MySQL offers a variety of
high-availability options from high-speed master/slave replication configurations, to specialized
Cluster servers offering instant fail over, to third party vendors offering unique high-availability
solutions for the MySQL database server.
• Robust Transactional Support: MySQL offers one of the most powerful transactional database
engines on the market. Features include complete ACID (atomic, consistent, isolated, durable)
transaction support, unlimited row-level locking, distributed transaction capability, and multi-
version transaction support where readers never block writers and vice-versa. Full data integrity
is also assured through server-enforced referential integrity, specialized transaction isolation
levels, and instant deadlock detection.
• Web and Data Warehouse Strengths: MySQL is the de-facto standard for high-traffic web sites
because of its high-performance query engine, tremendously fast data insert capability, and
strong support for specialized web functions like fast full text searches. These same strengths
also apply to data warehousing environments where MySQL scales up into the terabyte range for
either single servers or scale-out architectures. Other features like main memory tables, B-tree
and hash indexes, and compressed archive tables that reduce storage requirements by up to
eighty-percent make MySQL a strong standout for both web and business intelligence
applications.
• Strong Data Protection: Because guarding the data assets of corporations is the number one
job of database professionals, MySQL offers exceptional security features that ensure absolute
data protection. In terms of database authentication, MySQL provides powerful mechanisms for ensuring
that only authorized users have access to the database server, including the ability to restrict access
down to the client machine level. SSH and SSL support are also provided to
ensure safe and secure connections. A granular object privilege framework is present so that
users only see the data they should, and powerful data encryption and decryption functions
ensure that sensitive data is protected from unauthorized viewing. Finally, backup and recovery
utilities provided through MySQL and third party software vendors allow for complete logical and
physical backup as well as full and point-in-time recovery.
• Management Ease: MySQL offers exceptional quick-start capability with the average time from
software download to installation completion being less than fifteen minutes. This rule holds true
whether the platform is Microsoft Windows, Linux, Macintosh, or UNIX. Once installed, self-
management features like automatic space expansion, auto-restart, and dynamic configuration
changes take much of the burden off already overworked database administrators. MySQL also
provides a complete suite of graphical management and migration tools that allow a DBA to
manage, troubleshoot, and control the operation of many MySQL servers from a single
workstation. Many third party software vendor tools are also available for MySQL that handle
tasks ranging from data design and ETL, to complete database administration, job management,
and performance monitoring.
Conclusion
As we have seen, a common characteristic of many top emerging Web 2.0 companies and established
web properties is that they rely on MySQL as a critical piece of infrastructure within their technology
stacks to deliver performance, scalability and reliability to millions of users. The MySQL database server
consistently offers a lower total cost of ownership without sacrificing performance, reliability or scalability.
About MySQL
MySQL AB develops and markets a family of high performance, affordable database servers and tools.
Our mission is to make superior data management available and affordable for all. We contribute to
building mission-critical, high-volume systems and products worldwide.
MySQL AB is defining a new database standard. This is based on its dedication to providing a less
complicated solution suitable for widespread application deployment at a greatly reduced TCO. MySQL's
robust database solutions embody an ingenious software architecture while delivering dramatic cost
savings. With superior speed, reliability, and ease of use, MySQL has become the preferred choice of
corporate IT Managers because it eliminates the major problems associated with downtime,
maintenance, administration and support.
MySQL is a key part of LAMP (Linux, Apache, MySQL, PHP / Perl / Python), a fast growing open source
enterprise software stack. More and more companies are using LAMP as an alternative to expensive
proprietary software stacks because of its lower cost and freedom from lock-in.
Our flagship product is MySQL, the world's most popular open source database, with more than 10 million
active installations. Many of the world's largest organizations, including Sabre Holdings, Cox
Communications, The Associated Press, NASA and Suzuki, are realizing significant cost savings by using
MySQL to power Web sites, business-critical enterprise applications and packaged software. MySQL AB
is a second generation open source company, with dual licensing that supports open source values and
methodology in a profitable, sustainable business.
White Papers
http://www.mysql.com/why-mysql/white-papers/
Case Studies
http://www.mysql.com/customers/
Live Webinars
http://www.mysql.com/news-and-events/web-seminars/
Webinars on Demand
http://www.mysql.com/news-and-events/on-demand-webinars/