This document discusses strategies for improving the performance of Drupal sites, including caching at various levels (PHP opcode, Drupal internal, reverse proxy), using modules like Boost and Memcached, optimizing database and hardware configurations, and profiling code to identify bottlenecks. Specific techniques mentioned include leveraging caching, optimizing SQL queries, using master-slave databases, serving static files from fast servers, scaling out hardware, and profiling with tools like Xdebug to optimize page loads.
11. Boosting performance
• Drupal’s internal architecture
• Single-controller
• Loads a lot of code on every pageload
• Tends to be slower than a pure MVC-model
• Caching
• Minimize the CPU usage
• Minimize the amount of SQL queries
• Ultimately – avoid running Drupal’s bootstrap
13. Caching layers – PHP opcode cache
• Alternative PHP Cache (APC)
• PHP code is compiled every time it is read
• APC caches the compiled bytecode
• Parsing and compiling PHP code is not needed, if the
bytecode is in the cache
• Works generally everywhere and gives a major boost
in performance
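The compile-once idea behind an opcode cache can be sketched in a few lines. This is only an analogy written in Python: APC does this inside the PHP engine, and the names `run_script`, `bytecode_cache` and `stats` are hypothetical, not part of APC or PHP.

```python
# Analogy only: a real opcode cache lives inside the PHP engine.
stats = {"compiles": 0}   # counts how often source was actually compiled
bytecode_cache = {}       # hypothetical cache: script name -> compiled code object


def run_script(name, source):
    """Run `source`, compiling it only when no cached bytecode exists for `name`."""
    code = bytecode_cache.get(name)
    if code is None:
        # The expensive parse + compile step happens once per script.
        code = compile(source, name, "exec")
        bytecode_cache[name] = code
        stats["compiles"] += 1
    namespace = {}
    exec(code, namespace)   # executing cached bytecode skips parsing entirely
    return namespace


source = "result = sum(range(10))"
first = run_script("page.php", source)
second = run_script("page.php", source)
print(first["result"], second["result"], stats["compiles"])  # 45 45 1
```

A real opcode cache also has to notice when a source file changes on disk; that part is left out of this sketch.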
14. Caching layers – Drupal internal caching
• Block cache
• Global / per role / per user
• Page cache
• For anonymous users
• Code level caching
• Contrib modules
• Boost
• Memcache API and Integration
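The global / per role / per user distinction for block caching comes down to how the cache key is built. A minimal sketch, assuming a hypothetical `block_cache_key` helper and key format (Drupal's own key layout is different):

```python
def block_cache_key(block_id, granularity, user):
    """Build a cache key for a block.

    `user` is a dict with 'uid' and 'roles'; the key format is made up
    for illustration and is not Drupal's real one.
    """
    if granularity == "global":
        # One cached copy shared by everyone.
        return block_id
    if granularity == "per_role":
        # One copy per distinct set of roles; sorting makes the key stable.
        return block_id + ":r." + ",".join(sorted(user["roles"]))
    if granularity == "per_user":
        # One copy per user: most precise, least reuse.
        return block_id + ":u." + str(user["uid"])
    raise ValueError("unknown granularity: " + granularity)


alice = {"uid": 7, "roles": ["editor", "authenticated"]}
bob = {"uid": 9, "roles": ["authenticated", "editor"]}

print(block_cache_key("recent_posts", "per_role", alice))
print(block_cache_key("recent_posts", "per_user", bob))
```

The coarser the granularity, the more users share one cached copy, so "global" gives the best hit rate and "per user" the worst.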
15. Cache layers – Code level
• 1st rule of caching and optimization: Never do
something time consuming twice if you can hold onto
the results and re-use them
• Static variables for storing data within a function for
the duration of a single page load.
• Use drupal_static() to utilize central static variable storage
• Using cache_set() and cache_get() functions for
caching data more permanently.
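The two code-level layers can be sketched together: a per-request static store in front of a longer-lived cache. This is a language-neutral illustration in Python, not Drupal's API; `cached_lookup`, `end_of_request` and the two dictionaries are hypothetical stand-ins for what drupal_static() and cache_get() / cache_set() provide.

```python
import time

static_store = {}        # per-request storage, emptied after each page load
persistent_store = {}    # stands in for a cache table or other shared store
stats = {"calls": 0}     # counts how often the slow work really ran


def expensive_lookup(key):
    """Placeholder for slow work such as a heavy query."""
    stats["calls"] += 1
    return key.upper()


def cached_lookup(key, ttl=60, now=None):
    """Return the value for `key`, doing the slow work as rarely as possible."""
    now = time.time() if now is None else now
    if key in static_store:                       # layer 1: same page load
        return static_store[key]
    entry = persistent_store.get(key)             # layer 2: survives page loads
    if entry is not None and entry[1] > now:
        value = entry[0]
    else:
        value = expensive_lookup(key)
        persistent_store[key] = (value, now + ttl)
    static_store[key] = value
    return value


def end_of_request():
    """Static data only lives for the duration of a single page load."""
    static_store.clear()


cached_lookup("menu", now=0)    # slow path, fills both layers
cached_lookup("menu", now=1)    # served from the static store
end_of_request()
cached_lookup("menu", now=2)    # served from the persistent store
print(stats["calls"])           # 1
```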
16. Caching layers - Boost
• Generated page HTML is saved as a static file
• Page loads never touch the database
• For anonymous traffic and sites with little dynamic
content
• Easy to set up even on a cheap web hotel
• Enable the module, modify .htaccess and you’re done
• Highly configurable
• For not yet cached content, serves the page first and
saves HTML after that
• Inbuilt crawler for cache warm-up
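The static-file approach can be sketched as below. This is a simplified illustration, not Boost's implementation: `cache_path`, `serve` and the file naming scheme are hypothetical, and in a real setup the cache hit is handled by the web server's rewrite rules so that PHP is never started.

```python
import os
import tempfile


def cache_path(cache_dir, url_path):
    """Map a URL path to a file name (made-up naming scheme)."""
    name = url_path.strip("/").replace("/", "_") or "index"
    return os.path.join(cache_dir, name + ".html")


def serve(cache_dir, url_path, render):
    """Return (html, source). `render` stands in for a full Drupal page build."""
    path = cache_path(cache_dir, url_path)
    if os.path.exists(path):
        # Cache hit: plain file read, no bootstrap, no database.
        with open(path, encoding="utf-8") as f:
            return f.read(), "static file"
    # Cache miss: build the page once and keep the HTML for later visitors.
    html = render(url_path)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html, "generated"


with tempfile.TemporaryDirectory() as cache_dir:
    render = lambda url: "<html><body>Page for " + url + "</body></html>"
    print(serve(cache_dir, "/node/1", render)[1])   # generated
    print(serve(cache_dir, "/node/1", render)[1])   # static file
```

A cache warm-up crawler is then just a loop that requests every known URL once, so that real visitors only ever hit the "static file" branch.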
17. Caching layers – Memcached
• High-performance, distributed memory object
caching system
• In-memory key-value store for small chunks of
arbitrary data
• Drop-in replacement for Drupal's default cache
backend
• Instead of saving cached data to DB, it goes to memcached
• High-traffic sites really need to save the cache to memory
• Also for session data etc.
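"Drop-in" here means that calling code stays the same while the storage behind it changes. A minimal sketch, assuming two hypothetical backends with the same `get` / `set` interface: `DatabaseCache` uses SQLite as a stand-in for the database, and `MemoryCache` uses a dictionary as a stand-in for a memcached client (a real client talks to the memcached daemon over the network).

```python
import sqlite3


class DatabaseCache:
    """Cache stored in a SQL table; every get and set is a query."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE cache (cid TEXT PRIMARY KEY, data TEXT)")

    def set(self, cid, data):
        self.db.execute(
            "INSERT OR REPLACE INTO cache (cid, data) VALUES (?, ?)", (cid, data)
        )

    def get(self, cid):
        row = self.db.execute(
            "SELECT data FROM cache WHERE cid = ?", (cid,)
        ).fetchone()
        return row[0] if row else None


class MemoryCache:
    """Same interface, but values stay in memory (stand-in for memcached)."""

    def __init__(self):
        self.items = {}

    def set(self, cid, data):
        self.items[cid] = data

    def get(self, cid):
        return self.items.get(cid)


def build_menu(cache):
    """Calling code does not know or care which backend it was given."""
    html = cache.get("menu")
    if html is None:
        html = "<ul><li>Home</li></ul>"   # pretend this was expensive to build
        cache.set("menu", html)
    return html


for backend in (DatabaseCache(), MemoryCache()):
    print(type(backend).__name__, build_menu(backend))
```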
18. Caching layers – Reverse proxy
• Varnish Cache
• Designed from the ground up as an HTTP accelerator
• Stores data in virtual memory
• Configurable with VCL (Varnish Configuration
Language)
• Edge Side Includes (ESI)
<esi:include src="/esi/some_content" />
• ESI integration module
• Block template will be changed to instruct Varnish to get block
content from e.g. http://example.com/esi/block/xxxxxx
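What the proxy does with an ESI tag can be sketched as a simple substitution: the cached page keeps a placeholder, and the fragment is fetched and spliced in per request. This is a toy model in Python, not Varnish's implementation; `assemble`, the `fragments` dictionary and the path `/esi/block/user-menu` are hypothetical.

```python
import re

# Matches a self-closing <esi:include src="..."/> tag.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(page_html, fetch_fragment):
    """Replace each ESI tag with the fragment returned for its src path."""
    return ESI_INCLUDE.sub(lambda m: fetch_fragment(m.group(1)), page_html)


# The page body is cached once for everyone; only the block is per-user.
cached_page = '<div id="main">Article</div><esi:include src="/esi/block/user-menu" />'
fragments = {"/esi/block/user-menu": "<nav>Hello, alice</nav>"}

print(assemble(cached_page, fragments.__getitem__))
# <div id="main">Article</div><nav>Hello, alice</nav>
```

The expensive page stays in the cache while only the small fragment request reaches the backend.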
19. Cache Control module
• An alternative to ESI
• Cheaper way to display user specific content
• How it works
• For all users, we load the page with anonymous content
hidden under a throbber
• JS then checks whether the user is logged in (via a cookie) and, for
anonymous users, makes the anonymous content visible
• For logged in users (after JS has checked the login status), it
makes a single request to the backend to get the user-specific
data for the page
• http://drupal.org/node/1155312
20. Scaling Drupal
• MySQL
• High-performance configurations. There are many good base
configs available – start with them
• Dedicated server
• Master-slave setup
• Direct some of the SQL queries to slave
• Files
• Serve static files with Nginx or lighttpd
• Or use reverse proxy to cache them
• CDN if there’s a massive library of static content
• Scaling by buying more hardware?
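Directing some queries to a slave is a routing decision made per query. A minimal sketch, assuming a hypothetical `QueryRouter` class that sends plain SELECT statements to the slaves in turn and everything else to the master; server names are placeholders.

```python
class QueryRouter:
    """Pick a database server for each SQL statement."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._next = 0   # round-robin position among the slaves

    def target_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self.slaves:
            # Reads are spread across the read-only slaves.
            target = self.slaves[self._next % len(self.slaves)]
            self._next += 1
            return target
        # Writes, and anything we are unsure about, go to the master.
        return self.master


router = QueryRouter("mysql-master", ["mysql-slave-1", "mysql-slave-2"])
print(router.target_for("SELECT title FROM node WHERE nid = 1"))   # mysql-slave-1
print(router.target_for("SELECT name FROM users WHERE uid = 1"))   # mysql-slave-2
print(router.target_for("UPDATE node SET title = 'x' WHERE nid = 1"))  # mysql-master
```

Because slaves can lag behind the master, a read that must see a write made a moment ago is usually sent to the master as well; this sketch does not handle that case.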
22. Hardware stack example
[Diagram: Front server 1 (Linux, Apache, PHP); MySQL master (R/W); memcached server 1]
23. Hardware stack example
[Diagram: Load balancer; HTTP cache 1 and HTTP cache 2 (Linux, Varnish); Front server 1 and Front server 2 (Linux, Apache, PHP); MySQL master (R/W) and MySQL slave (R); memcached server 1 and memcached server 2]
24. Optimizing
• Profiling
• Xdebug, XHProf and similar profiling tools to see what actually
happens during a page load
• Devel module to print a summary of all database queries
executed for page request, including how many times each
query was executed and how long each query took
• SQL bottlenecks
• Unnecessary repeating of same queries
• Missing indexes
• Temporary tables and filesort
• Use EXPLAIN to find out how MySQL executes your monster
query
• Table locking if using MyISAM engine in MySQL
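The kind of per-query summary described above (how often each query ran and how long it took in total) is easy to compute from a query log. A minimal sketch with a hypothetical `summarize` function and made-up log entries; it is not the Devel module's code.

```python
from collections import defaultdict


def summarize(query_log):
    """Group (sql, seconds) entries by query text.

    Returns rows of (sql, times_executed, total_seconds), slowest first.
    """
    stats = defaultdict(lambda: [0, 0.0])
    for sql, seconds in query_log:
        stats[sql][0] += 1
        stats[sql][1] += seconds
    rows = [(sql, count, total) for sql, (count, total) in stats.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Made-up log of one page request.
log = [
    ("SELECT * FROM node WHERE nid = ?", 0.25),
    ("SELECT * FROM users WHERE uid = ?", 0.5),
    ("SELECT * FROM node WHERE nid = ?", 0.25),
    ("SELECT * FROM node WHERE nid = ?", 0.25),
]

for sql, count, total in summarize(log):
    print(count, total, sql)
```

A query that appears many times in such a summary is a candidate for the static caching pattern, while a single slow query is a candidate for EXPLAIN and an index.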
25.
• “Are a lot of the users logged in, or are most of them
anonymous?”
• This pretty much defines what kind of caching strategies can
be applied
• “What does my hosting environment allow
me to do?”
• There’s no single best solution