SQL Query Caching: Ori Staub
Intended Audience
Overview
Prerequisites
Possible Additions
The Script
Intended Audience
This tutorial is intended for the PHP programmer interested in caching SQL queries to reduce the
product information, category structure, articles or a guest book, some of the data is likely to be quite
Such a system would cache the results of an SQL query into a file stored on the system and hence
improve the response time by avoiding the need to make a database connection, forming the query,
On systems where the database does not reside on the same machine as the web server and requires
a remote connection (TCP or similar), or where large amounts of data are retrieved from the
database, you stand to gain even more in terms of response times and resources used.
Prerequisites
This tutorial will use MySQL as the database. You will need MySQL installed (available from
You will need to know the basics of SQL (Structured Query Language) in order to query the
database.
Caching SQL Query Results
Why cache query results?
Caching query results can dramatically improve script execution time and resource requirements.
Caching SQL results also allows you to carry out post processing on the data. This may not be possible
if you use file caching to cache the outputs of the entire script (HTML output caching).
When you execute an SQL query, the typical process undertaken is:
Connect to the database
The above is quite resource intensive and can adversely affect the script performance. This can be
further compounded by factors such as amount of data retrieved and location of database server.
Although persistent connections may reduce the overhead of connecting to the database, they are
more memory intensive, and the overall time saved will be very small if a large amount of data is
retrieved.
SQL (Structured Query Language) queries are used as an interface to manipulate a database and its
contents. SQL can be used to define and edit the table structure, insert data into the tables, update
SQL is the language used to communicate with the database, and in most PHP database extensions
(MySQL, ODBC, Oracle, etc.), the extension manages the process of passing the SQL query to the
database.
In this tutorial, only the select statement is used to retrieve data from the database. This data is
Caching can take a few forms according to the program's needs. The 3 most common approaches are:
Time triggered caching (expiry timestamp).
Information change triggered caching (sensing data has changed and updating the cache
accordingly).
Manual triggered caching (manually letting the system know information is outdated and
Your caching requirements may be one or a combination of the mechanisms above. In this tutorial,
the time-triggered approach is discussed. However, a combination of all 3 approaches can be used as
The caching described here is built on the PHP functions serialize() and unserialize().
The serialize() function can be used to store PHP values without losing their types and structure.
In fact, the PHP session extension uses the serialized representation of the variables in a file to store
The unserialize() function reverses the operation and turns the serialized string back into its
original type and structure.
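A minimal sketch of the round trip is shown here. The $rows array is hypothetical data standing in for a query result; it is not taken from the tutorial's database.

```php
<?php
// Hypothetical data standing in for rows fetched from a database.
$rows = [
    ['category_id' => 1, 'category_name' => 'Computers'],
    ['category_id' => 2, 'category_name' => 'Books'],
];

// serialize() produces a plain string that can be written to a file.
$stored = serialize($rows);

// unserialize() rebuilds the array with the same keys, values and types.
$restored = unserialize($stored);

var_dump($restored === $rows);                  // bool(true)
var_dump(is_int($restored[0]['category_id']));  // bool(true)
```

Because the integer category_id comes back as an integer rather than a string, the cached copy can be used exactly like the original array.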
In this example, an e-commerce store is used. The store has 2 basic tables, categories and products.
While product information may change daily, categories remain fairly static.
For product display, you can use an output caching script to store the resultant HTML output in a file
to be called up. However, categories may need some post processing. For example, all categories are
displayed and according to the category_id variable that is passed to the script
outdated after a set amount of time. In this particular example, 24 hours are used.
Serialize example:
Connect to database
Execute query
Get all results into an array so you can access them later.
Serialize array
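The four steps above can be sketched as follows. This is an illustration under assumptions, not the tutorial's original listing: it uses PDO with an in-memory SQLite database as a stand-in for MySQL so that it runs without a server, and the table contents are made up. PDO::FETCH_BOTH is used because, like mysql_fetch_array(), it returns every column under both a numeric and a string key. The exact serialized string may differ in key order and value types from output produced with the MySQL extension.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the MySQL server.
$file = 'sql_cache.txt';

// 1. Connect to the database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical sample data so the example is self-contained.
$db->exec('CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT,
    category_description TEXT
)');
$db->exec("INSERT INTO categories VALUES
    (1, 'Computers', 'Description for computers')");

// 2. Execute the query.
$result = $db->query('SELECT * FROM categories');

// 3. Get all results into an array so you can access them later.
//    FETCH_BOTH indexes every column by number and by name.
$records = $result->fetchAll(PDO::FETCH_BOTH);

// 4. Serialize the array and write it to the cache file.
$output = serialize($records);
file_put_contents($file, $output);
```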
a:1:{i:0;a:6:{i:0;s:1:"1";s:11:"category_id";s:1:"1";
i:1;s:9:"Computers";s:13:"category_name";s:9:"Computers";
i:2;s:25:"Description for computers";s:20:"category_description";
s:25:"Description for computers";}}
This output is the internal representation of the variables and their types. In this case
mysql_fetch_array() is used, which returns each row with both numeric and string (associative)
indexes. That is why the data seems to occur twice: once with the numeric index and once with the
string index.
In order to use the cache, you will need to unserialize() the information back into the original
format.
You can read the contents of the sql_cache.txt file into a variable using the file_get_contents()
function.
Please note: This function is available in PHP version 4.3.0 and above only. If you are using an older
version of PHP, a simple workaround is using the file() function (reads an entire file into an array,
each new line becomes an array entry). The implode() function is used to join the array elements
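For illustration, both ways of reading the cache file might look like this. The sample data written at the top is hypothetical and is only there to make the snippet self-contained.

```php
<?php
$file = 'sql_cache.txt';

// Hypothetical cache contents so the snippet runs on its own.
// The embedded newline shows why the file() lines are joined with ''.
$rows = [['category_name' => "Computers\nand more"]];
file_put_contents($file, serialize($rows));

// PHP 4.3.0 and above: read the whole file into a string.
$records = unserialize(file_get_contents($file));

// Older versions: file() returns an array of lines (newlines kept),
// and implode() joins them back into the original string.
$recordsOld = unserialize(implode('', file($file)));

var_dump($records === $recordsOld); // bool(true)
```

Joining with an empty string matters: file() keeps the line endings, so adding a separator would corrupt the serialized data.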
You are now able to go through the $records array and get the data from the original query:
foreach ($records as $id => $row) {
    print $row['category_name'] . "<br>";
}
Note that the $records array is an array of arrays (a numeric indexed array containing the query
results – each row being a numeric and string indexed array... what a mouthful).
The decision whether to use the cache is time based in this instance. If the file modification
timestamp is greater than the current time less the expiration time set, the cache is used;
otherwise the cache is updated.
Check that the file exists AND that it was modified within the expiry period.
Get the records stored in the cache file or update the cache file.
$file = 'sql_cache.txt';
$expire = 86400; // 24 hours (in seconds)
if (file_exists($file) &&
    filemtime($file) > (time() - $expire)) {
    // Get the records stored in cache
    $records = unserialize(file_get_contents($file));
} else {
    // Create the cache using serialize() function
}
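Putting the pieces together, one possible shape for a reusable helper is sketched below. The function name cached_query() and its $fetch callback are hypothetical; the callback stands in for whatever code runs the real SQL query and returns the rows as an array.

```php
<?php
// Hypothetical helper: return cached rows while the cache file is fresh,
// otherwise call $fetch to run the query and rewrite the cache.
function cached_query($file, $expire, callable $fetch)
{
    if (file_exists($file) && filemtime($file) > (time() - $expire)) {
        // Cache hit: the file is younger than $expire seconds.
        return unserialize(file_get_contents($file));
    }

    // Cache miss or expired: fetch fresh rows and store them.
    $records = $fetch();
    // LOCK_EX keeps concurrent requests from interleaving their writes.
    file_put_contents($file, serialize($records), LOCK_EX);
    return $records;
}

// Usage: the closure counts how often the "database" is really queried.
$file = sys_get_temp_dir() . '/sql_cache_demo.txt';
@unlink($file); // start from an empty cache for the demonstration
$calls = 0;
$fetch = function () use (&$calls) {
    $calls++;
    return [['category_id' => 1, 'category_name' => 'Computers']];
};

$first  = cached_query($file, 86400, $fetch); // runs the callback
$second = cached_query($file, 86400, $fetch); // served from the file
echo $calls, "\n"; // 1
```

Passing the query as a callback keeps the caching logic independent of the database extension in use.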
Possible Additions
Storing cache results in shared memory for faster retrieval.
Adding a function that runs the SQL query randomly and checks if output is the same as
cached output, if not, the cache is updated (this function can be given the probability of running
once in every 100 script executions). Using a hashing algorithm (such as MD5()) can assist in
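As a rough sketch of the random-check idea, the hypothetical helper below re-runs the query with a given probability and rewrites the cache only when the MD5 hash of the fresh result differs from the hash of the cached copy. The name refresh_if_changed(), the $fetch callback and the 1-in-100 default are illustrative assumptions.

```php
<?php
// Hypothetical helper. Returns true when the cache file was rewritten.
function refresh_if_changed($file, callable $fetch, $oneIn = 100)
{
    // Run the check roughly once in every $oneIn calls.
    if (mt_rand(1, $oneIn) !== 1) {
        return false;
    }

    // $fetch stands in for the code that runs the real SQL query.
    $fresh  = serialize($fetch());
    $cached = file_exists($file) ? file_get_contents($file) : '';

    // Compare MD5 hashes of the fresh and the cached serialized results.
    if (md5($fresh) === md5($cached)) {
        return false; // nothing changed, keep the existing cache
    }

    file_put_contents($file, $fresh, LOCK_EX);
    return true;
}
```

A variation would store the hash next to the cache file, so that the check does not have to read the whole cached result each time.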
<?php
// Assumption: $link is an open mysqli connection. The original relied on the
// mysql_* functions, which were removed in PHP 7.

// Build the cache file name from a hash of the query, because raw SQL
// contains characters (such as "*") that are not safe in file names.
function cacheFileFor($sql) {
    return "cache/" . md5($sql) . ".txt";
}

// Run the query, print the results and store the printed output in the cache.
function mysqlGetNew($link, $sql) {
    ob_start();
    if (($result = mysqli_query($link, $sql))) {
        while (($rs = mysqli_fetch_array($result))) {
            echo $rs[0];
            // your results printed
        }
    }
    // Every "echo" and "print" since ob_start() is captured in $output.
    // (Alternatively, store the SQL result itself instead of the output.)
    $output = ob_get_contents();
    ob_end_flush();
    // Next time the file is ready, so MySQL does not have to be queried.
    $file = @fopen(cacheFileFor($sql), "w");
    if ($file) {
        fputs($file, $output);
        fclose($file);
    }
}

// We start here.
$sql = "select * from database order by id desc LIMIT 0, 50";
$cacheFile = cacheFileFor($sql);

if (file_exists($cacheFile)) {
    // Compare the calendar date of the file with today's date.
    $filemodtime = date("Ymd", filemtime($cacheFile));
    $nowtime = date("Ymd");
    if ($nowtime > $filemodtime) {
        // Written on an earlier day? Get new results.
        print "<!-- non cached $filemodtime / $nowtime -->";
        mysqlGetNew($link, $sql);
    } else {
        // Get it from the cache!
        print "<!-- cached $filemodtime / $nowtime -->";
        print file_get_contents($cacheFile);
    }
} else {
    // The file does not exist, so create it by calling the function.
    print "<!-- error non cached -->";
    mysqlGetNew($link, $sql);
}
?>