Varnish Book
[Figure: Varnish 4 request state machine overview — the request flows through vcl_recv, vcl_hash, vcl_hit, vcl_miss, vcl_pass, vcl_pipe, vcl_purge, vcl_synth, vcl_backend_response, vcl_backend_error and vcl_deliver, with cacheable / hit-for-pass / do-not-cache decisions on the fetched response]

[Figure: detailed client worker thread flow — cnt_restart, cnt_recv (vcl_recv), cnt_lookup (hit, miss, hit-for-pass, busy/waiting list), cnt_miss (vcl_miss), cnt_pass (vcl_pass), cnt_pipe (vcl_pipe), cnt_purge (vcl_purge), cnt_deliver (vcl_deliver) and cnt_synth (vcl_synth), with their possible return actions such as lookup, fetch, pass, pipe, synth, restart and deliver]

[Figure: backend fetch flow — vbf_stp_startfetch (vcl_backend_fetch, vcl_backend_response), vbf_stp_error (vcl_backend_error), vbf_stp_condfetch (conditional fetch on 304) and vbf_stp_fetch, ending in FETCH_DONE, FETCH_FAIL or RETRY]
Authors:
Copyright:
Versions:
Date:
License:
Contact:
Web:
Source:
Contents
1 Introduction  17
2 Design Principles  24
3 Getting Started  28
3.7 Exercise: Use the administration interface to learn, review and set Varnish parameters  45
4.3 Transactions  50
4.5 Exercise  55
4.6 varnishstat  56
5.9 Timers  83
6 HTTP  87
6.2 Requests  89
6.4 Response  91
6.10 Expires  97
6.11 Cache-Control  98
6.12 Last-Modified  100
6.13 If-Modified-Since  101
6.14 If-None-Match  102
6.15 Etag  103
6.16 Pragma  104
6.17 Vary  105
6.18 Age  106
7 VCL Basics  110
7.1 Varnish Finite State Machine  111
7.1.1 Waiting State  113
7.2 Detailed Varnish Request Flow for the Client Worker Thread  114
8.2.1 hit-for-pass  132
8.3.1 vcl_backend_response  135
8.3.4 Example: Cache .jpg for 60 seconds only if s-maxage is not present  139
9.3 Banning  166
10 Saving a Request  179
10.1 Directors  180
11 Content Composition  193
11.2 Cookies  195
13 Appendix A: Resources  222
14.1 varnishtop  224
14.2 varnishncsa  225
14.3 varnishhist  226
15.1 ajax.html  229
15.2 article.php  230
15.3 cookies.php  231
15.4 esi-top.php  232
15.5 esi-user.php  233
15.6 httpheadersexample.php  235
15.7 purgearticle.php  238
15.8 test.php  239
15.9 set-cookie.php  240
Abstract
The Varnish Book is the training material for Varnish Plus courses. The book teaches technical staff
how to use Varnish Cache 4 and selected modules of Varnish Plus effectively.
The book explains how to get started with Varnish and with its Varnish Configuration Language (VCL).
It covers procedures such as using VCL functions effectively, cache invalidation, and more. Also
included are Varnish utility programs such as varnishlog, and extra material.
Preface
Course for Varnish Plus
Learn specific features depending on the course and your needs
Necessary Background
How to Use the Book
Acknowledgements
After finishing this course, you will be able to install and configure the Varnish server, and
write effective VCL code. The Varnish Book is designed for attendees of Varnish Plus courses.
Varnish Plus is a commercial suite by Varnish Software that offers products for scalability,
customization, monitoring, and expert support services. The engine of Varnish Plus is Varnish Cache
Plus, which is the enhanced commercial edition of Varnish Cache.
Varnish Cache Plus should not be confused with Varnish Plus, a product offering by Varnish
Software. Varnish Cache Plus is one of the software components available for Varnish Plus
customers.
Most of the material presented in this book applies to both the open source Varnish Cache and the
commercial edition Varnish Cache Plus. Therefore, you can also refer to the Varnish Cache
documentation at https://www.varnish-cache.org/docs/4.0/.
For simplicity, the book refers to Varnish Cache or Varnish Cache Plus as Varnish when there is no
difference between them. There is more information about differences between Varnish Cache and
Varnish Cache Plus in the Varnish Cache and Varnish Plus chapter.
The goal of this book is to make you confident when using Varnish. Varnish instructors focus on your
area, needs, or interests; Varnish courses are usually flexible enough to make room for them.
The instructor will cover selected material for the course you take. The System Administration
(Admin) course provides attendees with the necessary knowledge to troubleshoot and tune common
parameters of a Varnish server. The Web Developer (Webdev) course teaches how to adapt web
applications so that they work with Varnish, which guarantees a fast experience for visitors of any
website. Besides that, other courses may also be taught with this book.
Necessary Background
The Admin course requires that you:
have expertise in a shell on a Linux/UNIX machine, including editing text files and starting
daemons,
understand HTTP cache headers,
understand regular expressions, and
be able to install the software listed below.
The Webdev course requires that you:
have expertise in a shell on a Linux/UNIX machine, including editing text files and starting
daemons,
understand HTTP cache headers,
understand regular expressions, and
be able to install the software listed below.
You do not need a background in the theory or application behind Varnish to complete this course.
However, it is assumed that you have experience and expertise in basic UNIX commands, and that
you can install the following software:
Varnish Cache 4.x or Varnish Cache Plus 4.x,
Apache/2.4 or later,
HTTPie 0.8.0 or later,
PHP 5.4 or later, and
curl command line tool for transferring data with URL syntax.

More specific required skills depend on the course you take. The book starts with the installation of
Varnish and navigation of some of the common configuration files. This part is perhaps the most
UNIX-centric part of the course.
/path/to/yourfile. Important notes, tips and warnings are also inside boxes, but they use the
normal bodytext font type.
Acknowledgements
In addition to the authors, the following deserve special thanks (in no particular order):
Rubén Romero
Dag Haavi Finstad
Martin Blix Grydeland
Reza Naghibi
Federico G. Schwindt
Dridi Boukelmoune
Lasse Karstensen
Per Buer
Sevan Janiyan
Kacper Wysocki
Magnus Hagander
Poul-Henning Kamp
Everyone who has participated on the training courses
Chapter 1 Introduction
1 Introduction
What is Varnish?
Benefits of Varnish
Open source / Free software
Varnish Software: The company
What is Varnish Plus?
Varnish: more than a cache server
History of Varnish
Varnish Governance Board (VGB)
Table 1: Varnish Cache and Varnish Plus components covered in this book

Component          Varnish Cache   Varnish Plus
VCL                Yes             Yes
varnishlog         Yes             Yes
varnishadm         Yes             Yes
varnishncsa        Yes             Yes
varnishstat        Yes             Yes
varnishhist        Yes             Yes
varnishtest        Yes             Yes
varnishtop         Yes             Yes
directors          Yes             Yes
purge              Yes             Yes
ban                Yes             Yes
                   Yes             Yes
vagent2            Yes             Yes
                   No              Yes
                   No              Yes
Varnish Tuner      No              Yes
                   No              Yes
                   No              Yes
                   No              Yes
SSL/TLS Support    No              Yes
Varnish Cache is an open source project and free software. The development process is public, and
everyone can submit patches, or just take a peek at the code if there is any uncertainty about how
Varnish Cache works. There is a community of volunteers who help each other and newcomers.
The BSD-like license used by Varnish Cache does not place significant restrictions on reuse of the
code, which makes it possible to integrate Varnish Cache into virtually any solution.
Varnish Cache is developed and tested on GNU/Linux and FreeBSD. The code base is kept as
self-contained as possible to avoid introducing outside bugs and unneeded complexity. Therefore,
Varnish uses very few external libraries.
Varnish Software is the company behind Varnish Cache. Varnish Software and the Varnish
community maintain a package repository of Varnish Cache for several common GNU/Linux
distributions.
Varnish Software also provides a commercial suite called Varnish Plus with software products for
scalability, customization, monitoring and expert support services. The engine of the Varnish Plus
commercial suite is the enhanced commercial edition of Varnish Cache. This edition is proprietary
and it is called Varnish Cache Plus.
Table 1 shows the components covered in this book and their availability for Varnish Cache users
and Varnish Plus customers. The covered components of Varnish Plus are described in the Varnish
Plus Software Components chapter. For more information about the complete Varnish Plus offer,
please visit https://www.varnish-software.com/what-is-varnish-plus.
At the time of writing, Varnish Cache supports the operating systems and Linux distributions
listed in Table 2.
Table 2: Varnish Cache and Varnish Plus supported platforms

Platform      Varnish Cache   Varnish Plus
              Deprecated      Deprecated
              Yes             Yes
              Coming soon     Coming soon
              Yes             Yes
              Yes             Yes
              Yes             Yes
FreeBSD 9     Yes             No
FreeBSD 10    Yes             No
Note
Varnish Cache Plus should not be confused with Varnish Plus, a product offering by Varnish
Software. Varnish Cache Plus is one of the software components available for Varnish Plus
customers.
and you must explicitly return it:

replaced by std.port(client.ip) and

is renamed to thread_pool_destroy_delay and it is in
2 Design Principles
Varnish is designed to:
Solve real problems
Run on modern hardware (64-bit multi-core architectures)
Work with the kernel, not against it
Translate Varnish Configuration Language (VCL) to C programming language
Be extendible via Varnish Modules (VMODs)
Reduce lock-contention via its workspace-oriented shared memory model
The focus of Varnish has always been performance and flexibility. Varnish is designed for the
hardware you buy today, not the hardware you bought 15 years ago. This is a trade-off to gain a
simpler design and to focus resources on modern hardware. Varnish is designed to run on 64-bit
architectures and scales almost proportionally to the number of CPU cores you have available.
CPU power is, however, rarely a problem.
Compared to 64-bit systems, 32-bit systems allow you to allocate less virtual memory space and
fewer threads. The theoretical maximum space depends on the operating system (OS) kernel, but
32-bit systems are usually limited to 4 GB. In practice, you may get about 3 GB, because the OS
reserves some space for the kernel.
Varnish uses a workspace-oriented memory-model instead of allocating the exact amount of space it
needs at run-time. Varnish does not manage its allocated memory, but it delegates this task to the
OS because the kernel can normally do this task better than a user-space program.
Event filters and notifications facilities such as epoll and kqueue are advanced features of the OS
that are designed for high-performance services like Varnish. By using these, Varnish can move a lot
of the complexity into the OS kernel which is also better positioned to decide which threads are
ready to execute and when.
In addition, Varnish uses a configuration language (VCL) that is translated to C programming
language code. This code is compiled with a standard C compiler and then dynamically linked
directly into Varnish at run-time. This has several advantages. The most practical is the freedom you
get as system administrator.
You can use VCL to decide how you want to interact with Varnish, instead of having a developer
trying to predict every possible caching scenario. The fact that VCL is translated to C code gives
Varnish very high performance. You can also bypass the code translation and write raw C code;
this is called in-line C in VCL. In short: VCL allows you to specify exactly how to use and
combine the features of Varnish.
Varnish allows integration of Varnish Modules or simply VMODs. These modules let you extend the
functionality of VCL by pulling in custom-written features. Some examples include non-standard
header manipulation, access to memcached or complex normalization of headers.
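As an illustration, a VMOD is pulled into VCL with an import statement. The sketch below uses the bundled std VMOD; the Host-header lower-casing shown is just one example of header normalization, not a required configuration:

import std;

sub vcl_recv {
    # Normalize the Host header to lower case so that
    # "EXAMPLE.COM" and "example.com" are treated the same.
    set req.http.Host = std.tolower(req.http.Host);
}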
The shared memory log (SHMLOG) allows Varnish to log large amounts of information at almost no
cost by having other applications parse the data and extract the useful bits. This design and other
mechanisms decrease lock-contention in the heavily threaded environment of Varnish.
To summarize: Varnish is designed to run on modern hardware under real work-loads and to solve
real problems. Varnish does not cater to the "I want to make Varnish run on my 486 just
because"-crowd. If it does work on your 486, then that's fine, but that's not where you will see our
focus. Nor will you see us sacrifice performance or simplicity for the sake of niche use-cases that can
easily be solved by other means -- like using a 64-bit OS.
3 Getting Started
In this chapter, you will:
learn about the Varnish distribution,
install Varnish and Apache,
configure Varnish to use Apache as backend, and
cover basic configuration.
Most of the commands you will type in this course require root privileges. You can get temporary
root privileges by typing sudo <command>, or permanent root privileges by typing sudo -i.
In Varnish terminology, a backend server is whatever server Varnish talks to fetch content. This can
be any sort of service as long as it understands HTTP. Most of the time, Varnish talks to a web server
or an application frontend server. In this book, we use backend, web server or application frontend
server interchangeably.
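In VCL, a backend is declared with a host and a port. A minimal declaration pointing Varnish at a web server on the same machine might look like the following sketch; the address and port are placeholders for your own setup:

vcl 4.0;

backend default {
    .host = "127.0.0.1";   # address of the backend server
    .port = "8080";        # port where the web server listens
}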
Note
There is a delay in the log process, but it is usually not noticeable.
varnish
  SysV, Ubuntu/Debian:                 /etc/default/varnish
  SysV, RHEL/CentOS:                   /etc/sysconfig/varnish
  systemd, Ubuntu/Debian:              /etc/systemd/system/varnish.service [1]
  systemd, Fedora/RHEL 7+/CentOS 7+:   /etc/varnish/varnish.params

varnishlog
  SysV, Ubuntu/Debian:                 /etc/default/varnishlog
  SysV, RHEL/CentOS:                   [2]
  systemd, Ubuntu/Debian:              /etc/systemd/system/varnishlog.service [1]
  systemd, Fedora/RHEL 7+/CentOS 7+:   [3]

varnishncsa
  SysV, Ubuntu/Debian:                 /etc/default/varnishncsa
  SysV, RHEL/CentOS:                   [2]
  systemd, Ubuntu/Debian:              /etc/systemd/system/varnishncsa.service [1]
  systemd, Fedora/RHEL 7+/CentOS 7+:   [3]

[1] The file does not exist by default. Copy it from /lib/systemd/system/ and edit it.
[2] There is no configuration file. Use the command chkconfig varnishlog/varnishncsa on/off instead.
[3] There is no configuration file. Use the command systemctl start/stop/enable/disable varnishlog/varnishncsa instead.
Program   Result               Configuration file
Varnish   Listens on port 80   /etc/default/varnish *
Varnish                        /etc/varnish/default.vcl
Apache                         /etc/apache2/ports.conf * and /etc/apache2/sites-enabled/000-default *
to 8080 in /etc/apache2/ports.conf and
Plus and VMODs in
# Remember to replace DISTRO and RELEASE with what applies to your system.
# distro=(debian|ubuntu), RELEASE=(precise|trusty|wheezy|jessie)
# Varnish Cache Plus 4.0 and VMODs
deb https://<username>:<password>@repo.varnish-software.com/DISTRO RELEASE \
varnish-4.0-plus
# non-free contains VAC, VCS, Varnish Tuner and proprietary VMODs.
deb https://<username>:<password>@repo.varnish-software.com/DISTRO RELEASE \
non-free
Then:
apt-get update
apt-get install varnish-plus
All software related to Varnish Cache Plus, including VMODs, is available in RedHat and Debian
package repositories. These repositories are available at http://repo.varnish-software.com/, using
your customer specific username and password.
Varnish is already distributed in many package repositories, but those packages might contain an
outdated Varnish version. Therefore, we recommend using the packages provided by
varnish-software.com for Varnish Cache Plus or varnish-cache.org for Varnish Cache. Please be
advised that we only provide packages for LTS releases, not all the intermediate releases. However,
these packages might still work fine on newer releases.
To use Varnish Cache Plus repositories on RHEL 6, put the following in
/etc/yum.repos.d/varnish-4.0-plus.repo:

[varnish-4.0-plus]
name=Varnish Cache Plus
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat/varnish-4.0-plus/el$releasever
enabled=1
gpgcheck=0
[varnish-admin-console]
name=Varnish Administration Console
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat/vac/el$releasever
enabled=1
gpgcheck=0

[varnishtuner]
name=Varnish Tuner
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat/varnishtuner/el$releasever
enabled=1
gpgcheck=0
If you want to install Varnish Cache in Ubuntu change the corresponding above lines to:
curl https://repo.varnish-cache.org/ubuntu/GPG-key.txt | apt-key add -
echo "deb https://repo.varnish-cache.org/ubuntu/ trusty varnish-4.0" >> \
    /etc/apt/sources.list.d/varnish-cache.list
apt-get update
apt-get install varnish
Change the Linux distribution and Varnish Cache release in the needed lines.
0      0 0.0.0.0:80        0.0.0.0:*   LISTEN   9223/varnishd
0      0 127.0.0.1:1234    0.0.0.0:*   LISTEN   9221/varnishd
Note
We recommend you to disable Security-Enhanced Linux (SELinux). If you prefer otherwise,
then set the boolean varnishd_connect_any variable to 1. You can do that by executing
the command sudo setsebool varnishd_connect_any 1. Also, be aware that SELinux
defines the ports 6081 and 6082 for varnishd.
Tip
Issue the command man vcl to see all available options to define a backend.
Tip
You can also configure Varnish via the Varnish Administration Console (VAC).
Figure 3: GUI to configure Varnish via the Varnish Administration Console (VAC).
Tip
Varnish provides many on-line reference manuals. To learn more about varnishadm, issue
man varnishadm. To check the Varnish CLI manual page, issue man varnish-cli.
Tip
You can also access the varnishadm via the Varnish Administration Console (VAC). To do
that, you just have to navigate to the CONFIGURE tab and click on the Varnish server you
want to administrate. Then, varnishadm is ready to use in a terminal emulator right in your
web browser.
Figure 4: Access to varnishadm by clicking on the Varnish server that you want to
administrate.
Restart Required
  Yes
Tunable parameters
  No (if changed in varnishadm)
Configuration in VCL
  No
  Yes
Result
  varnishadm vcl.load <configname> <filename> and varnishadm vcl.use <configname>
There are other ways to reload VCL and make parameter-changes take effect, mostly using the
varnishadm tool. However, using the service varnish reload and service varnish restart
commands is a good habit.
Note
If you want to know how the service varnish-commands work, look at the script that runs
behind the scenes. The script is in /etc/init.d/varnish.
Warning
The varnish script-configuration (located under /etc/default/ or /etc/sysconfig/) is directly
sourced as a shell script. Pay close attention to any backslashes (\) and quotation marks that
might move around as you edit the DAEMON_OPTS environment variable.
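For illustration, a typical DAEMON_OPTS entry in such a script-configuration might look as follows; note how every continuation line ends in a backslash. All values here are examples, not recommendations:

# /etc/default/varnish (SysV-style); this file is sourced as a shell script
DAEMON_OPTS="-a :6081 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,256m"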
-f <filename>
VCL file
-p <parameter=value>
-S <secretfile>
-T <hostname:port>
-s <storagetype,options>
All the options that you can pass to the varnishd binary are documented in the varnishd(1)
manual page (man varnishd). You may want to take a moment to skim over the options mentioned
above.
For Varnish to start, you must specify a backend. You can specify a backend in two ways: 1) declare
it in a VCL file, or 2) use -b to declare a backend when starting varnishd.
Though they are not strictly required, you almost always want to specify a -s to select a storage
backend, -a to make sure Varnish listens for clients on the port you expect and -T to enable a
management interface, often referred to as a telnet interface.
For both -T and -a, you do not need to specify an IP, but can use :80 to tell Varnish to listen to
port 80 on all IPs available. Make sure you do not forget the colon, as -a 80 tells Varnish to listen to
the IP with the decimal-representation "80", which is almost certainly not what you want. This is a
result of the underlying function that accepts this kind of syntax.
You can specify -p for parameters multiple times. The workflow for tuning Varnish parameters
usually is that you first try the parameter on a running Varnish through the management interface to
find the value you want. Then, you store the parameter and value in a configuration file. This file is
read every time you start Varnish.
The -S option specifies a file which contains a secret to be used for authentication. This can be used
to authenticate with varnishadm -S as long as varnishadm can read the same secret file -- or
rather the same content: The content of the file can be copied to another machine to allow
varnishadm to access the management interface remotely.
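Putting these options together, a varnishd invocation might look like the following sketch; the paths, ports and storage size are illustrative placeholders:

varnishd -a :80 \
         -T localhost:6082 \
         -f /etc/varnish/default.vcl \
         -S /etc/varnish/secret \
         -s malloc,256m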
Note
Varnish requires that you specify a backend. A backend is normally specified in the VCL file.
You specify the VCL file with the -f option. However, it is possible to start Varnish without a
VCL file by specifying the backend server with the -b <hostname:port> option instead.
Since the -b option is mutually exclusive with the -f option, we use only the -f option. You
can use -b if you do not intend to specify any VCL and only have a single backend server.
Tip
Type man varnishd to see all options of the Varnish daemon.
Tip
You can also add and edit your VCL code via the Varnish Administration Console (VAC). This
interface also allows you to administrate your VCL files.
Figure 6: GUI of Varnish Administration Console (VAC) with command line interface to edit your
VCL code.
Tip
All utility programs to display Varnish logs have installed reference manuals. Use the man
command to retrieve their manual pages.
4.3 Transactions
One transaction is one work item in Varnish.
Work items share a single Varnish Transaction ID (VXID) per transaction. Examples of transaction
types are:
Session
Client request
Backend request
Transaction reasons. Examples:
ESI request
restart
fetch
A transaction is a set of log lines that belongs together, e.g. a client request or a backend request.
The Varnish Transaction IDs (VXIDs) are applied to lots of different kinds of work items. A unique
VXID is assigned to each type of transaction. You can use the VXID when you view the log through
varnishlog.
The default is to group the log by VXID. When viewing a log for a simple cache miss, you can see the
backend request, the client request and then the session. They are displayed in the order they end.
Some people find it a bit counter-intuitive that the backend request is logged before the client
request, but if you think about it, it makes sense.
Taglists are not case-sensitive, but we recommend following the same format as declared in
man vsl.
The grouping and the query log processing all happens in the Varnish logging API. This means that
other programs using the varnishlog API automatically get grouping and query language.
Tip
man vsl-query shows you more details about query expressions. man vsl lists all taglists
and their syntax.
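For example, a query that shows only request-grouped transactions whose URL matches a pattern might look like the sketch below; the URL pattern is just an illustration, and man vsl-query documents the exact grammar:

varnishlog -g request -q 'ReqURL ~ "^/admin"'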
4.5 Exercise
Make varnishlog only print client-requests where the ReqURL tag contains /favicon.ico.
4.6 varnishstat
Uptime mgt:   1+23:38:08
Uptime child: 1+23:38:08

Hitrate n:        10        100       438
   avg(n):    0.9967     0.5686    0.3870

NAME                                            CURRENT     CHANGE   AVERAGE     AVG_10     AVG_100   AVG_1000
MAIN.uptime                                      171488       1.00      1.00       1.00        1.00       1.00
MAIN.sess_conn                                     1055       7.98         .       8.35        4.49       2.11
MAIN.client_req                                    1055       7.98         .       8.35        4.49       2.11
MAIN.cache_hit                                     1052       7.98         .       8.35        4.49       2.10
MAIN.cache_miss                                       3       0.00         .       0.00        0.00       0.00
MAIN.backend_conn                                     4       0.00         .       0.00        0.00       0.01
MAIN.backend_toolate                                  3       0.00         .       0.00        0.00       0.01
MAIN.backend_recycle                                  4       0.00         .       0.00        0.00       0.01
MAIN.fetch_length                                     4       0.00         .       0.00        0.00       0.01
MAIN.pools                                            2       0.00         .       2.00        2.00       2.00
MAIN.threads                                        200       0.00         .     200.00      200.00     200.00
MAIN.threads_created                                200       0.00         .       0.00        0.00       0.00
MAIN.n_object                                         1       0.00         .       1.00        0.85       0.81
MAIN.n_objectcore                                     3       0.00         .       3.00        2.85       2.81
MAIN.n_objecthead                                     4       0.00         .       4.00        3.89       3.83
MAIN.n_backend                                        1       0.00         .       1.00        1.00       1.00
MAIN.n_expired                                        2       0.00         .       2.00        1.76       1.33
MAIN.s_sess                                        1055       7.98         .       8.35        4.49       2.11
MAIN.s_req                                         1055       7.98         .       8.35        4.49       2.11
MAIN.s_fetch                                          3       0.00         .       0.00        0.00       0.00
MAIN.s_req_hdrbytes                              122380     926.07         .     968.24      520.74     244.35
MAIN.s_resp_hdrbytes                             376249    2854.04      2.00    2982.17     1602.59     751.87
MAIN.s_resp_bodybytes                           3435094   25993.71     20.00   27177.59    14616.67    6858.74
MAIN.backend_req                                      4       0.00         .       0.00        0.00       0.01
MAIN.n_vcl                                            1       0.00         .       0.00        0.00       0.00
MAIN.bans                                             1       0.00         .       1.00        1.00       1.00
MAIN.n_gunzip                                         4       0.00         .       0.00        0.00       0.01
MGT.uptime                                       171488       1.00      1.00       1.00        1.00       1.00
SMA.s0.c_req                                          8       0.00         .       0.00        0.01       0.01
SMA.s0.c_bytes                                    15968       0.00         .       0.01       18.98      27.33
SMA.s0.c_freed                                    11976       0.00         .       0.00       12.17      18.56
SMA.s0.g_alloc                                        2       0.00         .       2.00        1.70       1.62
SMA.s0.g_bytes                                     3992       0.00         .    3991.87     3398.82    3235.53
SMA.s0.g_space                                268431464       0.00         .  268431464.13  268432057.18  268432220.47
VBE.default(127.0.0.1,,8080).bereq_hdrbytes         630       0.00         .       0.00        0.70       1.13
VBE.default(127.0.0.1,,8080).beresp_hdrbytes       1128       0.00         .       0.00        1.34       1.93
VBE.default(127.0.0.1,,8080).beresp_bodybytes     13024       0.00         .       0.01       15.48      22.29

MAIN.cache_hit                                                                                          INFO
Cache hits:
Count of cache hits. A cache hit indicates that an object has been delivered to a client without
fetching it from a backend server.
Column     Description
Name
Current
Change     The average per second change over the last update interval.
Average    The average value of this counter over the runtime of the Varnish daemon, or a
           period if the counter can't be averaged.
Avg_10
Avg_100
Avg_1000
varnishstat looks only at counters. These counters are easily found in the SHMLOG, and are
typically polled at reasonable interval to give the impression of real-time updates. Counters, unlike
the rest of the log, are not directly mapped to a single request, but represent how many times a
specific action has occurred since Varnish started.
varnishstat gives a good representation of the general health of Varnish. Unlike all other tools,
varnishstat does not read log entries, but counters that Varnish updates in real-time. It can be
used to determine your request rate, memory usage, thread usage, number of failed backend
connections, and more. varnishstat gives you information just about anything that is not related
to a specific request.
There are over a hundred different counters available. To increase the usefulness of varnishstat,
only counters with a value different from 0 are shown by default.
varnishstat can be used interactively, or it can display the current values of all the counters with
the -1 option. Both methods allow you to specify specific counters using
-f field1 -f field2 ... to limit the list.
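As an illustration, the two modes might be invoked like this; the counter names are examples taken from the output above:

# interactive mode, limited to two counters
varnishstat -f MAIN.cache_hit -f MAIN.cache_miss

# one-shot dump of all counters
varnishstat -1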
In interactive mode, varnishstat has three areas. The top area shows process uptime and hitrate
information. The center area shows a list of counter values. The bottom area shows the description
of the currently selected counter.
Hitrate n and avg(n) are related, where n is the number of intervals. avg(n) measures the cache hit
rate within n intervals. The default interval time is one second. You can configure the interval
time with the -w option.
Since there is no historical data of counter changes, varnishstat has to compute the average
while it is running. Therefore, when you start varnishstat, the Hitrate intervals start at 1 and
then grow toward 10, 100 and 1000. In the example above, the interval is one second. The hitrate
averages avg(n) show data for the last 10, 100, and 438 seconds. The average hitrate is 0.9967
(or 99.67%) for the last 10 seconds, 0.5686 for the last 100 seconds and 0.3870 for the last
438 seconds.
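The windowed averages can be modeled roughly as follows. This is an illustrative sketch in Python, not Varnish's actual implementation (which updates its averages incrementally rather than keeping all samples):

```python
def hitrate_windows(samples, windows=(10, 100, 1000)):
    """Average hit rate over the last n intervals for each window size.

    `samples` is a sequence of (hits, requests) pairs, one per interval.
    With fewer than n samples, the average covers only what has been
    seen so far, mirroring how varnishstat grows its window toward n.
    """
    out = {}
    for n in windows:
        recent = list(samples)[-n:]           # keep only the last n intervals
        hits = sum(h for h, _ in recent)
        reqs = sum(r for _, r in recent)
        out[n] = hits / reqs if reqs else 0.0
    return out
```

For example, five all-hit intervals followed by five all-miss intervals yield a 10-interval hit rate of 0.5.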
In the above example Varnish has served 1055 requests and is currently serving roughly 7.98
requests per second. Some counters do not have "per interval" data, but are gauges with values that
increase and decrease. Gauges normally start with a g_ prefix.
Tip
You can also see many parameters in real-time graphs with the Varnish Administration
Console (VAC). Here are screenshots:
Tip
If you need to collect statistics from more than a single Varnish server, Varnish Custom
Statistics (VCS) allows you to do that. In addition, VCS allows you to define your metrics to
collect and analyze aggregated statistics, for example:
A/B testing
Measuring click-through rate
Track slow pages and cache misses
Analyze what is "hot" right now in a news website
Track changes in currency conversions in e-commerce
Track changes in Stock Keeping Units (SKUs) behavior in e-commerce
Track number of unique consumers of HLS/HDS/DASH video streams
Counter                  Description
MAIN.threads_limited     Counts how many times varnishd hit the maximum allowed number of
                         threads. The maximum number of Varnish threads is given by the
                         parameter thread_pool_max. Issue the command
                         varnishadm param.show thread_pool_max to see this parameter.
MAIN.threads_failed
MAIN.thread_queue_len
MAIN.sess_queued         Number of sessions that were queued because no thread was
                         immediately available. Consider increasing the thread_pool_min
                         parameter.
MAIN.sess_dropped
MAIN.n_lru_nuked         Number of least recently used (LRU) objects thrown out to make room
                         for new objects. If this is zero, there is no reason to enlarge your
                         cache. Otherwise, your cache is evicting objects due to space
                         constraints; in this case, consider increasing the size of your cache.
MAIN.n_object
MAIN.client_req
MAIN.losthdr
Varnish provides a large number of counters for information and debugging purposes. Table 8
presents counters that are typically important. Other counters may be relevant only for Varnish
developers when providing support.
Counters also provide feedback to Varnish developers on how Varnish works in production
environments. This feedback in turn allows Varnish to be developed according to its real usage. Issue
varnishstat -1 to list all counters with their current values.
Tip
Remember that Varnish provides many reference manuals. To see all Varnish counter field
definitions, issue man varnish-counters.
Chapter 5 Tuning
5 Tuning
This chapter is for the system administration course only
This section covers:
Architecture
Best practices
Parameters
Perhaps the most important aspect of tuning Varnish is writing effective VCL code. For now, however,
we will focus on tuning Varnish for your hardware, operating system and network. To be able to do
that, knowledge of Varnish architecture is helpful.
It is important to know the internal architecture of Varnish for two reasons. First, the architecture is
chiefly responsible for the performance, and second, it influences how you integrate Varnish in your
own architecture.
There are several aspects of the design that were unique to Varnish when it was originally
implemented. Truly good solutions, regardless of whether they reuse ancient ideas or come up with
something radically different, are the aim of Varnish.
Figure 14: Varnish Agent's HTML interface; designed to showcase the various features of the Varnish
Agent.
For more information about vagent2 and installation instructions, please visit
https://github.com/varnish/vagent2.
Varnish Software has a commercial offering of a fully functional web UI called Varnish Administration
Console (VAC). For more information about VAC, refer to the Varnish Administration Console (VAC)
section.
Note
Even if you do not perceive a lengthy service downtime, you should check whether the
Varnish child is being restarted. This is important, because every child restart empties the
cache and therefore introduces extra loading time. Automatic restarts are logged
into /var/log/syslog.
To verify that the child process is not being restarted, you can also check its lifetime with the
MAIN.uptime counter in varnishstat.
Varnish Software and the Varnish community at large occasionally get requests for assistance
with performance tuning Varnish that turn out to be crash issues.
Note
As a rule of thumb use: malloc if it fits in memory, or file otherwise. Expect around 1kB of
overhead per object cached.
They approach the same basic problem from different angles. With the -s malloc method, Varnish
will request the entire size of the cache with a malloc() (memory allocation) library call. The operating
system divides the cache between memory and disk by swapping out what it can't fit in memory.
Another possibility is to use the -s file storage backend. This option creates a file on a filesystem
to contain the entire cache. Then, the operating system maps the entire file into memory if possible.
The -s file storage method does not retain data when you stop or restart Varnish! For this
purpose, Varnish provides a persistence option -s persistent. The usage of this option, however,
is strongly discouraged mainly because of the consistency issues that arise with it.
The Varnish Massive Storage Engine (MSE) is an improved storage backend for Varnish Plus only. Its
main improvements are decreased disk IO load and lower storage fragmentation. MSE is designed
and tested with storage sizes up to 10 TB.
When choosing a storage backend, use malloc if your cache will be contained entirely or mostly in
memory. If your cache will exceed the available physical memory, you have two options: file or MSE.
We recommend MSE because it performs much better than the file storage backend.
It is important to keep in mind that the size you specify with the -s option is the size of the actual
cache. Varnish has an overhead on top of this for keeping track of the cache, so the actual memory
footprint of Varnish will exceed what the -s argument specifies if the cache is full. The current
estimate (subject to change between Varnish versions) is that about 1kB of overhead is needed for
each object. For 1 million objects, that means 1GB of extra memory usage.
In addition to the per-object overhead, there is also a fairly static overhead, which you can measure
by starting Varnish without any objects. It is typically around 100MB.
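This arithmetic can be sketched with a few lines of Python. This is an illustrative estimate only; the 1kB per-object and 100MB static figures are the rough estimates quoted above and vary between Varnish versions:

```python
def estimated_footprint_bytes(cache_size_bytes, n_objects,
                              per_object_overhead=1024,          # ~1kB per object (estimate above)
                              static_overhead=100 * 1024 ** 2):  # ~100MB baseline (estimate above)
    """Rough memory footprint of varnishd when the cache is full."""
    return cache_size_bytes + n_objects * per_object_overhead + static_overhead

# 1 million cached objects add roughly 1GB on top of the -s size:
extra = estimated_footprint_bytes(0, 1_000_000, static_overhead=0)
print(extra)  # 1024000000 bytes, i.e. about 1GB of per-object overhead
```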
Warning
Some Varnish distributions set up the file storage backend option -s file by default. These
distributions set a path that puts the storage file in the same directory as the shm-log. We
discourage this practice.
Tip
Parameters can also be configured via the Varnish Administration Console (VAC) as shown in
the figure below.
Figure 15: GUI to configure parameters via the Varnish Administration Console (VAC).
Warning
Copying Varnish Tuner suggestions to other systems might not be a good idea.
Put the following lines into your yum repository configuration (for example, a file under
/etc/yum.repos.d/):

[varnishtuner]
name=Varnishtuner
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat/ \
varnishtuner/el6
enabled=1
gpgcheck=0
The child process runs multiple threads in two thread pools. The threads of these pools are called
worker threads. Table 9 presents the relevant threads.

Table 9: Relevant threads

Thread name                              Task
cache-worker                             Handle requests
cache-main                               Startup
ban lurker                               Clean bans
acceptor                                 Accept new connections
epoll/kqueue (configurable, default: 2)  Manage thread pools
expire                                   Remove old content
backend poll                             Health checks
Table 10: Threading parameters

Parameter                  Default
thread_pool_add_delay      2 [milliseconds]
thread_pool_destroy_delay  1 [second]
thread_pool_fail_delay     200 [milliseconds]
thread_pool_max            500 [threads]
thread_pool_min            5 [threads]
thread_pool_stack          65536 [bytes]
thread_pool_timeout        300 [seconds]
thread_pools               2 [pools]
thread_queue_limit         20 [requests]
thread_stats_rate          10 [requests]
workspace_thread           2 [kb]
To tune Varnish, think about the expected traffic. The most important thread setting is the number
of cache-worker threads, which you control with thread_pool_min and thread_pool_max. These
parameters are per thread pool.

Although the Varnish threading model allows you to use multiple thread pools, we recommend that
you do not modify this parameter. Time and experience show that two thread pools are enough;
adding more pools will not increase performance.
Note
If you run across the tuning advice that suggests to have a thread pool per CPU core, rest
assured that this is old advice. Experiments and data from production environments have
revealed that as long as you have two thread pools (which is the default), there is nothing to
gain by increasing the number of thread pools. Still, you may increase the number of threads
per pool.
All other thread variables are not configurable.
Warning
New threads use preallocated workspace. If threads do not have enough workspace, the child
process is unable to process the task and terminates. The workspace needed depends on the
task that the thread handles, which is normally defined in your VCL. To avoid child terminations,
evaluate your VCL code and consider increasing the workspace_client or workspace_backend
parameter.
5.9 Timers
Table 11: Timers

Parameter              Default   Description               Scope
connect_timeout        3.5 [s]   OS/network latency        Backend
first_byte_timeout     60 [s]                              Backend
between_bytes_timeout  60 [s]    Hiccoughs                 Backend
send_timeout           600 [s]   Client-in-tunnel          Client
timeout_idle           5 [s]     keep-alive timeout        Client
timeout_req            2 [s]                               Client
cli_timeout            60 [s]    Management thread->child  Management
The timeout parameters are generally set to pretty good defaults, but you might have to adjust them
for unusual applications. The default value of connect_timeout is 3.5 seconds. This value is more
than enough when the Varnish server and the backend are in the same server room. Consider
increasing the connect_timeout value if your Varnish server and backend have a higher network
latency between them.
Keep in mind that the session timeout affects how long sessions are kept around, which in turn
affects file descriptors left open. It is not wise to increase the session timeout without taking this into
consideration.
The cli_timeout is how long the management thread waits for the worker thread to reply before
it assumes it is dead, kills it and starts it back up. The default value seems to do the trick for most
users today.
Warning
If connect_timeout is set too high, it does not let Varnish handle errors gracefully.
Note
Another use-case for increasing connect_timeout occurs when virtual machines are
involved, as they can increase the connection time significantly.
Tip
More information is available at
https://www.varnish-software.com/blog/understanding-timeouts-varnish-cache.
Tip
You may need to enable the cgi module in Apache. One way to do that is by issuing the
commands a2enmod cgi and then service apache2 restart.
Note
It's not common to modify thread_pool_stack, thread_pool_add_delay or
thread_pool_timeout. These assignments are for educational purposes, and are not intended
as an encouragement to change the values.
Chapter 6 HTTP
6 HTTP
This chapter is for the webdeveloper course only
This chapter covers:
Protocol basics
Requests and responses
HTTP request/response control flow
Statelessness and idempotence
Cache related headers
HTTP is at the heart of Varnish, or rather the model HTTP represents.
This chapter will cover the basics of HTTP as a protocol, how it's used in the wild, and delve into
caching as it applies to HTTP.
The latest version of the specification (HTTP/1.1) is available from the IETF.
6.2 Requests
Standard request methods are: GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE, or
CONNECT.
This is followed by a URI, e.g: /img/image.png or /index.html
Usually followed by the HTTP version
A new-line (CRLF), followed by an arbitrary amount of CRLF-separated headers
(Accept-Language, Cookie, Host, User-Agent, etc).
6.4 Response
HTTP/1.1 200 OK
Cache-Control: max-age=150
Content-Length: 150
[data]
An HTTP response contains the HTTP version, a status code (e.g: 200) and a reason (e.g: OK)
CRLF as line separator
A number of headers
Headers are terminated with a blank line
Optional response body
The HTTP response is similar to the request itself. The response code informs the browser both
whether the request succeeded and what type of response this is. The response message is a
text-representation of the same information, and is often ignored by the browser itself.
Examples of status codes are 200 OK, 404 File Not Found, 304 Not Modified and so forth. They are all
defined in the HTTP standard, and grouped into the following categories:
1xx: Informational - Request received, continuing process
2xx: Success - The action was successfully received, understood, and accepted
3xx: Redirection - Further action must be taken in order to complete the request
4xx: Client Error - The request contains bad syntax or cannot be fulfilled
5xx: Server Error - The server failed to fulfill an apparently valid request
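The grouping by the first digit of the status code can be illustrated with a small Python helper (the helper name is my own, not part of any HTTP library):

```python
CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_category(code):
    """Map an HTTP status code to its category by its first digit."""
    return CATEGORIES[code // 100]

print(status_category(200))  # Success
print(status_category(304))  # Redirection
print(status_category(404))  # Client Error
```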
Note
The main difference between a request and a response (besides the semantics) is the start
line. Both share the same syntax for headers and the body, but some headers are request- or
response-specific.
6.10 Expires
The Expires response header field gives the date/time after which the response is considered stale.
A stale cache item will not be returned by any cache (proxy cache or client cache).
The syntax for this header is:
Expires: GMT formatted date
It is recommended not to define Expires too far in the future; setting it to one year is usually enough.
Using Expires does not prevent the cached resource from being updated: if a resource is updated,
you can change its name (by using a version number, for instance) so that clients fetch the new copy.
Expires works best for files that are part of your design, like JavaScript files, stylesheets, or images.
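For illustration, the required GMT date format can be produced with Python's standard library. This is a sketch of what a backend application might do; Varnish does not generate Expires for you:

```python
from email.utils import formatdate
import time

# Expires one year from now, in the GMT (RFC 1123) format HTTP requires,
# e.g. "Expires: Wed, 01 Sep 2004 13:24:52 GMT"
expires = formatdate(time.time() + 365 * 24 * 3600, usegmt=True)
print("Expires:", expires)
```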
6.11 Cache-Control
The Cache-Control header field specifies directives that must be applied by all caching mechanisms
(from proxy cache to browser cache). Cache-Control accepts the following arguments (only the most
relevant are described):
public: The response may be cached by any cache.
no-store: The response body must not be stored by any cache mechanism;
no-cache: Authorizes a cache mechanism to store the response in its cache but it must not
reuse it without validating it with the origin server first. In order to avoid any confusion with this
argument think of it as a "store-but-do-no-serve-from-cache-without-revalidation" instruction.
max-age: Specifies the period in seconds during which the cache will be considered fresh;
s-maxage: Like max-age but it applies only to public caches;
must-revalidate: Indicates that a stale cache item can not be serviced without revalidation with
the origin server first;
Table 12: Cache-Control arguments for each context

Argument          Request  Response
no-cache          X        X
no-store          X        X
max-age           X        X
s-maxage                   X
max-stale         X
min-fresh         X
no-transform      X        X
only-if-cached    X
public                     X
private                    X
must-revalidate            X
proxy-revalidate           X
Unlike Expires, Cache-Control is both a request and a response header. Table 12 summarizes the
arguments you may use for each context.
Example of a Cache-Control header:
Cache-Control: public, must-revalidate, max-age=2592000
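Cache-Control is a comma-separated list of directives, some carrying an =value. A minimal illustrative parser (a sketch, not the parser Varnish uses internally):

```python
def parse_cache_control(value):
    """Split a Cache-Control header into a {directive: value-or-None} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if part:
            name, _, arg = part.partition("=")
            directives[name.lower()] = arg or None
    return directives

cc = parse_cache_control("public, must-revalidate, max-age=2592000")
print(cc)  # {'public': None, 'must-revalidate': None, 'max-age': '2592000'}
```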
Note
As you might have noticed, Expires and Cache-Control do more or less the same job, but
Cache-Control gives you more control. There is a significant difference between these
two headers:
Cache-Control uses relative times in seconds, cf. (s)max-age
Expires always returns an absolute date
Note
Cache-Control always overrides Expires.
Note
By default, Varnish does not care about the Cache-Control request header. If you want to let
users update the cache via a force refresh you need to do it yourself.
6.12 Last-Modified
The Last-Modified response header field indicates the date and time at which the origin server
believes the variant was last modified. This header may be used in conjunction with
If-Modified-Since and If-None-Match.
Example of a Last-Modified header:
Last-Modified: Wed, 01 Sep 2004 13:24:52 GMT
6.13 If-Modified-Since
The If-Modified-Since request header field is used with a method to make it conditional:
if the requested variant has not been modified since the time specified in this field, an entity
will not be returned from the server;
instead, a 304 (not modified) response will be returned without any message-body.
Example of an If-Modified-Since header:
If-Modified-Since: Wed, 01 Sep 2004 13:24:52 GMT
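The server-side decision can be sketched in Python using stdlib date parsing. This illustrates the protocol logic only; the helper name is my own:

```python
from email.utils import parsedate_to_datetime

def not_modified(if_modified_since, last_modified):
    """True when a 304 without a message-body should be sent instead of the entity."""
    return (parsedate_to_datetime(last_modified)
            <= parsedate_to_datetime(if_modified_since))

ims = "Wed, 01 Sep 2004 13:24:52 GMT"
print(not_modified(ims, "Wed, 01 Sep 2004 13:24:52 GMT"))  # True  -> send 304
print(not_modified(ims, "Thu, 02 Sep 2004 09:00:00 GMT"))  # False -> send full response
```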
6.14 If-None-Match
The If-None-Match request header field is used with a method to make it conditional.
A client that has one or more entities previously obtained from the resource can verify that none of
those entities is current by including a list of their associated entity tags in the If-None-Match header
field.
The purpose of this feature is to allow efficient updates of cached information with a minimum
amount of transaction overhead. It is also used to prevent a method (e.g. PUT) from inadvertently
modifying an existing resource when the client believes that the resource does not exist.
Example of an If-None-Match header:
If-None-Match: "1edec-3e3073913b100"
6.15 Etag
The ETag response header field provides the current value of the entity tag for the requested
variant. The idea behind Etag is to provide a unique value for a resource's contents.
Example of an Etag header:
Etag: "1edec-3e3073913b100"
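A common backend scheme is to hash the entity body. The sketch below is illustrative (Varnish does not generate ETags, the origin server does), paired with the If-None-Match comparison from the previous section:

```python
import hashlib

def make_etag(body):
    """Derive a strong ETag from the entity body (one possible scheme)."""
    return '"%s"' % hashlib.sha1(body).hexdigest()[:16]

def etag_matches(if_none_match, etag):
    """True when the client's If-None-Match list contains the current ETag."""
    candidates = [t.strip() for t in if_none_match.split(",")]
    return "*" in candidates or etag in candidates

etag = make_etag(b"article body")
print(etag_matches(etag, etag))           # True  -> 304 Not Modified
print(etag_matches('"stale-tag"', etag))  # False -> full 200 response
```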
6.16 Pragma
The Pragma request header is a legacy header and should no longer be used. Some applications still
send headers like Pragma: no-cache, but only for backwards compatibility.
Any proxy cache should treat Pragma: no-cache as Cache-Control: no-cache. Pragma should not
be seen as a reliable header, especially when used as a response header.
6.17 Vary
The Vary response header indicates the response returned by the origin server may vary depending
on headers received in the request.
The most common usage of Vary is to use Vary: Accept-Encoding, which tells caches (Varnish
included) that the content might look different depending on the Accept-Encoding-header the client
sends. In other words: The page can be delivered compressed or uncompressed depending on the
client.
The Vary-header is one of the trickiest headers to deal with for a cache. A cache, like Varnish, does
not necessarily understand the semantics of a header, or what part triggers different variants of a
page.
As a result, using Vary: User-Agent, for instance, tells a cache that for ANY change in the
User-Agent header, the content might look different. Since there are probably thousands of
User-Agent strings out there, this drastically reduces the efficiency of any cache
method.
Another example is Vary: Cookie, which is actually not a bad idea. Unfortunately, you can't
issue Vary: Cookie(but only THESE cookies: ...). And since a client will send you a great
deal of cookies, just using Vary: Cookie is not necessarily sufficient. We will
discuss this further in the Content Composition chapter.
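Conceptually, a cache keyed on Vary stores one variant per combination of the listed request headers. A minimal sketch of that secondary key (not Varnish's actual implementation):

```python
def variant_key(vary_header, request_headers):
    """Build a secondary cache key from the headers named in Vary."""
    names = sorted(h.strip().lower() for h in vary_header.split(","))
    return tuple((n, request_headers.get(n, "")) for n in names)

gzip_key = variant_key("Accept-Encoding", {"accept-encoding": "gzip"})
br_key = variant_key("Accept-Encoding", {"accept-encoding": "br"})
print(gzip_key != br_key)  # True: each encoding is stored as a separate variant
```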
Note
Varnish can handle Accept-Encoding and Vary: Accept-Encoding, because Varnish has
support for gzip compression.
6.18 Age
A cache server can send an additional response header, Age, to indicate the age of the
response.
Varnish (and other caches) does this.
Browsers (and Varnish) will use the Age-header to determine how long to cache.
E.g: for a max-age-based equation: cache duration = max-age - Age
If you allow Varnish to cache for a long time, the Age-header could effectively disallow
client-side caches.
Consider what happens if you let Varnish cache content for a week, because you can easily invalidate
the cache Varnish keeps. If you do not change the Age header, Varnish will happily inform clients
that the content is, for example, two days old, and that the maximum age should be no more than
fifteen minutes.
Browsers will obey this. They will use the reply, but they will also realize that it has exceeded its
max-age, so they will not cache it.
Varnish will do the same if your web server emits an Age header (or if you put one Varnish server
in front of another).
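The "cache duration = max-age - Age" arithmetic above can be sketched as follows (the helper name is my own):

```python
def remaining_freshness(max_age, age):
    """Seconds a downstream cache may still consider the response fresh."""
    return max(max_age - age, 0)

# Origin said max-age=900 (fifteen minutes), but Varnish reports the object
# as two days old: browsers will use the reply, yet not cache it.
print(remaining_freshness(900, 2 * 24 * 3600))  # 0
print(remaining_freshness(900, 300))            # 600 seconds of freshness left
```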
We will see in later chapters how we can handle this in Varnish.
Header             Request  Response
Expires                     X
Cache-Control      X        X
Last-Modified               X
If-Modified-Since  X
If-None-Match      X
Etag                        X
Pragma             X
Vary                        X
Age                         X
1. Modify article.php to send an Age header that says 30.
2. Watch varnishlog.
3. Send a request to Varnish for article.php. See what Age-Header Varnish replies with.
4. Is the Age-header an accurate method to determine if Varnish made a cache hit or not?
5. How long does Varnish cache the reply? How long would a browser cache it?
Also consider how you would avoid issues like this to begin with. We do not yet know how to modify
Varnish's response headers, but hopefully you will understand why you may need to do that.
Varnish is not the only part of your web stack that parses and honors cache-related headers. The
primary consumers of such headers are web browsers, and there might also be other caches
along the way which you do not control, like a company-wide proxy server.
By using s-maxage instead of max-age we limit the number of consumers to cache servers, but even
s-maxage will be used by caching proxies which you do not control.
In the next few chapters, you will learn how to modify the response headers Varnish sends. That
way, your web-server can emit response headers that are only seen and used by Varnish.
7 VCL Basics
The Varnish Configuration Language (VCL) is a domain-specific language
VCL as a state machine
VCL subroutines
Built-in subroutines
Available functions, legal return actions and variables.
The Varnish Configuration Language (VCL) is a domain-specific language designed to describe
request handling and document caching policies for Varnish Cache. When a new configuration is
loaded, the varnishd manager process translates the VCL code to C and compiles it to a shared
object. This shared object is then loaded into the cacher process.
This chapter focuses on the most important tasks to write effective VCL code. For this, you will learn
the basic syntax of VCL, and the most important VCL built-in subroutines: vcl_recv and
vcl_backend_fetch. All other built-in subroutines are taught in the next chapter.
Tip
Remember that Varnish has many reference manuals. For more details about VCL, check its
manual page by issuing man vcl.
VCL is also often described as a finite state machine. Each state makes certain parameters available
to your VCL code. For example, backend response HTTP headers are only available after the
vcl_backend_fetch state.
Figure 21 shows a simplified version of the Varnish finite state machine. This version shows by no
means all possible transitions, but only a traditional set of them. Figure 22 shows a detailed version
of the state machine for the frontend worker as a request flow diagram. A detailed version of the
request flow diagram for the backend worker is in the VCL vcl_backend_fetch and
vcl_backend_response section.
States in VCL are conceptualized as subroutines, with the exception of the waiting state, which is
described in the Waiting State section.
Subroutines in VCL take neither arguments nor return values. Each subroutine terminates by calling
return (action), where action is a keyword that indicates the desired outcome. Subroutines
may inspect and manipulate HTTP headers and various other aspects of each request. Subroutines
instruct how requests are handled.
Subroutine example:
sub pipe_if_local {
    if (client.ip ~ local) {
        return (pipe);
    }
}
To call a subroutine, use the call instruction followed by the subroutine's name:
call pipe_if_local;
Varnish has built-in subroutines that hook into the Varnish workflow. These built-in subroutines
are all named vcl_*. Your own subroutines cannot start their name with vcl_.
7.2 Detailed Varnish Request Flow for the Client Worker Thread
Figure 22: Detailed Varnish Request Flow for the Client Worker Thread
Figure 22 shows the detailed request flow diagram of the client worker. This diagram details the
grayed box in Figure 21.
Warning
If you define your own subroutine and call it from one of the built-in subroutines, executing
return(foo) does not return execution from your custom subroutine to the default
function, but returns execution from VCL to Varnish.
Function   Subroutines where available
hash_data  vcl_hash
new        vcl_init
synthetic  vcl_synth, vcl_backend_error
VCL offers a handful of simple-to-use built-in functions that allow you to modify strings, add bans,
restart the VCL state engine, and return control from the VCL Run Time (VRT) environment to Varnish.
This book describes the most important functions in later sections, so the description at this point is
brief.
regsub() and regsuball() have the same syntax and do almost the same thing: both
take a string str as input, search it with a regular expression regex, and replace the matches with
another string. The difference between regsub() and regsuball() is that the latter changes all
occurrences while the former only affects the first match.
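The distinction mirrors Python's re.sub with and without a count limit; an analogy for illustration, not VCL code:

```python
import re

s = "/a/b/a/b"
first_only = re.sub("a", "X", s, count=1)  # like regsub(): first match only
every = re.sub("a", "X", s)                # like regsuball(): all matches
print(first_only)  # /X/b/a/b
print(every)       # /X/b/X/b
```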
The ban() function invalidates all objects in cache that match the expression exp with the ban
mechanism. Banning and purging are detailed in the Cache Invalidation chapter.
Table 15: VCL built-in subroutines and their legal returns at the client side

subroutine   scope   deliver  fetch  restart  hash  pass  pipe  synth  purge  lookup
vcl_recv     client                           x     x     x     x      x
vcl_pipe     client                                       x     x
vcl_pass     client           x      x                          x
vcl_hash     client                                                           x
vcl_purge    client                  x                          x
vcl_miss     client           x      x              x           x
vcl_hit      client  x        x      x              x           x
vcl_deliver  client  x               x                          x
vcl_synth    client  x               x
Table 16: VCL built-in subroutines and their legal returns at the backend side, vcl.load, and
vcl.discard

subroutine            scope        deliver  fetch  abandon  retry  ok  fail
vcl_backend_fetch     backend               x      x
vcl_backend_response  backend      x               x        x
vcl_backend_error     backend      x                        x
vcl_init              vcl.load                                     x   x
vcl_fini              vcl.discard                                  x
The table above shows the VCL built-in subroutines and their legal returns. return() is a built-in
function that ends execution of the current VCL subroutine and continues to the next action step in
the request handling state machine. Legal return actions are: lookup, synth, purge, pass, pipe, fetch,
deliver, hash, restart, retry, and abandon.
Note
Varnish 4 defines purge as a return action. This is contrary to Varnish 3, where purge is a
function.
Table 17: Variable availability in the different VCL states

State                 req.  bereq.  beresp.  obj.  resp.
vcl_recv              R/W
vcl_pipe              R/W   R/W
vcl_pass              R/W
vcl_hash              R/W
vcl_purge             R/W
vcl_miss              R/W
vcl_hit               R/W                    R
vcl_deliver           R/W                    R     R/W
vcl_synth             R/W                          R/W
vcl_backend_fetch           R/W
vcl_backend_response        R/W     R/W
vcl_backend_error           R/W     R/W
The State column lists the different states in a request work-flow. States are handled by subroutines,
which have a leading vcl_ prefix name.
The Variables columns list the prefix of the available variables. Most but not all of the prefixed
variables are readable (R) or writable (W) at the given state. To have a detailed availability of each
variable, refer to the VCL man page by typing: man vcl.
Table 17 shows the availability of variables in different states of the Varnish finite state machine. In
addition to the variable prefixes in Table 17, there are three other variable prefixes: client.*,
server.*, and storage.*, which are accessible from all subroutines at the frontend (client) side.
Another variable is now, which is accessible from all subroutines.
These additional prefixes and the now variable are practically accessible everywhere.
Remember that changes made to beresp. variables are stored in obj. afterwards. The resp.
variables are copies of what is about to be returned to the client; their values possibly come
from obj..
A change to beresp. variables, in other words, affects the obj. and resp. variables. Similar
semantics apply to the req. and bereq. variables.
Variables belonging to the backend request (bereq.) are assigned with values from the original
request (req.). However, their values may slightly differ, because Varnish may modify HTTP requests
methods. For example, client requests with HEAD methods may be converted to backend requests
with GET methods.
Many but not all of the variables are self-explanatory. To get more information about a particular
variable, consult the VCL man page or ask the instructor at the course.
Note
It is strongly advised to let the built-in subroutines run whenever possible. The built-in
subroutines are designed with safety in mind, which often means that they handle any flaws
in your VCL code in a reasonable manner.
Tip
Looking at the code of the built-in subroutines can help you understand how to build your own
VCL code. The built-in subroutines are in the file
/usr/share/doc/varnish/examples/builtin.vcl.gz or
{varnish-source-code}/bin/varnishd/builtin.vcl. The first location may change
depending on your distro.
Tip
The built-in vcl_recv subroutine may not cache everything you want, but it is often better not
to cache some content than to deliver the wrong content to the wrong user. There are
exceptions, of course, but if you cannot understand why the default VCL does not let you
cache some content, it is almost always worth investigating why instead of overriding it.
This simple VCL will create a request header called X-Device which will contain either mobile or
desktop. The web server can then use this header to determine what page to serve, and inform
Varnish about it through Vary: X-Device.
It might be tempting to just send Vary: User-Agent, but that requires you to normalize the
User-Agent header itself because there are many tiny variations in the description of similar
User-Agents. This normalization, however, loses detailed information about the browser. If you
pass the User-Agent header without normalization, the cache size may drastically inflate, because
Varnish would keep possibly hundreds of different variants per object, one per tiny User-Agent
variant. For more information on the Vary HTTP response header, see the Vary section.
Note
If you do use Vary: X-Device, you might want to send Vary: User-Agent to the users
after Varnish has used it. Otherwise, intermediary caches will not know that the page looks
different for different devices.
In point 2, change req.http.host by calling the function regsub(str, regex, sub). str is the
input string, in this case req.http.host. regex is the regular expression matching whatever
content you need to change: use ^ to match what begins with www, and \. to finish the
regular expression, i.e. ^www\.. sub is what you desire to change it with; an empty string "" can be
used to remove what matches regex.
In point 3, you can check for host headers with a specific domain name:
if (req.http.host == "sport.example.com"). An alternative is to check for all hosts that start
with sport, regardless of the domain name: if (req.http.host ~ "^sport\."). In the first case,
setting the host header is straightforward: set req.http.host = "example.com". In the second
case, you can set the host header by removing the string that precedes the domain name:
set req.http.host = regsub(req.http.host, "^sport\.", "");. Finally, you rewrite the
URL in this way: set req.url = regsub(req.url, "^", "/sport");.
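Before committing such rewrites to VCL, you can prototype them with Python regexes, since regsub() corresponds to re.sub() with count=1. The helper below is a hypothetical analogy of the rewrites described above, not Varnish code:

```python
import re

def normalize(host, url):
    """Mirror the VCL host/URL rewrites described in the text."""
    if re.match(r"^sport\.", host):
        host = re.sub(r"^sport\.", "", host, count=1)
        url = re.sub(r"^", "/sport", url, count=1)
    host = re.sub(r"^www\.", "", host, count=1)
    return host, url

print(normalize("sport.example.com", "/article1.html"))
# ('example.com', '/sport/article1.html')
print(normalize("www.example.com", "/index.html"))
# ('example.com', '/index.html')
```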
To simulate client requests, you can issue the following command:
http -p hH --proxy=http:http://localhost sport.example.com/article1.html
To verify your result, you can issue the following command:
varnishlog -i ReqHeader,ReqURL
Tip
Remember that man vcl contains a reference manual with the syntax and details of
functions such as regsub(str, regex, sub). We recommend that you leave the default VCL
file untouched and create a new file for your VCL code. Remember to update the location of
the VCL file in the Varnish configuration file, and restart Varnish.
8.2.1 hit-for-pass
Used when an object should not be cached
hit-for-pass object instead of fetched object
Has TTL
Some requested objects should not be cached. A typical example is when a requested page contains
a Set-Cookie response header, and therefore it must be delivered only to the client that requests
it. In this case, you can tell Varnish to create a hit-for-pass object and store it in the cache, instead of
storing the fetched object. Subsequent requests are processed in pass mode.
When an object should not be cached, the beresp.uncacheable variable is set to true. As a result,
the cacher process keeps a hash reference to the hit-for-pass object. In this way, the lookup operation
for requests translating to that hash finds a hit-for-pass object. Such requests are handed over to the
vcl_pass subroutine and proceed in pass mode.
Like any other cached object, hit-for-pass objects have a TTL. Once the object's TTL has elapsed, the
object is removed from the cache.
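The lifecycle can be illustrated with a toy cache that stores a marker instead of the fetched object. This is purely a sketch of the concept, not Varnish internals:

```python
import time

class ToyCache:
    """Stores real objects or hit-for-pass markers, both with a TTL."""
    def __init__(self):
        self.store = {}

    def insert(self, key, obj, ttl, uncacheable=False):
        # When beresp.uncacheable is true, keep only a hit-for-pass marker.
        value = "hit-for-pass" if uncacheable else obj
        self.store[key] = (value, time.time() + ttl)

    def lookup(self, key):
        entry = self.store.get(key)
        if entry is None or entry[1] < time.time():
            return None        # miss, or the marker/object has expired
        return entry[0]        # an object, or "hit-for-pass" -> pass mode

cache = ToyCache()
cache.insert("/private", None, ttl=120, uncacheable=True)
print(cache.lookup("/private"))  # hit-for-pass
print(cache.lookup("/other"))    # None (plain miss)
```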
Figure 23: Varnish Request Flow for the Backend Worker Thread.
Figure 23 shows the vcl_backend_fetch, vcl_backend_response and vcl_backend_error
subroutines. These subroutines are the backend-counterparts to vcl_recv. You can use data
provided by the client in vcl_recv or even vcl_backend_fetch to decide on caching policy. An
important difference is that you have access to bereq.* variables in vcl_backend_fetch.
vcl_backend_fetch can be called from vcl_miss or vcl_pass. When vcl_backend_fetch is
called from vcl_miss, the fetched object may be cached. If vcl_backend_fetch is called from
vcl_pass, the fetched object is not cached even if obj.ttl or obj.keep variables are greater
than zero.
A relevant variable is bereq.uncacheable. This variable indicates whether the object requested
from the backend may be cached. However, objects from pass requests are never cached,
regardless of the bereq.uncacheable variable.
vcl_backend_fetch has two possible terminating actions, fetch or abandon. The fetch action sends
the request to the backend, whereas the abandon action calls the vcl_synth routine. The built-in
vcl_backend_fetch subroutine simply returns the fetch action. The backend response is
processed by vcl_backend_response or vcl_backend_error.
Figure 24 shows that vcl_backend_response may terminate with one of the following actions:
deliver, abandon, or retry. The deliver terminating action may or may not insert the object into the
cache depending on the response of the backend.
Backends might respond with a 304 HTTP response. 304 responses happen when the requested
object has not been modified since the timestamp If-Modified-Since in the HTTP header. If the
request hits a non-fresh object (see Figure 2), Varnish adds the If-Modified-Since header with
the value of t_origin to the request and sends it to the backend.
304 responses do not contain a message-body. Thus, Varnish tries to steal the body from cache,
merge it with the header response and deliver it. This process updates the attributes of the cached
object.
Typical tasks performed in vcl_backend_fetch or vcl_backend_response include:
Overriding cache time for certain URLs
Stripping Set-Cookie headers that are not needed
Stripping bugged Vary headers
Adding helper-headers to the object for use in banning (more information in later sections)
Applying other caching policies
8.3.1 vcl_backend_response
built-in vcl_backend_response
sub vcl_backend_response {
if (beresp.ttl <= 0s ||
beresp.http.Set-Cookie ||
beresp.http.Surrogate-control ~ "no-store" ||
(!beresp.http.Surrogate-Control &&
beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
beresp.http.Vary == "*") {
/*
* Mark as "Hit-For-Pass" for the next 2 minutes
*/
set beresp.ttl = 120s;
set beresp.uncacheable = true;
}
return (deliver);
}
The vcl_backend_response built-in subroutine is designed to avoid caching in conditions that are
most probably undesired. For example, it avoids caching responses with cookies, i.e., responses with
the Set-Cookie HTTP header field. This built-in subroutine also avoids the request serialization described in
the Waiting State section.
To avoid request serialization, beresp.uncacheable is set to true, which in turn creates a
hit-for-pass object. The hit-for-pass section explains in detail this object type.
If you still decide to skip the built-in vcl_backend_response subroutine by having your own and
returning deliver, be sure to never set beresp.ttl to 0. If you skip the built-in subroutine and
set 0 as TTL value, you are effectively removing objects from cache that could eventually be used to
avoid request serialization.
Note
Varnish 3.x has a hit_for_pass return action. In Varnish 4, this action is achieved by setting
beresp.uncacheable to true. The hit-for-pass section explains this in more detail.
Warning
Bear in mind that removing or altering the Age response header field may affect how
responses are handled downstream. The impact of removing the Age field depends on the
HTTP implementation of downstream intermediaries or clients.
For example, imagine that you have a three Varnish-server serial setup. If you remove the
Age field in the first Varnish server, then the second Varnish server will assume Age=0. In this
case, you might inadvertently be delivering stale objects to your client.
8.3.4 Example: Cache .jpg for 60 seconds only if s-maxage is not present
sub vcl_backend_response {
if (beresp.http.cache-control !~ "s-maxage" && bereq.url ~ "\.jpg$") {
set beresp.ttl = 60s;
}
}
The purpose of the above example is to allow a gradual migration to using a backend-controlled
caching policy. If the backend does not supply s-maxage, and the URL is a jpg file, then Varnish sets
beresp.ttl to 60 seconds.
The Cache-Control response header field can contain a number of directives. Varnish parses this
field and looks for s-maxage and max-age.
By default, Varnish sets beresp.ttl to the value of s-maxage if found. If s-maxage is not found,
Varnish uses the value max-age. If neither exists, Varnish uses the Expires response header field
to set the TTL. If none of those header fields exist, Varnish uses the default TTL, which is 120
seconds.
The default parsing and TTL assignment are done before vcl_backend_response is executed. The
TTL changing process is recorded in the TTL tag of varnishlog.
Tip
Divide and conquer! Most moderately complex VCL tasks are easily solved when you divide them
into smaller problems and solve each individually. Try solving each part of the exercise
by itself first.
Note
Varnish automatically parses s-maxage for you, so you only need to check if it is there or not.
Remember that if s-maxage is present, Varnish has already used it to set beresp.ttl.
Note
One cache hash may refer to one or many object variations. Object variations are created
based on the Vary header field. It is better practice to keep several variations under one
cache hash than to create one hash per variation.
Note
Note how you can use {" and "} to make multi-line strings. This is not limited to the
synthetic() function, but can be used anywhere.
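As a sketch of how {" and "} combine with the synthetic() function, the following hypothetical vcl_synth body builds a multi-line error page; the markup itself is just an example:

```vcl
sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    # {" ... "} spans multiple lines without any escaping
    synthetic({"<!DOCTYPE html>
<html>
  <body>
    <h1>Sorry, something went wrong</h1>
  </body>
</html>"});
    return (deliver);
}
```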
Note
A vcl_synth defined object is never stored in cache, contrary to a vcl_backend_error
defined object, which may end up in cache. vcl_synth and vcl_backend_error replace
vcl_error from Varnish 3.
Note
The 301 response can affect how browsers prioritize history and how search engines treat the
content. 302 responses are temporary and do not affect search engines as 301 responses do.
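A common way to emit such redirects from Varnish is to trigger vcl_synth with the chosen status code. This is only a sketch; the example.com host names are placeholders:

```vcl
sub vcl_recv {
    if (req.http.Host == "example.com") {
        # Permanently redirect the apex domain to www
        return (synth(301, "Moved Permanently"));
    }
}

sub vcl_synth {
    if (resp.status == 301) {
        set resp.http.Location = "http://www.example.com" + req.url;
        return (deliver);
    }
}
```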
9 Cache Invalidation
Cache invalidation is an important part of your cache policy
Varnish will automatically invalidate expired objects
You can however pro-actively invalidate objects with Varnish
You should define those rules before caching objects in production environments
There are four mechanisms to invalidate caches in Varnish:
1. HTTP PURGE
Use the vcl_purge subroutine
Invalidate caches explicitly using objects' hashes
vcl_purge is called via return(purge) from vcl_recv
vcl_purge removes all variants of an object from cache, freeing up memory
The restart return action can be used to immediately update a purged object
2. Banning
Use the built-in function ban(expression)
Invalidates objects in cache that match the regular-expression
Does not necessarily free up memory at once
Also accessible from the management interface
3. Force Cache Misses
Use req.hash_always_miss in vcl_recv
If set to true, Varnish disregards any existing objects and always (re)fetches from the
backend
May create multiple objects as side effect
Does not necessarily free up memory at once
4. Hashtwo -> Varnish Plus only!
For websites with the need for cache invalidation at a very large scale
Varnish Software's implementation of surrogate keys
Flexible cache invalidation based on cache tags
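Mechanism 3 can be sketched in VCL as follows; the REFRESH request method is a made-up convention for this example, not a standard HTTP method:

```vcl
sub vcl_recv {
    if (req.method == "REFRESH") {
        # Force a fetch from the backend, ignoring any cached copy
        set req.hash_always_miss = true;
        set req.method = "GET";
    }
}
```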
Note
Cache invalidation with purges is done via return (purge); from vcl_recv in Varnish 4.
The purge; keyword from Varnish 3 has been retired.
Tip
Remember to place your php files under /var/www/html/.
$curlOptionList = array(
    CURLOPT_RETURNTRANSFER    => true,
    CURLOPT_CUSTOMREQUEST     => 'PURGE',
    CURLOPT_HEADER            => true,
    CURLOPT_NOBODY            => true,
    CURLOPT_URL               => $finalURL,
    CURLOPT_CONNECTTIMEOUT_MS => 2000
);
$fd = false;
if( $debug == true ) {
print "\n---- Curl debug -----\n";
$fd = fopen("php://output", 'w+');
$curlOptionList[CURLOPT_VERBOSE] = true;
$curlOptionList[CURLOPT_STDERR] = $fd;
}
$curlHandler = curl_init();
curl_setopt_array( $curlHandler, $curlOptionList );
curl_exec( $curlHandler );
curl_close( $curlHandler );
if( $fd !== false ) {
fclose( $fd );
}
}
?>
solution-purge-from-backend.vcl
acl purgers {
"127.0.0.1";
}
sub vcl_recv {
if (req.method == "PURGE") {
if (!client.ip ~ purgers) {
return (synth(405, "Not allowed."));
}
return (purge);
}
}
Warning
Restarts are likely to cause a hit against the backend, so do not increase max_restarts
thoughtlessly.
9.3 Banning
Use ban to prevent Varnish from serving a cached object
Does not free up memory
Examples in the varnishadm command line interface:
ban req.url ~ /foo
ban req.http.host ~ example.com && obj.http.content-type ~ text
ban.list
Example in VCL:
ban("req.url ~ /foo");
Example of VCL code to catch the HTTP BAN request method:
sub vcl_recv {
if (req.method == "BAN") {
ban("req.http.host == " + req.http.host +
" && req.url == " + req.url);
# Throw a synthetic page so the request won't go to the backend.
return(synth(200, "Ban added"));
}
}
Banning in the context of Varnish refers to adding a ban expression that prohibits Varnish from
serving certain objects from the cache. Ban expressions are most useful when combined with
regular expressions. Bans work on objects already in the cache, i.e., they do not prevent new
content from entering the cache or being served. Cached objects that match a ban are marked as
obsolete. Obsolete objects are expunged by the expiry thread like any other object with
obj.ttl == 0.
Ban expressions match against req.* or obj.* variables. Think of a ban expression as: "the
requested URL starts with /sport", or "the cached object has a Server header matching lighttpd". You
can add ban expressions in VCL syntax via VCL code, an HTTP request method, or the CLI.
Ban expressions are inserted into a ban-list. The ban-list contains:
ID of the ban,
timestamp when the ban entered the ban list,
counter of objects that have matched the ban expression,
a completed flag (C) that indicates whether a ban has been invalidated, e.g., because it is duplicated,
the ban expression.
To inspect the current ban-list, issue the ban.list command in the CLI:
0xb75096d0 1318329475.377475    10     obj.http.x-url ~ test0
0xb7509610 1318329470.785875    20C    obj.http.x-url ~ test1
Varnish tests bans whenever a request hits a cached object. A cached object is checked only against
bans added after the last checked ban. That means that each object checks against a ban expression
only once.
Bans that match only against obj.* are also checked by a background worker thread called the ban
lurker. The parameter ban_lurker_sleep controls how often the ban lurker tests obj.* bans. The
ban lurker can be disabled by setting ban_lurker_sleep to 0.
Note
You may accumulate a lot of ban expressions based on req.* variables if you have many
objects with long TTLs that are seldom accessed. This accumulation occurs because bans are
kept until all cached objects have been checked against them. This might impact CPU usage
and thereby performance.
Therefore, we recommend avoiding req.* variables in your ban expressions, and using
obj.* variables instead. Ban expressions using only obj.* are called lurker-friendly bans.
Note
If the cache is completely empty, only the last added ban stays in the ban-list.
Tip
You can also execute ban expressions via the Varnish Administration Console (VAC).
Figure 24: Executing ban expressions via the Varnish Administration Console (VAC).
The following snippet shows an example on how to preserve the context of a client request in the
cached object:
sub vcl_backend_response {
set beresp.http.x-url = bereq.url;
}
sub vcl_deliver {
# The X-Url header is for internal use only
unset resp.http.x-url;
}
Now imagine that you just changed the blog post template. To invalidate all blog posts, you can then
issue a ban such as:
$ varnishadm ban 'obj.http.x-url ~ ^/blog'
Since it uses a lurker-friendly ban expression, the ban inserted in the ban-list will be gradually
evaluated against all cached objects until all blog posts are invalidated. The snippet below shows
how to insert the same expression into the ban-list in the vcl_recv subroutine:
sub vcl_recv {
if (req.method == "BAN") {
# Assumes the ``X-Ban`` header is a regex,
# this might be a bit too simple.
ban("obj.http.x-url ~ " + req.http.x-ban);
return(synth(200, "Ban added"));
}
}
Note
Forcing cache misses does not evict old content, which causes Varnish to keep
multiple copies of the content in cache. The newest copy is always used. If you cache your
content for a long period of time, memory usage increases gradually.
Warning
You should protect purges with ACLs from unauthorized hosts.
              Purge               Ban                 Hashtwo/            Force Cache
                                                      Surrogate keys      Misses

Targets       One specific        Objects matching    Objects tagged      One specific
              object (with all    a regular           with the key        object (with all
              its variants)       expression                              its variants)

Frees memory  Immediately         Not immediately     Immediately         No

Scalability   High                High                High                No. Memory
                                                                          usage increases
                                                                          because old
                                                                          objects are not
                                                                          invalidated.

Flexibility   Low                 High                High                Low

CLI           No                  Yes                 No                  No

VCL           Yes                 Yes                 Yes                 Yes

Availability  Varnish Cache       Varnish Cache       Varnish Plus        Varnish Cache
There is rarely a need to pick only one solution, as you can implement many of them. Some
guidelines for selection, though:
Any frequent automated or semi-automated cache invalidation most likely requires VCL changes
for the best effect.
If you need to invalidate more than one item at a time, consider using bans or hashtwo.
If it takes a long time to pull content from the backend into Varnish, consider using
req.hash_always_miss.
Note
Purge and Hashtwo work very similarly. The main difference is that they act on
different hash keys.
10 Saving a Request
This chapter is for the system administration course only
Table 19: Connotation of Saving a Request
Rescue
Directors
Health Checks
Grace Mode
Retry a Request
Economization
Protection
10.1 Directors
Loadable VMOD
Contains 1 or more backends
All backends must be known
Selection methods:
round-robin
fallback
random
seeded with a random number
seeded with a hash key
Round-robin director example:
vcl 4.0;
import directors;
backend one {
.host = "localhost";
.port = "80";
}
backend two {
.host = "127.0.0.1";
.port = "81";
}
sub vcl_init {
new round_robin_director = directors.round_robin();
round_robin_director.add_backend(one);
round_robin_director.add_backend(two);
new random_director = directors.random();
random_director.add_backend(one, 10); # 2/3 to backend one
random_director.add_backend(two, 5);  # 1/3 to backend two
}
sub vcl_recv {
set req.backend_hint = round_robin_director.backend();
}
Varnish can have several backends defined, and it can set them together into clusters for load
balancing purposes. Backend directors, usually just called directors, provide logical groupings of
similar web servers by re-using previously defined backends. A director must have a name.
There are several different director selection methods available, they are: random, round-robin,
fallback, and hash. The next backend to be selected depends on the selection method. The simplest
directors available are the round-robin and the random director.
A round-robin director takes only a backend list as argument. This director type picks the first
backend for the first request, then the second backend for the second request, and so on. Once the
last backend has been selected, selection starts again from the top. If a health probe has
marked a backend as sick, a round-robin director skips it.
A fallback director will always pick the first backend unless it is sick, in which case it would pick the
next backend and so on. A director is also considered a backend so you can actually stack directors.
You could for instance have directors for active and passive clusters, and put those directors behind
a fallback director.
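The stacking idea can be sketched like this, assuming backends b1, b2 and b3 are defined elsewhere and b3 stands in for the passive cluster:

```vcl
import directors;

sub vcl_init {
    new active = directors.round_robin();
    active.add_backend(b1);
    active.add_backend(b2);

    # A director is also a backend, so the active cluster
    # can be stacked behind a fallback director.
    new failover = directors.fallback();
    failover.add_backend(active.backend());
    failover.add_backend(b3);
}

sub vcl_recv {
    set req.backend_hint = failover.backend();
}
```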
Random directors are seeded with either a random number or a hash key. The next section explains
their commonalities and differences.
Note
Health probes are explained in the Health Checks section.
Note
Directors are defined as loadable VMODs in Varnish 4. Please see the vmod_directors man
page for more information.
Note
In Varnish 3 there is a client director type, which is removed in Varnish 4. This client director
type is a special case of the hash director. Therefore, the semantics of a client director type are
achieved using hash.backend(client.identity).
You can also declare standalone probes and reuse them for several backends. It is particularly useful
when you use directors with identical behaviors, or when you use the same health check procedure
across different web applications.
import directors;
probe www_probe {
.url = "/health";
}
backend www1 {
.host = "localhost";
.port = "8081";
.probe = www_probe;
}
backend www2 {
.host = "localhost";
.port = "8082";
.probe = www_probe;
}
sub vcl_init {
new www = directors.round_robin();
www.add_backend(www1);
www.add_backend(www2);
}
Note
Varnish does NOT send a Host header with health checks. If you need that, you can define an
entire request using .request instead of .url.
Note
The healthy function is implemented as VMOD in Varnish 4. req.backend.healthy from
Varnish 3 is replaced by std.healthy(req.backend_hint). Do not forget to include the
import line: import std;
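A minimal sketch of std.healthy in use, returning a synthetic error when the selected backend is sick:

```vcl
import std;

sub vcl_recv {
    if (!std.healthy(req.backend_hint)) {
        # No point forwarding the request to a sick backend
        return (synth(503, "Backend is sick"));
    }
}
```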
Note
obj.ttl and obj.grace are countdown timers. Objects are valid in cache as long as they
have a positive remaining time equal to obj.ttl + obj.grace.
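For example, a sketch that keeps objects 30 seconds past their TTL so stale copies can be served while a fresh copy is fetched in the background (the 30-second value is an arbitrary example):

```vcl
sub vcl_backend_response {
    # Object is served normally while obj.ttl > 0, and from
    # grace for up to 30s after the TTL has expired.
    set beresp.grace = 30s;
}
```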
5. Send a single request, this time via Varnish, to cache the response from the CGI script. This
should take 10 seconds.
6. Send three requests: one before the TTL (10 seconds) elapses, another after 10 seconds and
before 30 seconds, and a last one after 30 seconds.
7. Repeat until you understand the output of varnishlog.
8. Play with the values of max-age and stale-while-revalidate in the CGI script, and the
beresp.grace value in the VCL code.
With this exercise, you should see that as long as the cached object is within its TTL, Varnish delivers
the cached object as normal. Once the TTL expires, Varnish delivers the graced copy and
asynchronously fetches a fresh object from the backend. Therefore, about 10 seconds after the
asynchronous fetch is triggered, an updated object is available in the cache.
Note
In Varnish 3.0 it is possible to do return (restart) after the backend response failed. This
is now called return (retry), and jumps back up to vcl_backend_fetch.
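A sketch of return (retry) in Varnish 4, bounded by bereq.retries so a persistently failing backend does not loop forever; treating 503 as retry-worthy is an example policy, not a rule:

```vcl
sub vcl_backend_response {
    # Retry a failed fetch at most twice before giving up
    if (beresp.status == 503 && bereq.retries < 2) {
        return (retry);
    }
}
```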
Tip
Varnish only accepts hostnames for backend servers that resolve to a maximum of one IPv4
address and one IPv6 address. The parameter prefer_ipv6 defines which IP address
Varnish prefers.
11 Content Composition
This chapter is for the webdeveloper course only
This chapter teaches you how to glue content from independent sources into one web page.
Cookies and how to work with them
Edge Side Includes (ESI) and how to compose a single client-visible page out of multiple objects
Combining ESI and Cookies
AJAX and masquerading AJAX requests through Varnish
11.2 Cookies
Be careful when caching cookies!
Cookies are frequently used to identify unique users, or user-choices. They can be used for anything
from identifying a user-session in a web-shop to opting for a mobile version of a web page. Varnish
can handle cookies coming from three different sources:
req.http.Cookie header field from clients
bereq.http.Cookie header field sent to backends
beresp.http.Set-Cookie header field from servers
By default Varnish does not cache a page if the Cookie request header or the Set-Cookie response
header is present. This is for two main reasons: 1) to avoid littering the cache with a large number
of copies of the same content, and 2) to avoid delivering cookie-based content to the wrong client.
It is far better to cache multiple copies of the same content for each user, or to cache nothing at
all, than to cache personal, confidential or private content and deliver it to the wrong client. In
other words, the worst outcome is jeopardizing users' privacy to save backend resources. Therefore,
it is strongly advised to take your time to write a correct VCL program and test it thoroughly
before caching cookies in production deployments.
Despite cookie-based caching being discouraged, Varnish can be forced to cache content based on
cookies. If a client request contains req.http.Cookie, issue return (hash); in vcl_recv. If the
cookie is a Set-Cookie HTTP response header from the server, issue return (deliver); in
vcl_backend_response.
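A common compromise is to drop the Cookie request header only where you know it is irrelevant. The /static/ prefix below is an assumption about your URL layout, not something Varnish defines:

```vcl
sub vcl_recv {
    # Static assets never depend on cookies, so strip the
    # Cookie header to make these requests cacheable.
    if (req.url ~ "^/static/") {
        unset req.http.Cookie;
    }
}
```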
Never cache a Set-Cookie header. Either remove the header before caching or do not cache
the object at all.
Finish vcl_backend_response with something similar to:
if (beresp.ttl > 0s) {
unset beresp.http.Set-cookie;
}
This ensures that all cached pages are stripped of Set-cookie.
Note
Varnish outputs ESI parsing errors in varnishstat and varnishlog.
"Content-Type: text/plain"
""
"ESI content is cached for 30 seconds."
"Timestamp: "
"+%Y-%m-%d %H:%M:%S"
With AJAX it is not possible by default to send requests across another domain. This is a security
restriction imposed by browsers. If this represents an issue for your web pages, you can easily
solve it using Varnish and VCL.
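One way to sketch such masquerading in VCL is to rewrite a local /masq/ prefix into a request against the foreign host. The backend name crossdomain and the host name are placeholders for this sketch:

```vcl
backend crossdomain {
    .host = "www.example.com";
    .port = "80";
}

sub vcl_recv {
    if (req.url ~ "^/masq/") {
        # Send the request to the foreign site, so the browser
        # only ever talks to our own domain.
        set req.backend_hint = crossdomain;
        set req.http.Host = "www.example.com";
        set req.url = regsub(req.url, "^/masq/", "/");
        return (hash);
    }
}
```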
Figures 30 and 31 are screenshots of the VCS GUI. These screenshots are from the demo on
http://vcsdemo.varnish-software.com. Your instructor can provide you credentials to try the
demo online.
Note
For further details on VCS, please look at its own documentation at
https://www.varnish-software.com/resources/.
vcs-key       example.com           example.com
timestamp     2013-09-18T09:58:00   2013-09-18T09:58:30
n_req         84                    76
n_req_uniq    NaN                   NaN
n_miss        0                     1
avg_restart   0.000000              0.000000
n_bodybytes   12264                 10950
ttfb_miss     NaN                   0.000440
ttfb_hit      0.000048              0.000054
resp_1xx      0                     0
resp_2xx      84                    76
resp_3xx      0                     0
resp_4xx      0                     0
resp_5xx      0                     0
reqbytes
respbytes     32                    29
berespbytes   30                    27
bereqbytes
VCS uses the time-based tumbling windows technique to segment the data stream into finite parts.
These windows are created based on the vcs-key tag that you specify in your VCL code. Each
window aggregates the data within a configurable period of time.
Table 20 shows the data model in VCS. This table is basically a representation of two windows seen
as two records in a conventional database. In this example, the data shows two 30-second windows
based on the example.com vcs-key. For presentation purposes, the table is laid out as a
database that grows from left to right.
The VCS data model has the following fields:
vcs-key
common key name for transactions making this record
timestamp
Timestamp at the start of the window
n_req
Number of requests
n_req_uniq
Number of unique requests, if configured
n_miss
Number of backend requests (i.e., cache misses). The number of hits can be calculated as
n_hit = n_req - n_miss
avg_restart
Average number of VCL restarts triggered per request
n_bodybytes
Total number of bytes transferred for the response bodies
ttfb_miss
Average time to first byte for requests that ended up with a backend request
ttb_hit
Average time to first byte for requests that were served directly from varnish cache
resp_1xx -- resp_5xx
Counters for response status codes.
reqbytes
Number of bytes received from clients.
respbytes
Number of bytes transmitted to clients.
berespbytes
Number of bytes received from backends.
bereqbytes
Number of bytes transmitted to backends.
You can think of each window as a record of a traditional database that resides in memory. This
database is dynamic, since the engine of VCS updates it every time a new window (record) is
available. VCS provides an API to retrieve this data from the table above in JSON format:
{
"example.com": [
{
"timestamp": "2013-09-18T09:58:30",
"n_req": 76,
"n_req_uniq": "NaN",
"n_miss": 1,
"avg_restarts": 0.000000,
"n_bodybytes": 10950,
"ttfb_miss": 0.000440,
"ttfb_hit": 0.000054,
"resp_1xx": 0,
"resp_2xx": 76,
"resp_3xx": 0,
"resp_4xx": 0,
"resp_5xx": 0,
...
},
{
"timestamp": "2013-09-18T09:58:00",
"n_req": 84,
"n_req_uniq": "NaN",
"n_miss": 0,
"avg_restarts": 0.000000,
"n_bodybytes": 12264,
"ttfb_miss": "NaN",
"ttfb_hit": 0.000048,
"resp_1xx": 0,
"resp_2xx": 84,
"resp_3xx": 0,
"resp_4xx": 0,
"resp_5xx": 0,
...
},
...
]
}
/top_berespbytes
Sort based on number of bytes fetched from backends.
/top_bereqbytes
Sort based on number of bytes transmitted to backends.
/top_restarts
Sort based on the avg_restarts field.
/top_5xx, /top_4xx, ..., /top_1xx
Sort based on number of HTTP response codes returned to clients for 5xx, 4xx, 3xx, etc.
/top_uniq
Sort based on the n_req_uniq field.
Further, a /k parameter can be appended, which specifies the number of keys to include in the top
list. If no k value is provided, the top 10 is displayed.
Note
For more details on the API, please read the documentation of vstatd.
Varnish Plus allows you to improve your website security without having to rely on third-party
solutions. SSL/TLS support allows you to encrypt and secure communication on both the front and
backend as Varnish acts as both HTTP server and client. On the client side, the HTTP server
intercepts web requests before they reach a web server. The SSL/TLS support on this side enables
traffic encryption between the client and Varnish.
On the backend, the HTTP client fetches content missing in the cache from the web server. This
enables content to be fetched over the encrypted SSL/TLS, which particularly benefits customers
who run a fully encrypted data center or have web servers that reside in a different location to their
Varnish Plus servers.
13 Appendix A: Resources
Community driven:
https://www.varnish-cache.org
https://www.varnish-cache.org/docs/
http://repo.varnish-cache.org/
https://www.varnish-cache.org/trac/wiki/VCLExamples
Public mailing lists: https://www.varnish-cache.org/trac/wiki/MailingLists
Public IRC channel: #varnish at irc.linpro.no
Commercial:
https://www.varnish-software.com/resources/
http://planet.varnish-cache.org/
https://www.varnish-software.com
http://repo.varnish-software.com (for service agreement customers)
support@varnish-software.com (for existing customers, with SLA)
sales@varnish-software.com
14.1 varnishtop
$ varnishtop -i BereqURL,RespStatus

list length 5                                           trusty-amd64

     7.20 RespStatus          200
     5.26 RespStatus          404
     0.86 BereqURL            /test.html
     0.68 BereqURL            /
     0.39 BereqURL            /index.html
14.2 varnishncsa
10.10.0.1 - - [24/Aug/2008:03:46:48 +0100] "GET \
http://www.example.com/images/foo.png HTTP/1.1" 200 5330 \
"http://www.example.com/" "Mozilla/5.0"
If you already have tools in place to analyze NCSA Common log format, varnishncsa can be used
to print the SHMLOG in this format. varnishncsa dumps everything pointing to a certain domain
and its subdomains.
Filtering works similarly to varnishlog.
14.3 varnishhist
(varnishhist output: a live histogram headed "1:1, n = 71" for host localhost;
"|" markers for cache hits and "#" markers for misses are distributed along a
logarithmic time axis running from |1e-6 to |1e2 seconds)
The varnishhist utility reads the SHMLOG and presents a continuously updated histogram
showing the distribution of the last n requests. varnishhist is particularly useful to get an idea
about the performance of your Varnish Cache server and your backend.
The horizontal axis shows a time range from 1e-6 (1 microsecond) to 1e2 (100 seconds). This time
range shows the internal processing time of your Varnish Cache server and the time it takes to
receive a response from the backend. Thus, this axis does not show the time perceived at the client
side, because other factors such as network delay may affect the overall response time.
Hits are marked with a pipe character ("|"), and misses are marked with a hash character ("#"). These
markers are distributed according to the time taken to process the request. Therefore, distributions
with more markers on the left side represent a faster performance.
When the histogram grows vertically beyond what the terminal can display, the 1:m
rate in the top left corner changes, where m represents the number of requests each marker
stands for. In the top right corner, you can see the name of the host.
15.1 ajax.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript"
src="http://ajax.googleapis.com/ajax/libs/jquery/1.4/jquery.min.js">
</script>
<script type="text/javascript">
function getNonMasqueraded()
{
$("#result").load( "http://www.google.com/robots.txt" );
}
function getMasqueraded()
{
$("#result").load( "/masq/robots.txt" );
}
</script>
</head>
<body>
<h1>Cross-domain Ajax</h1>
<ul>
<li><a href="javascript:getNonMasqueraded();">
Test a non masqueraded cross-domain request
</a></li>
<li><a href="javascript:getMasqueraded();">
Test a masqueraded cross-domain request
</a></li>
</ul>
<h1>Result</h1>
<div id="result"></div>
</body>
</html>
15.2 article.php
<?php
header("Cache-Control: public, must-revalidate, max-age=3600, s-maxage=3600");
$date = new DateTime();
$now = $date->format( DateTime::RFC2822 );
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>This is an article, cached for 1 hour</h1>
<h2>Now is <?php echo $now; ?></h2>
<a href="<?=$_SERVER['PHP_SELF']?>">Refresh this page</a>
</body>
</html>
15.3 cookies.php
<?php
header( 'Content-Type: text/plain' );
print( "The following cookies have been received from the server\n" );
foreach( $_COOKIE as $name => $value )
print( "- ${name} : ${value}\n" );
?>
15.4 esi-top.php
<?php
header('Content-Type: text/html');
header('Cache-Control: max-age=30, s-maxage=3600');
$date = new DateTime();
$now = $date->format( DateTime::RFC2822 );
$setc = "";
if( isset($_POST['k']) and $_POST['k'] !== '' and
isset($_POST['v']) and $_POST['v'] !== '') {
$k=$_POST['k'];
$v=$_POST['v'];
$setc = "Set-Cookie: $k=$v";
header("$setc");
?><meta http-equiv="refresh" content="1" />
<h1>Refreshing to set cookie <?php print $setc; ?></h1><?php
}
?>
<html><head><title>ESI top page</title></head><body><h1>ESI Test page</h1>
<p>This is content on the top-page of the ESI page.
The top page is cached for 1 hour in Varnish,
but only 30 seconds on the client.</p>
<p>The time when the top-element was created:</p><h3>
<?php echo "$now"; ?>
<h1>Set a cookie:</h1><form action="/esi-top.php" method="POST">
Key: <input type="text" name="k">
Value: <input type="text" name="v">
<input type="submit"> </form>
</h3><p>The top page received the following Cookies:</p><ul>
<?php
foreach( $_COOKIE as $name => $value )
print( "<li>${name} : ${value}</li>\n" );
?>
<table border="1"><tr><td><esi:include src="/esi-user.php" /></td></tr>
</table></body></html>
15.5 esi-user.php
<?php
header('Content-Type: text/html');
header('Cache-Control: max-age=30, s-maxage=20');
header('Vary: Cookie');
$date = new DateTime();
$now = $date->format( DateTime::RFC2822 );
?>
<p>This is content on the user-specific ESI-include. This part of
the page can be cached in Varnish separately since it emits
a "Vary: Cookie"-header. We can not affect the client-cache of
this sub-page, since that is determined by the cache-control
headers on the top-element.</p>
<p>The time when the user-specific-element was created:</p><h3>
<?php echo "$now"; ?>
15.6 httpheadersexample.php
<?php
define( 'LAST_MODIFIED_STRING', 'Sat, 09 Sep 2000 22:00:00 GMT' );
// expires_date : 10s after page generation
$expires_date = new DateTime();
$expires_date->add(new DateInterval('PT10S'));
$headers = array(
'Date' => date( 'D, d M Y H:i:s', time() ),
);
if( isset( $_GET['h'] ) and $_GET['h'] !== '' )
{
switch( $_GET['h'] )
{
case "expires" :
$headers['Expires'] = toUTCDate($expires_date);
break;
case "cache-control":
$headers['Cache-Control'] = "public, must-revalidate,
max-age=3600, s-maxage=3600";
break;
case "cache-control-override":
$headers['Expires'] = toUTCDate($expires_date);
$headers['Cache-Control'] = "public, must-revalidate,
max-age=2, s-maxage=2";
break;
case "last-modified":
$headers['Last-Modified'] = LAST_MODIFIED_STRING;
$headers['Etag'] = md5( 12345 );
if( isset( $_SERVER['HTTP_IF_MODIFIED_SINCE'] ) and
$_SERVER['HTTP_IF_MODIFIED_SINCE'] ==
LAST_MODIFIED_STRING ) {
header( "HTTP/1.1 304 Not Modified" );
exit( );
}
break;
case "vary":
$headers['Expires'] = toUTCDate($expires_date);
$headers['Vary'] = 'User-Agent';
break;
}
sendHeaders( $headers );
}
function sendHeaders( array $headerList )
{
foreach( $headerList as $name => $value )
{
header( "${name}: ${value}" );
}
}
function toUTCDate( DateTime $date )
{
$date->setTimezone( new DateTimeZone( 'UTC' ) );
return $date->format( 'D, d M Y H:i:s \G\M\T' );
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>Headers sent</h1>
<?php
foreach( $headers as $name => $value ) {
print "<strong>${name}</strong>: ${value}<br/>";
}
if( isset( $_SERVER['HTTP_IF_MODIFIED_SINCE'] ) ) {
print "<strong>If-Modified-Since</strong> has been sent in the ";
print "request, value : " .
$_SERVER['HTTP_IF_MODIFIED_SINCE'];
}
?>
<hr/>
<h1>Links for testing</h1>
<ul>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=expires">
Test Expires response header</a></li>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=cache-control">
Test Cache-Control response header</a></li>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=cache-control-override">
Test Cache-Control response header overrides Expires</a></li>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=last-modified">
Test Last-Modified/If-modified-since response header</a></li>
<li><a href="<?=$_SERVER['PHP_SELF']?>?h=vary">
Test Vary response header</a></li>
</ul>
</body>
</html>
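The Last-Modified/If-Modified-Since case above can be exercised from the command line. A sketch using curl, assuming the script is served as httpheadersexample.php on localhost (adjust host and path to your setup):

```shell
# First request: note the Last-Modified header in the response.
curl -sI "http://localhost/httpheadersexample.php?h=last-modified"

# Conditional request: sending the same date back should yield
# a "304 Not Modified" response with no body.
curl -sI -H 'If-Modified-Since: Sat, 09 Sep 2000 22:00:00 GMT' \
    "http://localhost/httpheadersexample.php?h=last-modified"
```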
15.7 purgearticle.php
<?php
header( 'Content-Type: text/plain' );
header( 'Cache-Control: max-age=0' );
$hostname = 'localhost';
$port     = 80;
$URL      = '/article.php';
$debug    = true;
print "Updating the article in the database ...\n";
purgeURL( $hostname, $port, $URL, $debug );
function purgeURL( $hostname, $port, $purgeURL, $debug )
{
$finalURL = sprintf(
"http://%s:%d%s", $hostname, $port, $purgeURL
);
print( "Purging ${finalURL}\n" );
$curlOptionList = array(
    CURLOPT_RETURNTRANSFER    => true,
    CURLOPT_CUSTOMREQUEST     => 'PURGE',
    CURLOPT_HEADER            => true,
    CURLOPT_NOBODY            => true,
    CURLOPT_URL               => $finalURL,
    CURLOPT_CONNECTTIMEOUT_MS => 2000
);
$fd = false;
if( $debug == true ) {
print "\n---- Curl debug -----\n";
$fd = fopen("php://output", 'w+');
$curlOptionList[CURLOPT_VERBOSE] = true;
$curlOptionList[CURLOPT_STDERR] = $fd;
}
$curlHandler = curl_init();
curl_setopt_array( $curlHandler, $curlOptionList );
curl_exec( $curlHandler );
curl_close( $curlHandler );
if( $fd !== false ) {
fclose( $fd );
}
}
?>
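The same PURGE request can be issued directly with curl, which is a handy way to verify the purge handling in your VCL before wiring it into application code (hostname and path are placeholders, as above; Varnish only honors the PURGE method if vcl_recv returns purge for it):

```shell
# Send a PURGE request for /article.php through Varnish.
# -X overrides the request method; -v prints the exchange,
# mirroring the CURLOPT_VERBOSE debug output in the script.
curl -v -X PURGE http://localhost:80/article.php
```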
15.8 test.php
<?php
$cc = "";
if( isset($_GET['k']) and $_GET['k'] !== '' and
isset($_GET['v']) and $_GET['v'] !== '') {
$k=$_GET['k'];
$v=$_GET['v'];
$cc = "Cache-Control: $k=$v";
header("$cc");
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>Cache-Control Header:</h1>
<?php
print "<pre>$cc</pre>\n";
?>
<hr/>
<h1>Form for testing</h1>
<form action="/test.php" method="GET">
Key: <input type="text" name="k">
Value: <input type="text" name="v">
<input type="submit">
</form>
</body>
</html>
15.9 set-cookie.php
<?php
header("Cache-Control: max-age=0");
$setc = "";
if( isset($_POST['k']) and $_POST['k'] !== '' and
isset($_POST['v']) and $_POST['v'] !== '') {
$k=$_POST['k'];
$v=$_POST['v'];
$setc = "Set-Cookie: $k=$v";
header("$setc");
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head></head>
<body>
<h1>Set-Cookie Header:</h1>
<?php
print "<pre>$setc</pre>\n";
?>
<hr/>
<h1>Form for testing</h1>
<form action="/set-cookie.php" method="POST">
Key: <input type="text" name="k">
Value: <input type="text" name="v">
<input type="submit">
</form>
</body>
</html>
VSL
Varnish Shared memory Log -- The log written into the shared memory segment for
varnish{log,ncsa,top,hist} to see.
VSB
Varnish String Buffer -- a copy of the FreeBSD "sbuf" library, for safe string handling.
VSC
Varnish Statistics Counter -- counters for various stats, exposed via varnishapi.
VSS
Varnish Session Stuff -- library functions to wrap DNS/TCP. (lib/libvarnish/vss.c)
VTC
Varnish Test Code -- a test-specification for the varnishtest program.
VTLA
Varnish Three Letter Acronym -- No rule without an exception.
VUG
Varnish User Group meeting -- Half-yearly event where the users and developers of Varnish
Cache gather to share experiences and plan future development.
VWx
Varnish Waiter 'x' -- A code module to monitor idle sessions.
VWE
Varnish Waiter Epoll -- epoll(2) (linux) based waiter module.
VWK
Varnish Waiter Kqueue -- kqueue(2) (freebsd) based waiter module.
VWP
Varnish Waiter Poll -- poll(2) based waiter module.
VWS
Varnish Waiter Solaris -- Solaris ports(2) based waiter module.