webslap
webslap is our present-day update to Adam Twiss' ApacheBench utility. In a sincere attempt to pay homage to his work, webslap has been designed to be a drop-in equivalent of ApacheBench. Aside from multiple URL support (which of course Siege does as well), when used in TLS and/or automatically gzipping webserver environments, webslap is the only benchmarking tool that will keep up with common modern webserver configurations (without requiring an equally sized test client environment). The key differences show most when testing with TLS, cookies and gzip. webslap came into existence because the existing tools could not test large-scale web environments.
It goes without saying, but make sure your ulimit -n is of sufficient size beforehand. Some common distros' defaults are surprisingly low.
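As a rough sketch (not part of webslap itself), the same check that ulimit -n performs can be scripted before a test run. The helper name fd_limit_ok and the headroom of 64 descriptors are our own assumptions for illustration; each concurrency channel is assumed to need one descriptor.

```python
import resource

def fd_limit_ok(concurrency, headroom=64):
    """Return True if the soft open-file limit covers `concurrency` sockets
    plus `headroom` descriptors for stdio, output files and so on.
    The headroom figure is an arbitrary, hypothetical safety margin."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= concurrency + headroom

if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} enough for -c 1000: {fd_limit_ok(1000)}")
```

If the check fails, raise the limit in the shell (ulimit -n) before starting the benchmark.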
As with all of our products, webslap is bundled with the HeavyThing library itself. NOTE: compiling from source is not required; the compiled binary is included with the library.
Present-day features included in webslap that the other web infrastructure testing utilities are sorely missing:
- TLS 1.0-1.2 session caching
- gzip content encoding
- Session cookie handling
- ETag handling
- Last-Modified handling
- First URL - Then... capability
Download
A standalone executable binary of webslap can be downloaded from our Products page. Please note that you will have to chmod +x it after download.
As with all our products, webslap is bundled with the HeavyThing library itself. The download link for our library is in the top right of every page on our site, along with the SHA256 sum of the download itself. If you have downloaded the same version and your SHA256 does not match, it has been modified by parties other than ourselves.
Use case scenarios
Multi-stage deployment quality assurance
One of the beneficial use-case scenarios for webslap is accurately benchmarking deployment environments for ongoing quality assurance. In complex web application environments, it is all too common for a single developer mistake to go undetected and end up in a production environment with catastrophic results (read: downtime). Load testing a potentially complex trail of web requests to reasonably mimic real user behaviour has historically been difficult or impossible. Because webslap provides cookie handling along with a "first URL - then" capability, it is trivial to set up deployment test scenarios that can catch these mistakes well before they cause downtime.
Webserver configuration validation/testing
Because most performance-enhancing HTTP features are only implemented in browsers, many developers rely on single-session browser debug information to validate their webserver configuration and/or changes. For example, accurately measuring the timing and bandwidth effects of various gzip compression levels at the webserver level under load (versus a single browser debug session) is nigh on impossible. TLS session caching and its cascading effects are equally difficult to measure, especially when migrating to new cryptographic settings (bigger keys, different signature algorithms, etc). Thanks to webslap's support for gzip and TLS session caching, as well as ETag and Last-Modified, benchmarking results can be obtained that match real-world conditions much more closely than otherwise possible.
Topdown Service Level Monitoring
In complex web serving environments, HTTP requests often involve more than one node, instance or server. Performance monitoring scripts can be used of course, but webslap's "first URL - then" feature, combined with session cookies and the option to output results as JSON, means that you can simulate actual user logins and navigate real paths through your web application. By placing this setup on one or more remote virtual servers and regularly scheduling the webslap sampling, you can get very detailed performance data that is similar if not identical to what your real user traffic experiences.
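A minimal sketch of such a scheduled sample, assuming webslap sits in the current directory and exits with status 0 on success (that exit status is our assumption). The function names, the URLs in the usage note and the output path are hypothetical; only the command-line options come from the usage text on this page. The structure of the JSON is not assumed here, it is simply loaded and returned.

```python
import json
import subprocess

def build_command(urls, first_url, json_path, requests, concurrency,
                  binary="./webslap"):
    """Assemble a webslap argument list: log in via -first, walk the URL
    list in order, and write JSON results without the interactive UI."""
    cmd = [binary, "-n", str(requests), "-c", str(concurrency),
           "-first", first_url, "-json", json_path, "-ordered", "-noui"]
    return cmd + list(urls)

def sample(urls, first_url, json_path, requests=10, concurrency=1):
    """Run one sample and return the parsed JSON results untouched."""
    subprocess.run(
        build_command(urls, first_url, json_path, requests, concurrency),
        check=True,
    )
    with open(json_path) as fh:
        return json.load(fh)
```

Calling sample(["https://example.com/account", "https://example.com/cart"], "https://example.com/login", "/tmp/webslap.json") from cron or a similar scheduler on each remote server would then produce one result set per run.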
Usage and options
$ ./webslap
Usage: webslap [options] [POST:filename:contenttype:]http[s]://hostname[:port]/path[?query][#ref] [...]
Options are:
    -n requests      Number of requests to perform
    -c concurrency   Number of simultaneous channels
    -cpu count       Number of processes to use
    -first URL       Visit URL before commencing tests
    -g filename      Output TSV per-request data
    -json filename   Output JSON results
    -nokeepalive     Disable keep-alive
    -nogz            Disable ungzip/Accept-Encoding: gzip headers
    -nocookies       Disable session cookies
    -notlsresume     Disable TLS session resumption
    -noetag          Disable ETag/If-None-Match
    -nolastmodified  Disable Last-Modified/If-Modified-Since
    -ordered         Visit URL arglist in order instead of randomly
    -noui            Do not fire up a user interface
The primary options are of course the same as ApacheBench's, so webslap can indeed be used as a drop-in replacement. Note however that all of the common browser features are enabled by default. This means that simply replacing ab with webslap will not necessarily produce the same results unless additional options are passed to webslap to make it behave the same as ab. Each option is detailed below:
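As an illustration only, the following sketch assembles a webslap command line with every browser-style feature switched off. Which combination of options comes closest to ab is our assumption, not something the tool documents; the helper name ab_like_args and the example URL are hypothetical, while the flags themselves are taken from the usage text.

```python
def ab_like_args(requests, concurrency, url, keepalive=False,
                 binary="./webslap"):
    """Build an argument list that disables gzip, cookies, TLS resumption,
    ETag and Last-Modified handling, and the UI, on a single process.
    Pass keepalive=True to mirror ab's -k option."""
    args = [binary, "-cpu", "1", "-n", str(requests), "-c", str(concurrency),
            "-nogz", "-nocookies", "-notlsresume",
            "-noetag", "-nolastmodified", "-noui"]
    if not keepalive:
        args.append("-nokeepalive")
    args.append(url)
    return args

if __name__ == "__main__":
    print(" ".join(ab_like_args(100, 10, "https://127.0.0.1/test.php")))
```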
Option: -n requests
Simply the total number of requests to perform. Some care must be taken when providing multiple URLs to arrive at an otherwise "correct" number however. In its simplest form whereby only one URL is specified, if the concurrency is set to 1, then a single channel will perform precisely this number of requests. If the concurrency is 2 however, then 2 channels will be open, and each channel will do half this number (obviously). If multiple URLs are specified, each channel iterates through its list (possibly in order), but this does not affect the number of requests to perform. For example, if you wanted 500 channels to each request 3 URLs, you would set the number of requests to 1500, concurrency to 500, and provide three URLs to visit. Default value is 1.
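The arithmetic above can be written out as a short sketch. The function names are ours, and how webslap distributes a remainder when the request count is not a multiple of the concurrency is not covered by this page, so the second helper refuses uneven splits rather than guessing.

```python
def total_requests(channels, requests_per_channel):
    """Value to pass to -n so that each channel performs the given count."""
    return channels * requests_per_channel

def per_channel(total, concurrency):
    """Requests each channel performs when -n divides evenly by -c."""
    if total % concurrency:
        raise ValueError("uneven split: behaviour not documented here")
    return total // concurrency

if __name__ == "__main__":
    # 500 channels each requesting 3 URLs, as in the example above.
    print(total_requests(500, 3))
    print(per_channel(1500, 500))
```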
Option: -c concurrency
The number of simultaneous TCP connections to utilise; this is also used for "single client" simulation. In the real world of course, each real user might open anywhere from 1 to 6 TCP channels per host. To keep benchmarking accurate, despite the HeavyThing's webclient object supporting browser-style interaction, webslap forces each concurrency channel to be precisely 1:1 TCP-wise. Caution: watch out for SYN flooding detection if you go too crazy with this number (or disable SYN flood detection on your target webservers), because all of the initial concurrent SYNs go out in one fell swoop. Default value is 1.
Option: -cpu count
The number of processes to distribute the workload between. Unlike typical multithreaded models, these are truly lock-free child processes. Default value is 2.
Option: -first URL
If this option is specified, each concurrent channel will visit this URL first, before commencing the normal URL list tests. This first URL does not count toward the total number of requests to perform. The intent behind this option is to provide simulated login support for web application testing. An example would be to create a custom script in your webserver environment that checks for the lack of a session cookie, picks a random test user, and sets the session cookie, so that the remainder of the URLs for normal testing are requested under potentially different simulated user logins. Default is no first URL visit.
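A minimal sketch of such a server-side login script, written here as a Python WSGI application purely for illustration. The user list and cookie name are hypothetical, and a real application would issue an opaque session token rather than storing the user name in the cookie.

```python
import random
from http.cookies import SimpleCookie

TEST_USERS = ["alice", "bob", "carol"]  # hypothetical test accounts
COOKIE_NAME = "session"                 # hypothetical cookie name

def app(environ, start_response):
    """Assign a random test user when no session cookie is present."""
    cookies = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    headers = [("Content-Type", "text/plain")]
    if COOKIE_NAME in cookies:
        # Channel already "logged in": keep its existing user.
        user = cookies[COOKIE_NAME].value
    else:
        # No session yet: pick a test user and set the session cookie.
        user = random.choice(TEST_USERS)
        out = SimpleCookie()
        out[COOKIE_NAME] = user
        out[COOKIE_NAME]["path"] = "/"
        headers.append(("Set-Cookie", out[COOKIE_NAME].OutputString()))
    start_response("200 OK", headers)
    return [f"logged in as {user}\n".encode()]
```

Pointing -first at the URL that serves this script gives each concurrency channel its own simulated user before the normal URL list begins, provided cookie handling is left enabled.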
Option: -g filename
Similar to ApacheBench's -g option, this will create a TSV file with one line for each individual request performed. The fields differ from ApacheBench's however, so if you have existing gnuplot scripts, you'll need to modify them accordingly. Each line consists of URL, ctime, response code, ctime, dtime, ttime, wait. Default is no TSV output.
Option: -json filename
This option is very useful for automated benchmarking at regular intervals when combined with JavaScript charting utilities such as amCharts. It writes all of webslap's normal text output values as JSON that JavaScript can parse directly.
Option: -nokeepalive
By default, HTTP/1.1 connections make use of the keep-alive feature. Specifying this option adds the Connection: close header to each request, but does not otherwise alter client behavior. Particularly when testing TLS session caching, this option can be used to benchmark specific webserver cryptographic timings.
Option: -nogz
By default, webslap sends the Accept-Encoding: gzip header along with each request. Specifying this option omits the header, and is useful for comparing baseline timings versus automatic gzip timings in webserver environments.
Option: -nocookies
By default, each individual concurrency channel maintains its own separate cookiejar object and will accept Set-Cookie headers and send Cookie headers in subsequent requests. Specifying this option disables the cookie handling altogether.
Option: -notlsresume
It is surprising [to us] how many relatively high profile web environments are operating without the use of TLS session caching, as its correct usage results in a considerable improvement in secure user experience in addition to reduced webserver loads. By default, webslap supports and actively attempts to reuse TLS sessions. Specifying this option disables TLS session resumption. Especially for larger key sizes and high-load environments, this option can be used to highlight maximum CPU tolerances for webservers.
Option: -noetag
By default, each individual concurrency channel maintains its own separate stringmap of URLs to corresponding ETag headers if the remote webserver(s) sent them, and subsequent requests for the same URL by said channel will include an If-None-Match header. Specify this option to disable ETag awareness per channel.
Option: -nolastmodified
By default, each individual concurrency channel maintains its own separate stringmap of URLs to corresponding Last-Modified headers if the remote webserver(s) sent them, and subsequent requests for the same URL by said channel will include an If-Modified-Since header. Specify this option to disable Last-Modified awareness per channel.
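The per-channel bookkeeping described for ETag and Last-Modified can be sketched as follows. This is an illustration of the HTTP conditional-request mechanism using plain dictionaries, not webslap's actual stringmap implementation; the class and method names are hypothetical.

```python
class ValidatorCache:
    """Per-channel memory of ETag and Last-Modified values, keyed by URL."""

    def __init__(self, use_etag=True, use_last_modified=True):
        # Passing False mimics the effect of -noetag / -nolastmodified.
        self.use_etag = use_etag
        self.use_last_modified = use_last_modified
        self.etags = {}
        self.last_modified = {}

    def remember(self, url, response_headers):
        """Store validators from a response (header names lower-cased)."""
        if self.use_etag and "etag" in response_headers:
            self.etags[url] = response_headers["etag"]
        if self.use_last_modified and "last-modified" in response_headers:
            self.last_modified[url] = response_headers["last-modified"]

    def request_headers(self, url):
        """Conditional headers to send on the next request for `url`."""
        headers = {}
        if url in self.etags:
            headers["If-None-Match"] = self.etags[url]
        if url in self.last_modified:
            headers["If-Modified-Since"] = self.last_modified[url]
        return headers
```

A server that honours these headers can answer 304 Not Modified with no body, which is why enabling or disabling them changes the measured transfer sizes and timings.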
Option: -ordered
By default, each individual concurrency channel separately randomizes the list of URLs specified. Specifying this option causes each concurrency channel to visit the URL list in the order specified on the command line.
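In sketch form, the difference between the two modes looks like this. The function name is hypothetical, and webslap's actual randomisation method is not described on this page, so a plain shuffle stands in for it.

```python
import random

def channel_url_order(urls, ordered=False, rng=None):
    """Return the visiting order for one channel.

    ordered=True keeps the command-line order (like -ordered); otherwise
    each channel gets its own independently shuffled copy."""
    order = list(urls)  # copy, so channels never share or mutate one list
    if not ordered:
        (rng or random).shuffle(order)
    return order

if __name__ == "__main__":
    urls = ["/login", "/cart", "/checkout"]
    print(channel_url_order(urls, ordered=True))
    print(channel_url_order(urls))
```

For request trails where the sequence matters, such as a checkout flow, the ordered mode is the one to use.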
Option: -noui
By default, webslap starts with a very basic TUI that refreshes the otherwise text-only output 10 times per second, which is useful for larger or longer-running tests so you can see results as they happen (and interrupt them, etc). With the -noui option, webslap reverts to output that is nearly identical in form to ApacheBench's.
Tool Comparisons
For the tests that follow, an nginx 1.7.7 webserver was configured with the following setup:
worker_processes 4;
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    gzip on;
    gzip_comp_level 6;
    server {
        listen 443;
        server_name 2ton.com.au;
        ssl on;
        # Our key is 4096 bits
        ssl_certificate 2ton.crt;
        ssl_certificate_key 2ton.key;
        ssl_session_timeout 5m;
        ssl_session_cache shared:SSL:60m;
        # Our DH parameters are also 4096 bits
        ssl_dhparam dhparam.pem;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_ciphers DHE-RSA-AES256-SHA256:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-DSS-AES256-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK;
        ssl_prefer_server_ciphers on;
        add_header Strict-Transport-Security 'max-age=31536000; includeSubDomains';
        ssl_stapling on;
        ssl_stapling_verify off;
        ssl_trusted_certificate gd_bundle-g2-g1.crt;
        resolver 10.0.0.1;
        location / {
            root html;
            index index.html index.htm;
        }
        # pass the PHP scripts to FastCGI server listening on unix:/dev/shm/php.sock:
        #
        location ~ \.php$ {
            root html;
            fastcgi_pass unix:/dev/shm/php.sock;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
    }
}
For our tests, we are connecting to localhost. Our server is an openSUSE machine, which has its ApacheBench binary installed as ab2, and our test.php script is a one-line PHP file that makes a single call to phpinfo(). For brevity, we omitted all but the most important results from the output of each.
Test 1: ApacheBench, 10x10 keepalive
$ time ab2 -n 100 -c 10 -k https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
Server Hostname:        127.0.0.1
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,DHE-RSA-AES256-SHA256,4096,256
Document Path:          /test.php
Document Length:        67880 bytes
Concurrency Level:      10
Time taken for tests:   11.850 seconds
Complete requests:      100
...
Keep-Alive requests:    0
Total transferred:      6809080 bytes
HTML transferred:       6787980 bytes
Requests per second:    8.44 [#/sec] (mean)
Time per request:       1184.995 [ms] (mean)
Time per request:       118.499 [ms] (mean, across all concurrent requests)
Transfer rate:          561.14 [Kbytes/sec] received
...
real 0m11.857s
user 0m10.635s
sys 0m0.015s
The first thing to notice here is that despite our requesting keep-alive, no requests were done that way. Why this is, without diving into the ab code, is unclear. The second thing to take note of is the 10.635s of user CPU time, indicating how much CPU effort was consumed by ab during the test.
Test 2: webslap, 10x10 keepalive
$ time ./webslap -cpu 1 -n 100 -c 10 -noui https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
X-Powered-By:           PHP/5.4.20
Concurrency Level:      10
Time taken for tests:   1.367s
Total requests:         100
...
Keep-alive requests:    90
Non-2xx requests:       0
Total transferred:      1,049,111 bytes
Headers transferred:    26,800 bytes
Body transferred:       6,803,280 bytes
Requests per second:    73.15 [#/sec] (mean)
Time per request:       136.700 [ms] (mean)
Time per request:       13.670 [ms] (mean, across all concurrent requests)
Wire Transfer rate:     749.09 [Kbytes/sec] received
Body Transfer rate:     4,859.55 [Kbytes/sec] received
...
real 0m1.372s
user 0m0.004s
sys 0m0.004s
Note that in fairness, we restricted webslap to only utilise 1 CPU. In contrast to the Test 1 results, webslap successfully used keep-alive for the 90 requests that followed the initial 10 connections. Because fork() is used, user-space CPU time is not reported, but overall time is correctly indicated.
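For readers comparing the summary lines between tools, the rate figures follow the usual ApacheBench-style definitions, sketched below with a hypothetical helper. Fed with the Test 2 inputs (100 requests, concurrency 10, 1.367 seconds), it reproduces the reported 73.15 requests per second, 136.700 ms and 13.670 ms. Results derived from a rounded elapsed time can differ in the last digits from what a tool prints.

```python
def summarize(requests, concurrency, seconds):
    """Derive the mean rate figures from the totals of one run."""
    return {
        # completed requests divided by elapsed wall-clock time
        "requests_per_second": round(requests / seconds, 2),
        # mean time one channel spends on one request
        "ms_per_request": round(concurrency * seconds / requests * 1000, 3),
        # mean time per request across all concurrent channels
        "ms_per_request_across_all": round(seconds / requests * 1000, 3),
    }

if __name__ == "__main__":
    print(summarize(100, 10, 1.367))
```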
Test 3: webslap, 10x10 no keepalive
$ time ./webslap -cpu 1 -n 100 -c 10 -nokeepalive -noui https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
X-Powered-By:           PHP/5.4.20
Concurrency Level:      10
Time taken for tests:   1.692s
Total requests:         100
...
Keep-alive requests:    0
Non-2xx requests:       0
Total transferred:      1,047,998 bytes
Headers transferred:    26,300 bytes
Body transferred:       6,802,260 bytes
Requests per second:    59.10 [#/sec] (mean)
Time per request:       169.200 [ms] (mean)
Time per request:       16.920 [ms] (mean, across all concurrent requests)
Wire Transfer rate:     604.61 [Kbytes/sec] received
Body Transfer rate:     3,925.53 [Kbytes/sec] received
...
real 0m1.697s
user 0m0.007s
sys 0m0.003s
This test highlights the use of TLS session resumption, as its runtime is only slightly longer than Test 2's results, indicating that webslap did in fact make use of the resumption feature.
Test 4: webslap, 10x10 no keepalive, no TLS resume
$ time ./webslap -cpu 1 -n 100 -c 10 -nokeepalive -notlsresume -noui https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
X-Powered-By:           PHP/5.4.20
Concurrency Level:      10
Time taken for tests:   4.297s
Total requests:         100
...
Keep-alive requests:    0
Non-2xx requests:       0
Total transferred:      1,048,054 bytes
Headers transferred:    26,300 bytes
Body transferred:       6,802,282 bytes
Requests per second:    23.27 [#/sec] (mean)
Time per request:       429.700 [ms] (mean)
Time per request:       42.970 [ms] (mean, across all concurrent requests)
Wire Transfer rate:     238.07 [Kbytes/sec] received
Body Transfer rate:     1,545.73 [Kbytes/sec] received
...
real 0m4.302s
user 0m0.009s
sys 0m0.002s
For this test, each and every request involved a new and complete TLS session, and highlights the significant performance difference between OpenSSL and our HeavyThing library.
Test 5: ApacheBench, 10x100 keepalive
$ time ab2 -n 1000 -c 100 -k https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
Server Hostname:        127.0.0.1
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,DHE-RSA-AES256-SHA256,4096,256
Document Path:          /test.php
Document Length:        67880 bytes
Concurrency Level:      100
Time taken for tests:   107.810 seconds
Complete requests:      1000
...
Keep-Alive requests:    0
Total transferred:      68090782 bytes
HTML transferred:       67879782 bytes
Requests per second:    9.28 [#/sec] (mean)
Time per request:       10781.002 [ms] (mean)
Time per request:       107.810 [ms] (mean, across all concurrent requests)
Transfer rate:          616.78 [Kbytes/sec] received
Connection Times (ms)
              min   mean[+/-sd]  median    max
Connect:      485   7483 2620.7    6967  11849
Processing:    34   3238 2999.2    3086  10280
Waiting:        2   2039 1739.2    1891   6707
Total:       8280  10721 2770.8    9735  19873
...
real 1m47.818s
user 1m45.922s
sys 0m0.120s
For the duration of the test, ab was using 100% of its single CPU core. As a result, these tests are mostly useless, as we are now only really looking at ab's (as well as OpenSSL's) speed, and NOT our webserver's. Note that this was also the case with the initial test, as we will see in the following test.
Test 6: webslap, 10x100 no keepalive, no TLS resume
$ time ./webslap -cpu 1 -n 1000 -c 100 -nokeepalive -notlsresume -noui https://127.0.0.1/test.php
Server Software:        nginx/1.7.7
X-Powered-By:           PHP/5.4.20
code   count     min     avg     max  kbhdrs  kbtotal  kbbody
200    1,000     468   3,778  11,989     256   10,234  66,428
Concurrency Level:      100
Time taken for tests:   39.285s
Total requests:         1,000
...
Total transferred:      10,480,514 bytes
Headers transferred:    263,000 bytes
Body transferred:       68,022,784 bytes
Requests per second:    25.46 [#/sec] (mean)
Time per request:       3,928.400 [ms] (mean)
Time per request:       39.284 [ms] (mean, across all concurrent requests)
Wire Transfer rate:     260.51 [Kbytes/sec] received
Body Transfer rate:     1,690.97 [Kbytes/sec] received
                    min     avg     max
Connect Time:        72   2,593   6,631
Processing Time:      0       1      12
Waiting Time:       457   3,777  11,989
Total Time:         468   3,778  11,989
real 0m39.290s
user 0m0.016s
sys 0m0.049s
With this test, things start to get a little more interesting. Watching top during the course of this test, we begin to notice nginx oddities for the initial batch of connections, whereby only a single nginx process (despite us having configured 4 worker processes) goes to 100% CPU utilisation. Then, once the next set of connections starts, all four worker processes go to 100%. With the previous ab test, the early nginx CPU utilisation was the same, but beyond that nginx did not end up running flat out as it does in our webslap test here. Also of note, webslap's single CPU utilisation hovered at 38% throughout the course of the test. As you can clearly see, ab is no longer a useful tool for dealing with TLS.
Test 7: Siege 3.0.7, 10x100
$ time siege -c100 -b -r10 https://127.0.0.1/test.php
Transactions:             1000 hits
Availability:             100.00 %
Elapsed time:             45.72 secs
Data transferred:         9.75 MB
Response time:            4.44 secs
Transaction rate:         21.87 trans/sec
Throughput:               0.21 MB/sec
Concurrency:              97.02
Successful transactions:  1000
Failed transactions:      0
Longest transaction:      12.03
Shortest transaction:     0.87
real 0m45.733s
user 2m19.623s
sys 0m0.230s
Siege was able to keep nginx similarly at 100% CPU, but required nearly as much CPU for itself to do so. Similar to ab, there are no options to deal with TLS session caches, and unlike ab, there is no option to enable keep-alive behaviour either. We are unable to get real-world TLS timings and analysis done with Siege, and in order to test raw, uncached, no keep-alive TLS installations, you'd need roughly 1:1 CPU resources for your test machine as you have for your webserver, and the data you produce won't be relevant to real-world server scenarios.
Test Conclusions
Due to the dissimilar nature of the three tools' options, we were forced to keep these comparative tests very simple. Despite them being so simplistic, they greatly highlight the gap that webslap was built to fill. With minimal webserver setup, webslap can be used for ongoing TLS installation quality assurance and monitoring, and we believe this is an essential thing to have as more and more things on the net are secured.