A benchmarking tool based on the Flexible I/O Tester (FIO) for measuring the performance of I/O operations in containerized environments.
Functional requirements:
- The tool should gather a set of performance metrics.
- The tool should allow users to compare the performance of different storage solutions.
- The tool should ship with pre-defined job configs for common tests.
- The tool should include configs for different workloads (git, web server, video streaming).
- The tool should include configs for testing the performance of different block sizes.
- Users must be able to provide a custom FIO config to run the benchmark.
- The coordinator node must distribute benchmark tasks to worker nodes.
- The results of the tasks must be aggregated so that they can be processed later.
Usability:
- The tool should provide a command line interface for configuring and starting benchmarks.
- The tool should provide simplified output for single-run benchmarking results.
- Users should be able to start the benchmarking tool with a single command.
Reliability:
- In case of a worker node failure, the benchmark coordination must continue for the other nodes.
Performance:
- The tool's own overhead should not interfere with the FIO benchmark.
Security:
- The communication between the worker nodes and the coordinator must be encrypted.
Constraints:
- The benchmarking tool must be fully containerized using Docker.
- Users should be able to pull the Docker image from a public Docker repository.
- The tool must include a Docker Compose file.
We provide the tool as a Docker image since we primarily intend to benchmark performance in containerized environments. For guides on how to perform the benchmarks on bare metal, check out the Installation section.
To run the tool in the container:
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking
usage: main.py [-h] {run,worker,coordinator} ...
Benchmarking Cluster
positional arguments:
{run,worker,coordinator}
Role of the execution
run Single run options
worker Worker node options
coordinator Coordinator node options
options:
-h, --help show this help message and exit
Developed by Colin Wilk as part of his Bachelor Thesis. Licensed as MIT, see LICENSE file for details
You can perform a single benchmark using the run command:
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking run -d /tmp
Job Duration in Seconds
----------------- ---------------------
random-reads 10s
random-writes 10s
sequential-reads 10s
sequential-writes 10s
web-server-assets 25s
media-streaming 20s
TOTAL 85s
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Metric | random-reads | random-writes | sequential-reads | sequential-writes | web-server-assets | media-streaming |
+==================================+==================+=================+====================+=====================+=====================+===================+
| Total Read IO | 1.1 GiB | 0 Bytes | 5.0 GiB | 0 Bytes | 21.8 GiB | 39.2 GiB |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Total Write IO | 0 Bytes | 9.7 GiB | 0 Bytes | 8.8 GiB | 0 Bytes | 0 Bytes |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Bandwidth | 235.2 MiB/s | 0 Bytes/s | 1022.1 MiB/s | 0 Bytes/s | 1.5 GiB/s | 3.9 GiB/s |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Bandwidth | 0 Bytes/s | 1.9 GiB/s | 0 Bytes/s | 1.8 GiB/s | 0 Bytes/s | 0 Bytes/s |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read IOPS | 60_205.56 IOPS | 0.00 IOPS | 261_660.87 IOPS | 0.00 IOPS | 381_679.89 IOPS | 32_120.39 IOPS |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write IOPS | 0.00 IOPS | 509_639.87 IOPS | 0.00 IOPS | 460_674.47 IOPS | 0.00 IOPS | 0.00 IOPS |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Submission Latency | 0 microseconds | 0 microseconds | 0 microseconds | 0 microseconds | 12 microseconds | 22 microseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Completion Latency | 64 microseconds | 0 microseconds | 7 microseconds | 0 microseconds | 657 microseconds | 1 millisecond |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Total Latency | 64 microseconds | 0 microseconds | 7 microseconds | 0 microseconds | 669 microseconds | 1 millisecond |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Submission Latency | 0 microseconds | 0 microseconds | 0 microseconds | 0 microseconds | 0 microseconds | 0 microseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Completion Latency | 0 microseconds | 2 microseconds | 0 microseconds | 2 microseconds | 0 microseconds | 0 microseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Total Latency | 0 microseconds | 2 microseconds | 0 microseconds | 2 microseconds | 0 microseconds | 0 microseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Job Runtime | 20 seconds | 10 seconds | 10 seconds | 5 seconds | 2 minutes | 40 seconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| User CPU | 3.19% | 26.19% | 10.22% | 24.60% | 12.67% | 3.00% |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| System CPU | 14.48% | 72.41% | 36.39% | 74.22% | 37.45% | 16.50% |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Context Switches | 303,094 | 868 | 28,193 | 601 | 155,691 | 107,834 |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Latency (99.0 Percentiles) | 94 microseconds | --- | 206 microseconds | --- | 4 milliseconds | 6 milliseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Read Latency (99.9 Percentiles) | 123 microseconds | --- | 465 microseconds | --- | 8 milliseconds | 9 milliseconds |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Latency (99.0 Percentiles) | --- | 5 microseconds | --- | 3 microseconds | --- | --- |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
| Write Latency (99.9 Percentiles) | --- | 24 microseconds | --- | 15 microseconds | --- | --- |
+----------------------------------+------------------+-----------------+--------------------+---------------------+---------------------+-------------------+
You can run any of the shipped FIO job files, such as the block size tests:
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking run -d /tmp -c /app/job_files/blocks.ini
Naturally you can mount your own custom ini file into the container and run that:
docker run --rm -it -v $(pwd)/my-conf.ini:/my-conf.ini ghcr.io/ls1intum/storage-benchmarking run -d /tmp -c /my-conf.ini
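An FIO job file is a plain INI file. The snippet below is a minimal illustrative sketch of such a custom config, not one of the shipped job files; the job names, sizes, and queue depths are arbitrary examples, and the target directory is supplied separately via -d as in the commands above.
[global]
; asynchronous I/O on Linux, bypassing the page cache
ioengine=libaio
direct=1
; run each job for 30 seconds regardless of file size
time_based
runtime=30
[random-4k-reads]
; random reads with a 4 KiB block size
rw=randread
bs=4k
size=1g
iodepth=32
[sequential-1m-writes]
; wait for the previous job to finish, then run sequential 1 MiB writes
stonewall
rw=write
bs=1m
size=1g
iodepth=8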
For automatic distributed benchmarking over time, the tool offers a worker-coordinator cluster setup. In this setup, a coordinator node distributes benchmark tasks through a Redis broker to a set of worker nodes.
A sketch of the worker-coordinator deployment is shown below.
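The following Docker Compose snippet is purely illustrative: the service names and commands are assumptions and the broker connection settings are omitted. The actual Compose file ships with the repository.
services:
  redis:
    image: redis:7                                # broker that queues benchmark tasks
  coordinator:
    image: ghcr.io/ls1intum/storage-benchmarking
    command: coordinator --groups group1 group2   # scheduling flags are described below
    depends_on:
      - redis
  worker:
    image: ghcr.io/ls1intum/storage-benchmarking
    command: worker                               # typically one worker per host or storage setup
    depends_on:
      - redis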
Every worker boots with a hostname (or the default hostname), which must be unique, and a group, which determines how the coordinator schedules it.
When a worker boots, it registers itself with its group at the Redis instance and opens a queue to wait for jobs. It processes the jobs it receives sequentially and de-registers itself before shutting down.
The coordinator makes sure that only one group is actively running a benchmark at a time. This is important if you measure different levels of abstraction, for example raw disk performance, ZFS performance, and zvol performance inside a virtual machine, and want to make sure that your benchmarks do not influence one another.
The coordinator supports a few different scheduling techniques. First, you define groups using --groups group1 group2 ..., which will be benchmarked in that order. By default, every node in a group starts a benchmark, but if you only want a single random node to be picked in every iteration, you can use the --random flag. You can also trigger a single benchmark directly by using the --trigger flag. If you do not want to schedule by the default interval (every 2 hours), you can use --quick, which starts the next benchmarking round directly after the last group has finished running its benchmarks; you can optionally limit the maximum number of quick runs using --limit <int>, after which the coordinator will exit.
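As a concrete illustration (the group names are placeholders and the Redis connection settings are omitted), scheduling could look like this:
# Benchmark the groups in order on the default schedule (every 2 hours)
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking coordinator --groups raw zfs zvol
# Pick a single random node per group in every iteration
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking coordinator --groups raw zfs zvol --random
# Run rounds back to back and stop after 5 rounds
docker run --rm -it ghcr.io/ls1intum/storage-benchmarking coordinator --groups raw zfs zvol --quick --limit 5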
To run the project locally, clone it first:
git clone https://github.com/ls1intum/storage-benchmarking
cd storage-benchmarking
Then install the dependencies using Poetry (you can install Poetry with pip: pip install poetry):
poetry install --no-dev
Make sure you have fio installed and in your PATH:
$ fio -v
fio-3.37
Then you can run the project as described in the Usage section with:
poetry run python3 src/benchmarking_tool/main.py
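For example, the single-run benchmark from the Usage section then becomes:
poetry run python3 src/benchmarking_tool/main.py run -d /tmp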
The project is licensed under the MIT License; see the LICENSE file for more information.
We would like to express our gratitude to the FIO Project and its contributors. The tools and resources provided by the FIO Project have been indispensable to the development of this tool.