What Is a Container? Open Container Initiative Explained

Illustration of a person with a scarf and tousled hair pointing at an open container door in a desolate, monochrome environment. Visible are an orange shipping container and blurry foreground elements resembling pipes or bones. The mood is mysterious and industrial.

What Is a Container

Let's start by trying to understand what a container is.

Imagine you have a relatively big Linux server. It has, let's say, 48 GB of RAM and an 8-core CPU.

This server has a standard filesystem layout, with the familiar /var, /tmp, /etc and other directories.

Your company doesn't really have an application that needs all of this memory and CPU at once.

Instead, there are multiple applications that you need to run on this server.

There is a Java web application, and there are some Python scripts to be executed periodically.

You don't want to run these applications under the root user, because that would mean that each application can do anything it wants on this server - including accessing the files and directories of the other applications.

So, to isolate them from each other, you craft a beautiful directory layout, and then run each application under a different Linux user. To actually run the applications, you create a new systemd service for each one, with cgroups making sure that system resources are managed properly.
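To check which cgroup a service actually landed in, you can read /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. Below is a minimal sketch in Python, assuming a Linux host. The helper names (parse_cgroup_line, read_cgroups) and the service name in the usage note are made up for illustration.

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: list[str]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse one 'hierarchy-ID:controller-list:cgroup-path' line."""
    # Split at most twice: the cgroup path itself may contain colons.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return CgroupEntry(
        hierarchy_id=int(hierarchy_id),
        # On cgroup v2 the controller list is empty ("0::/some/path").
        controllers=[c for c in controllers.split(",") if c],
        path=path,
    )


def read_cgroups(pid: str = "self", proc_root: str = "/proc") -> list[CgroupEntry]:
    """Return the cgroup membership of a process (Linux only)."""
    with open(f"{proc_root}/{pid}/cgroup", encoding="utf-8") as f:
        return [parse_cgroup_line(line) for line in f if line.strip()]
```

For a process started by a hypothetical java-web.service unit on a cgroup v2 host, read_cgroups(pid) would typically return a single entry whose path ends in that unit name.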

It works pretty well for some time. Thanks to the proper combination of Linux users, file permissions, SELinux labels and systemd unit definitions you have a secure multi-tenant server.

The first problems appear during the next round of patching. One of the Python applications relies on a now outdated system package. You can't update this package, because the application will break. And you can't leave this package as it is, because it puts the whole server, with all of the applications running there, at risk.

You’ve tried to isolate each application as much as possible with the help of SELinux, cgroups and multi-user setup, but the final frontier - the filesystem - remains shared between all applications.

If you start looking closer, you will notice a couple of other things that remained shared. For example, each application shares the same process table - your Python application is well aware of the existence of the Java application running on the same server.
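The shared process table is easy to observe: on Linux, every process visible to you shows up as a numeric directory under /proc. The following is a small sketch, assuming a Linux host with /proc mounted; the function names are illustrative, not part of any library.

```python
import os


def visible_pids(proc_root: str = "/proc") -> list[int]:
    """List the PIDs this process can see: the numeric directories in /proc."""
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdigit())


def command_name(pid: int, proc_root: str = "/proc") -> str:
    """Return the short command name the kernel exposes in /proc/<pid>/comm."""
    with open(os.path.join(proc_root, str(pid), "comm"), encoding="utf-8") as f:
        return f.read().strip()
```

Run from the Python application's user account, visible_pids() would include the PID of the Java application as well, even though the two run under different Linux users.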

Your quest to properly isolate applications from each other becomes harder as you dive deeper into this topic. Wouldn't it be great if there were something to do this isolation for you? This is where containers come in.

On the technical level, each container is just a Linux process that is isolated from the rest of the system with the help of the already mentioned and some extra tools.

cgroups, SELinux or AppArmor, standard Unix permissions, Linux namespaces and Linux capabilities all work together to isolate this process in such a way that, from inside the process, your application is not aware that it lives in a container.
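Of these tools, namespaces are the easiest to inspect. On Linux, /proc/<pid>/ns contains one symbolic link per namespace type, and two processes are in the same namespace exactly when the corresponding links have the same target. Here is a minimal sketch under that assumption; namespace_ids and shared_namespaces are hypothetical helper names.

```python
import os


def namespace_ids(pid: str = "self", proc_root: str = "/proc") -> dict[str, str]:
    """Map each namespace type (pid, net, mnt, ...) to its identifier.

    Each entry in /proc/<pid>/ns is a symlink whose target looks like
    'pid:[4026531836]'; the target is used here as the identifier.
    """
    ns_dir = os.path.join(proc_root, pid, "ns")
    return {
        name: os.readlink(os.path.join(ns_dir, name))
        for name in sorted(os.listdir(ns_dir))
    }


def shared_namespaces(a: dict[str, str], b: dict[str, str]) -> list[str]:
    """Return the namespace types in which two processes are NOT isolated."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])
```

On the multi-user server described earlier, comparing the Java and Python processes this way would report every namespace as shared. Comparing a containerised process with a host process would usually report far fewer, although reading another user's /proc/<pid>/ns may require elevated privileges.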

A container, then, is nothing but a useful abstraction to describe a process that is so isolated from every other process on the same server that it actually believes that the isolated box it runs in is the actual server.

There is an old movie, The Truman Show - you might have seen it. The hero of this movie is oblivious to the fact that he has been in a reality TV show since birth. He lives under a huge dome, his friends and relatives are nothing but actors, and every place he knows is just a set for the show.

If we translate this movie to the Linux world, then the container is the fake world for the process that lives in this world. The showrunners make sure that our process, the hero of our Linux Truman Show, never realises that it has a fake filesystem, fake process table, fake networking and everything else. We can only hope that, unlike Truman, our process will never escape this isolated little world and will not wreak havoc on the real world, the actual server we have.

You may ask, how is it different from virtualisation?

On a technical level, the big difference is that containers simply use the existing Linux toolkit to isolate a process that still runs on the host's Linux kernel, while virtual machines can do more complex things: they can run not only different kernel versions, but even completely different operating systems on a single host.

If a container is a Truman Show-like set of a little town that is still located on the planet, then a virtual machine is a space station, located very far away from the Earth, communicating with it only via specialised channels - and even those channels are not visible to anyone on the station.

The inhabitants of this space station are not aware of the existence of Earth - their whole world is represented by this artificial environment.

So, the approach to isolation is different between containers and virtual machines - and this leads to a slight conceptual difference between them.

The space station is a complete, special-purpose environment that does many different things in full isolation from the planet. Likewise, a virtual machine is a complete system that can run many processes and perform many different tasks, just like a real server, only virtualised.

In the Truman Show there is only one misled person, and in a container there is typically only one main process isolated from the real server - containers are, by nature, specialised to do just one particular task.

And who is the showrunner of this container world? It's something called a container runtime.

You probably don't want to set up Linux namespaces, cgroups and everything else from scratch for every new container you want to create. The tool that does it for you is called the "container runtime" - the lowest-level utility of every container environment.

But what's the name of this runtime? Well, it doesn’t actually matter - and we will learn why it doesn't matter in the next chapter.

Open Container Initiative

We’ve discussed that there are many different bits and pieces that make up a container: cgroups, user namespaces, process namespaces, various security mechanisms like SELinux and Linux Capabilities and so on. To avoid people manually binding all of these things together, some smart folks came up with the concept of a container runtime.

A container runtime is basically a tool that starts and runs your containers. You tell the container runtime to run a new container, and it prepares everything for you - it creates the namespaces, cgroups and other isolation mechanisms, and then starts the process with all of the isolation layers around it.
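To get a feeling for the namespace part of that job, it can be roughly approximated with the unshare tool from util-linux. The sketch below only builds the command line and does not run it. It is not how any real runtime is implemented, it ignores cgroups, capabilities and the root filesystem, and build_unshare_command is a hypothetical helper.

```python
# Maps a namespace name to the matching util-linux `unshare` flag.
NAMESPACE_FLAGS = {
    "mount": "--mount",
    "uts": "--uts",
    "ipc": "--ipc",
    "net": "--net",
    "pid": "--pid",
}


def build_unshare_command(argv, namespaces=("mount", "uts", "ipc", "net", "pid")):
    """Build an `unshare` command that starts argv in fresh namespaces."""
    cmd = ["unshare"]
    for ns in namespaces:
        cmd.append(NAMESPACE_FLAGS[ns])
    if "pid" in namespaces:
        # A new PID namespace only applies to child processes, so unshare
        # must fork; remounting /proc makes tools like ps show the new,
        # nearly empty process table instead of the host's.
        cmd += ["--fork", "--mount-proc"]
    # "--" separates unshare's own options from the program to run.
    return cmd + ["--"] + list(argv)
```

Running the resulting command usually requires root privileges. Inside the started shell, ps would then list only the processes of the new PID namespace.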

Because the container process is isolated from the host's filesystem, it needs a complete filesystem of its own, with all the binaries, libraries, config files and whatnot, to be able to run successfully. Providing this filesystem is not exactly the job of the container runtime - you need a different set of tools for that.

As you can imagine, there could be many different ways to implement the container runtime, and even more ways to prepare a filesystem for the container. One company could build one tool to solve this problem, some other company could come up with another tool, while the open source community could implement a third one from scratch.

The outcome of this could be that multiple conflicting implementations of how to work with containers would co-exist, each of them incompatible with one another.

This would be similar to the situation we have with virtualisation. If you look around, you won't find many widely adopted open standards for virtualisation - every virtualisation technology is different and there is no open standard that would be identical for each of them.

On the other hand, look at the modern web technologies. Regardless of which browser and operating system you are using, each of them speaks the language of HTML, CSS and JavaScript, each of them works with exactly the same, standard HTTP requests, websockets and many many other components of the modern and open web.

To avoid repeating the virtualisation situation with containers, the Open Container Initiative (OCI) was created in 2015 by Docker, CoreOS and other leaders in the container industry. The purpose of OCI is to create and maintain a set of open standards around container formats and runtimes.

There are 3 main specifications that OCI provides:

  • runtime-spec, which specifies how to create a container out of a container bundle
  • image-spec, which defines everything related to container images
  • distribution-spec, which defines an API protocol for the distribution of content - this covers, first and foremost, container registry and operations to pull and push images
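To make the runtime-spec a bit more concrete: a bundle is a directory that holds the container's root filesystem together with a config.json file describing the process to run. The sketch below builds a heavily trimmed config.json as a Python dictionary. The field names come from the runtime-spec, but the chosen values (version string, root user, read-only root) are assumptions for illustration, and a real runtime will usually expect more settings, such as mounts.

```python
import json


def minimal_runtime_config(args, rootfs="rootfs"):
    """Return a trimmed-down runtime-spec config as a dictionary."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # the command the container runs
            "cwd": "/",          # working directory inside the container
        },
        # Path to the root filesystem, relative to the bundle directory.
        "root": {"path": rootfs, "readonly": True},
        "linux": {
            # Each entry asks the runtime for a new namespace of that type.
            "namespaces": [
                {"type": "pid"},
                {"type": "network"},
                {"type": "ipc"},
                {"type": "uts"},
                {"type": "mount"},
            ]
        },
    }


def write_bundle_config(path, args):
    """Write the config as JSON, e.g. to <bundle>/config.json."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(minimal_runtime_config(args), f, indent=2)
```

Note how the namespaces list mirrors the isolation mechanisms discussed earlier: the spec is essentially a declarative description of the box the process should live in.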

These 3 specifications together ensure that, regardless of which container tools you use, they will work nicely together as long as they comply with the standards. For example, you can use one tool to build images and a completely different tool to run containers from these images.
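This interoperability is possible because the image-spec pins down the exact data structures. For instance, every piece of image content is referenced by a descriptor (media type, digest, size), and an image manifest is a small JSON document listing the descriptors of the image config and layers. Here is a minimal sketch of both; the function names are hypothetical and the blobs are placeholders rather than real layers.

```python
import hashlib

MANIFEST_TYPE = "application/vnd.oci.image.manifest.v1+json"
CONFIG_TYPE = "application/vnd.oci.image.config.v1+json"
LAYER_TYPE = "application/vnd.oci.image.layer.v1.tar+gzip"


def descriptor(media_type: str, blob: bytes) -> dict:
    """Describe a blob by its media type, content digest and size in bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def manifest(config_blob: bytes, layer_blobs: list[bytes]) -> dict:
    """Build an image manifest that references a config and ordered layers."""
    return {
        "schemaVersion": 2,
        "mediaType": MANIFEST_TYPE,
        "config": descriptor(CONFIG_TYPE, config_blob),
        "layers": [descriptor(LAYER_TYPE, blob) for blob in layer_blobs],
    }
```

Because content is addressed by its digest, any compliant tool can verify that the blob it received is the one the manifest refers to, no matter which tool produced the image.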

You can also use one container runtime in production, as part of your Kubernetes cluster and another one on your laptop - again, it doesn't matter which one you use, as long as both of them follow the OCI standards.

The OCI standards are what allow us to embrace the Dockerless world. It would be very hard to try any other container tools if every image and every system were Docker-specific, but luckily they are not - thanks to the standards, we can move between Docker and other tools without sacrificing anything.

In the next lessons, we will examine the image and runtime specs. The relationship between them is somewhat curious. We will start with the image spec and move down to the runtime spec, even though, as you will learn, you don't even need a container image to run a container.

Series "The Dockerless Course"
  1. What’s Wrong With Docker? Introduction to the Dockerless Course
  2. What Is a Container? Open Container Initiative Explained
  3. Where Container Images Are Stored: Introduction to Skopeo