Introduction
During lecture #6, we discussed containers and explained how they provide a lightweight environment isolated from the host. Today, we’ll get some hands-on experience with containers.
The following sections should help to refresh some of the concepts discussed during the lecture.
Images
An image is a template for a container: it’s a file system, plus some metadata.
A Containerfile (a vendor-neutral name for a Dockerfile) serves as a recipe to build the image. Consider our simple hello.sh script again:
#!/bin/sh
while :; do
printf "Hello World\n"
sleep 1
done | cat -n
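To see what the cat -n stage does without running forever, here is a bounded variant of the same pipeline (the sleep is dropped so it finishes immediately):

```shell
# Same structure as hello.sh, but bounded to three iterations;
# cat -n numbers each line arriving on its stdin
for i in 1 2 3; do
    printf "Hello World\n"
done | cat -n
# prints three numbered "Hello World" lines
```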
And the following Containerfile which packages it as an application running on Alpine Linux:
FROM docker.io/alpine:latest
COPY hello.sh /entrypoint
RUN chown nobody:nobody /entrypoint
USER nobody
ENTRYPOINT /entrypoint
An image can be built from this Containerfile using Buildah:
% buildah bud -f Containerfile -t hello-world .
This image will now be available as hello-world locally.
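You can verify that the image exists with podman images; locally built images are tagged under the localhost/ prefix (IDs, dates, and sizes elided here, as yours will differ):

```
% podman images
REPOSITORY                TAG     IMAGE ID  CREATED  SIZE
localhost/hello-world     latest  ...       ...      ...
docker.io/library/alpine  latest  ...       ...      ...
```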
Containers
This image can now be used to create and run a new container using Podman:
% podman run -it --rm hello-world
1 Hello World
2 Hello World
3 Hello World
[...]
Note that the container is running Alpine Linux—a distinct distro from Arch Linux, which your VM is running—but shares the same kernel with the host. Recall that this is called OS-level virtualization.
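You can observe the shared kernel directly: uname -r reports the same kernel release inside the container and on the host. A sketch (the version string shown is just an example; yours will differ):

```
% uname -r
6.0.8-arch1-1
% podman run --rm docker.io/alpine:latest uname -r
6.0.8-arch1-1
```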
3rd-party images
In our Containerfile, in the FROM directive, we referenced an upstream image, docker.io/alpine, and we used the latest tag to specify that we want the latest version of that image. This image was created by a 3rd party, not by us.
There are many 3rd-party images available on Docker Hub (the docker.io registry), and there are many registries besides it: some public, some private.
For example, if you wanted to test something on Fedora 38, you could just:
% podman run -it --rm docker.io/fedora:38
That gives you a shell running on a Fedora system.
The FROM directive specifies a so-called base image, the image which serves as a foundation for the image you’re building. Changes you make in your Containerfile are layered upon that base. In our hello-world image example, we took a working Alpine image, copied a script (hello.sh) into it, and made it the default program to run when a container is instantiated from that image. No magic.
(The configuration of the entrypoint and the USER directive would be examples of some of the metadata stored along with the file system within the image. They form instructions for the container runtime (Podman) to spawn a default process, and to switch to a particular user, respectively. Switching to the least-privileged user in the container is a good security practice.)
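You can view this metadata yourself with podman inspect and its Go-template --format option. A sketch (assuming the hello-world image built above):

```
% podman inspect --format '{{.Config.User}}' hello-world
nobody
```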
Restricting the isolation
Containers are meant to be isolated from the host as much as possible, to keep the processes within the container in check. However, absolute isolation isn’t practical. Most of the time, the container needs to access the network, or data need to be shared between the host and the container.
Port publishing
For example, suppose that you have an HTTP server running in an Ubuntu container. One can easily simulate that with just netcat:
% podman run -it --rm -p 8000:8000 docker.io/ubuntu:22.04
root@d3eff3dc3dc7:/# apt update
[...]
root@d3eff3dc3dc7:/# apt install netcat
root@d3eff3dc3dc7:/# while :; do printf "HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n" | nc -l 0.0.0.0 8000; done
Let’s break this down:
- First, we run a new Ubuntu 22.04 container
- Then, we update the package list and install netcat (apt is the Ubuntu package manager)
- Then, we run a loop in the interactive shell, which pipes a static HTTP/1.0 “all good” response to netcat, which listens on all interfaces (0.0.0.0) on port 8000. Netcat will forward that HTTP message to any client which connects to it.
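You can inspect the exact bytes of that canned response in any shell, no container needed; note the \r \n (CRLF) pairs that terminate each header line, as HTTP requires:

```shell
# Dump the canned HTTP response byte by byte;
# od -c shows the CRLF line endings explicitly
printf "HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n" | od -c
```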
If you now run ss -tlnp outside of the container, you’ll notice there’s a listener for TCP port 8000:
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
[...]
LISTEN 0 4096 0.0.0.0:8000 0.0.0.0:* users:(("rootlessport",pid=5051,fd=11))
[...]
And when you try to curl that port, you talk to the netcat running in the container:
% curl -i localhost:8000
Note the -p option in podman run. What does it do?
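As a hint, -p accepts a host-port:container-port pair, and the two need not be equal. A sketch (the ports here are chosen arbitrarily):

```
% podman run -it --rm -p 8080:8000 docker.io/ubuntu:22.04
```

With this mapping, connecting to port 8080 on the host reaches whatever listens on port 8000 inside the container.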
Volumes
Suppose that your container needs to access some part of the host’s file system. For example, the container could provide a database server, and the server would need to store data persistently. (Since containers get created from scratch each time, the database server would otherwise lose all data when the container is stopped, which isn’t normally desirable.) One can get around that issue with volumes:
% podman run -it -v $(pwd):/host docker.io/voidlinux/voidlinux
# cat /etc/os-release > /host/foo
Here, we’re running a Void Linux container, and we cat the contents of /etc/os-release into /host/foo. If you check your current working directory on the host (outside of the container), you should now have a foo file:
% cat foo
NAME="void"
ID="void"
DISTRIB_ID="void"
PRETTY_NAME="void"
Note the -v option in podman run. What does it do, exactly?
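As a hint, -v accepts a host-path:container-path pair, optionally followed by mount options such as :ro for read-only. A sketch (same Void Linux image as above; the exact error wording may differ):

```
% podman run -it --rm -v $(pwd):/host:ro docker.io/voidlinux/voidlinux
# touch /host/foo
touch: cannot touch '/host/foo': Read-only file system
```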
Podman Pods
Our hello-world container is trivial: it does nothing useful. Most practical applications need other services (say, a database server) to store and retrieve data, and various other so-called sidecars to do their job.
Over time, it became apparent that it’s useful when all the related containers share the same network namespace, so that the application can communicate with the database server and the other containers freely, while still being isolated from the host. This gives rise to the idea of a pod: a collection of related containers serving a common purpose, which, by default, share the network namespace.
Podman allows us to create pods:
% podman pod create foobar
ffd8e4b91ab4ca0dce8beac312be80af2a3190be74b39d0dce7d785b2323a9db
% podman pod list
POD ID NAME STATUS CREATED INFRA ID # OF CONTAINERS
ffd8e4b91ab4 foobar Created 11 seconds ago 032822999ced 1
We can then run containers in the pod. Let’s run our prior Ubuntu example with the phony HTTP server:
% podman run -it --pod=foobar docker.io/ubuntu:22.04
root@foobar:/# apt update
[...]
root@foobar:/# apt install netcat
root@foobar:/# while :; do printf "HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n" | nc -l 0.0.0.0 8000; done
(mind the --pod), and an Arch Linux container in that same pod:
% podman run -it --pod=foobar docker.io/archlinux
[root@foobar /]# pacman -Syu curl
:: Synchronizing package databases...
core downloading...
extra downloading...
community downloading...
:: Starting full system upgrade...
resolving dependencies...
[...]
[root@foobar /]# curl -i localhost:8000
HTTP/1.0 200 OK
Content-Length: 0
The Arch Linux container can curl the server running in the Ubuntu container, since they share the same network namespace.
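One more thing worth knowing: with pods, published ports belong to the pod (specifically, to its infra container), so -p must be given when the pod is created, not when individual containers join it. A sketch (the pod name is just an example):

```
% podman pod create -p 8000:8000 webpod
% podman run -it --rm --pod=webpod docker.io/ubuntu:22.04
```

Any container joining webpod can then listen on port 8000 and be reachable from the host.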
Homework
This homework has a two-week deadline (strict):
- Monday 2022-12-05 9:00 Prague time for both groups, as the Thursday class was cancelled due to a national holiday
Please try to get it done during the first week. As usual, if anything is unclear, don’t hesitate to ask.
Important notes
- Please finish the systemd task promptly. To grade homeworks #3 and #4, I need your machines to be running. Many people have taken their VMs down completely. From now on, please keep your VMs running.
- Soon, the hypervisors will go through a planned power-cycle. This will test that you have set up everything correctly, and that systemd starts your entire infrastructure as expected.
- Please stick to the exact file names where given. I was trying to be lenient for a very long time, but subtle variations in file names due to typos and undesirable creativity get in the way of automation. Thank you.
Hosting a web application
- Here, you will find a trivial Go application.
- Your task is to get this app running as a collection of three containers
running in one pod:
- A container running the application itself,
- A database container running PostgreSQL,
- An NGINX container acting as a reverse proxy for the application. Please note that in this set-up, the reverse proxy is superfluous, but is a typical component of real-world set-ups, so it makes sense to have it there.
- Run the containers as systemd units on your gw machine. I expect the app to be listening on port 8000.
- Hints:
- You’ll have to build an image for the Go application. You can choose whatever distro you like as your base image.
- 3rd-party images for NGINX and PostgreSQL already exist, use them! You’ll have to configure both.
- I would start by running the PostgreSQL container first, and publishing the database port. I would then build the app locally and get it to work with the container-hosted database. Then, I would containerize the app and put the DB server and the app into the same pod. I would do the NGINX reverse proxy last. Finally, I would write the systemd units and test that everything works.
- Please submit all artifacts (Containerfiles you used to build the image(s), systemd unit files, etc.) as hw/07/01-twytter/*. This time, exact file names cannot be prescribed, so please use any reasonable file names within that directory.
- 15 bonus points for using a multi-stage image for the Go application.
- 30 bonus points for extending the application (see README.md).
- Please note that this assignment can be quite difficult, depending on your experience. So please don’t hesitate to reach out. As long as you have concrete questions, I’ll be very happy to point you in the right direction :-). It may also be a good idea to start early.
- (100+45 points)
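For the multi-stage bonus, the general shape of such a Containerfile is sketched below. This is only a sketch: the Go version, paths, and build flags are assumptions you will need to adapt to the actual app.

```
# Stage 1: build the binary using a full Go toolchain image
FROM docker.io/golang:1.19 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app

# Stage 2: copy only the compiled binary onto a minimal base
FROM docker.io/alpine:latest
COPY --from=build /app /app
USER nobody
ENTRYPOINT ["/app"]
```

The point of the two stages is that the final image ships the binary without the compiler toolchain, which keeps it small.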
hw/07/02-feedback
- If you have any valuable feedback, please do provide it here.
- Points are only awarded for feedback which is actionable and can be used to improve the quality of the course.
- Any constructive criticism is appreciated (and won’t be weaponized).
(Total = 100+45 points)
Don’t forget to git push all your changes! Also, make sure that your VM still works by the deadline; otherwise we have no way of grading your work.