This lab focuses on containers – very lightweight virtual machines. At the end of the lab we will use this knowledge to set up a GitLab pipeline that runs our code – for example, the tests – on every commit and keeps the code in a good (green) state.
Preparation
Before starting with Podman, ensure you have an up-to-date copy of the examples
repository. We will be using the subdirectory 14/.
Podman is not available in the IMPAKT labs (actually, it is installed, but you
will not be able to execute anything). Feel free to use the shared machine
linux.ms.mff.cuni.cz. However, it is much more comfortable to use your own
machine, as you do not have to set up further SSH port forwards etc.
To check that your setup is okay, try the following command:
podman run --rm docker.io/library/alpine:latest cat /etc/os-release
If you see something like the following output, everything is ready. Otherwise, feel free to open an issue on the Forum and we will try to sort it out (do not forget to mention which distribution you are using).
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob df9b9388f04a done
Copying config 0ac33e5f5a done
Writing manifest to image destination
Storing signatures
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.15.4
PRETTY_NAME="Alpine Linux v3.15"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"
If you run podman on linux.ms.mff.cuni.cz, always remove unused images.
While the system has enough space for experimenting, the images can easily
fill up the whole disk. Use podman images and podman rmi IMAGE_ID to
remove them once you no longer need them (see below for further details).
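For example, a clean-up session may look as follows (the image ID below is only an illustration – use the IDs printed by podman images on your machine).
# List local images and note the IMAGE ID column
podman images
# Remove an image either by its ID ...
podman rmi 7ab27dbbfbdf
# ... or by its name and tag
podman rmi docker.io/library/alpine:latest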
Running the first container
The first execution will be a bit more complex to give you a taste of what is possible. We will explain the details in the following sections.
The following assumes you are inside the directory 14
in the examples
repository.
It will launch an Nginx web server.
podman run --rm --publish 8080:80/tcp -v ./web:/usr/share/nginx/html:ro docker.io/library/nginx:1.20.0
You will see output similar to the following.
Trying to pull docker.io/library/nginx:1.20.0...
Getting image source signatures
Copying blob 525e372d6dee done
Copying blob 69692152171a done
Copying blob b141b026b9ce done
Copying blob 8d70dc384fb3 done
Copying blob 965615a5cec8 done
Copying blob 6e60219fdb98 done
Copying config 7ab27dbbfb done
Writing manifest to image destination
Storing signatures
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2021/05/18 13:15:55 [notice] 1#1: using the "epoll" event method
2021/05/18 13:15:55 [notice] 1#1: nginx/1.20.0
2021/05/18 13:15:55 [notice] 1#1: built by gcc 8.3.0 (Debian 8.3.0-6)
2021/05/18 13:15:55 [notice] 1#1: OS: Linux 5.10.16-arch1-1
2021/05/18 13:15:55 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 524288:524288
2021/05/18 13:15:55 [notice] 1#1: start worker processes
2021/05/18 13:15:55 [notice] 1#1: start worker process 26
2021/05/18 13:15:55 [notice] 1#1: start worker process 27
2021/05/18 13:15:55 [notice] 1#1: start worker process 28
2021/05/18 13:15:55 [notice] 1#1: start worker process 29
Open http://localhost:8080/ in your browser. You should see a NSWI177 Test Page in the browser.
If you see 403 Forbidden instead, append ,Z to the -v option, so that the
command contains -v ./web:/usr/share/nginx/html:ro,Z. This is needed
(and is generally a good practice) when you are running on a machine with
SELinux enabled in enforcing mode (the default installation of Fedora, but not
the USB disks from us).
Terminate the execution by killing Podman with Ctrl-C.
Note that the running Nginx webserver was printing its log – i.e., the list of accessed pages – to stdout.
Now open the page web/index.html in your browser. Again, you shall see the
NSWI177 Test Page, but the URL will point to your local filesystem (i.e.,
file:///home/.../examples/14/web/index.html).
The above example illustrated three important features that are available with containers:
- The web server in the container does not need any configuration or system-wide installation.
- The container can listen on ports of the host system and forward network communication inside the container.
- The container can access host’s files and use them.
All very good features for development, testing as well as distribution of your software.
Pulling and inspecting the images
The first thing that needs to be done when starting a container is to get
its image. While Podman is able to pull the image as a part of the run
subcommand, it is sometimes useful to fetch it as a separate step.
The command podman images
prints a list of images that are present on your
system. The output may look like this.
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/library/nginx 1.20.0 7ab27dbbfbdf 6 days ago 137 MB
docker.io/library/fedora 34 8d788d646766 2 weeks ago 187 MB
...
The repository refers to the on-line repository we fetched the image from. The tag is basically a version string. The image ID is a unique identification of the image; it is generally derived from a cryptographic hash of the image contents. The remaining columns are self-descriptive.
When you execute podman pull IMAGE:TAG
, Podman will fetch the image
without starting any container. If you use latest
as a tag, the latest
available version will be fetched.
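For example, the following pre-fetches the Alpine image we used earlier, without starting any container:
podman pull docker.io/library/alpine:latest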
Pull docker.io/library/python:3-alpine
and check that it has appeared in
podman images
afterwards.
Shorter image names
If you paste the following content into
/etc/containers/registries.conf.d/unqualified.conf
, you will not need to
type docker.io/
in front of every image name. It is called an unqualified
search and it is tried first for every image name.
unqualified-search-registries = ["docker.io"]
Companies can run their own registries, and you may list multiple registries here if you wish them to be tried when a fully qualified name is not provided.
Image repository
If you wonder where the images are coming from, have a look at https://hub.docker.com/. Anyone can upload their images there for others to use.
Similarly to the Python Package Index, you may find
malicious images there. At least the containers run isolated, so
the chances of misbehaviour are somewhat limited (compared to pip install,
which you execute in the context of a normal user).
Images from the library
group are official images endorsed by Docker
itself and hence are relatively trustworthy.
Running containers
After the image is pulled, we can create a container from it.
We will start with an Alpine image because it is very small and thus very fast.
podman run --interactive --tty alpine:latest /bin/sh
If all went fine, you should see an interactive prompt / #
and inspecting
/etc/os-release
should show you the following text (version numbers may
differ):
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.13.5
PRETTY_NAME="Alpine Linux v3.13"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"
The run subcommand starts a container from the specified image. With
--interactive and --tty (which are often combined into a single -it) we
specify that we want to attach a terminal to the container, as we will use
it interactively. The last part of the command is the program to run.
Inside the container, we can execute any commands we wish. We are securely contained and the changes will not affect the host system.
Install curl
and check that you have functional network
access.
Solution.
Open a second terminal so that we can inspect how the container looks from the outside.
Inside the container, execute sleep 111 and in the other terminal (that is
running on the host) execute ps -ef --forest. You shall see lines like
the following:
student 1477313 1 0 16:29 ? 00:00:00 /usr/bin/conmon ...
student 1477316 1477313 0 16:29 pts/0 00:00:00 \_ /bin/sh
student 1477370 1477316 0 16:33 pts/0 00:00:00 \_ sleep 111
This confirms that the processes inside a container are visible from the outside.
Run ps -ef
inside a container (or look into /proc
there).
What do you see? Is there something surprising?
Solution.
Also execute podman ps. It prints the list of running containers.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
643b5e7cea06 docker.io/library/alpine:latest /bin/sh 4 minutes ago Up 4 minutes ago practical_bohr
Container ID is again a unique identification, the other columns are self-descriptive. Note that since we have not specified a name, Podman assigned a random one.
If you terminate the session inside the container (exit
or Ctrl-D
), you
will return to the host terminal.
Execute podman ps
again. It is empty: the container is not running. If
you add --all
, you will see that the STATUS
has changed.
Exited (130) 1 second ago
Note that if we executed podman run ... again, we would start a new
container. Try it now.
We will describe the container life cycle later on; if you wish to remove
the container now, execute podman rm NAME. As NAME, you can use either
the randomly assigned name or the CONTAINER ID.
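For instance, for the container from the listing above you could run one of the following (substitute the name or ID that podman ps --all shows on your machine):
podman rm practical_bohr
podman rm 643b5e7cea06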
Single shot runs
You can pass any command to podman run to be executed. If you know that
you will remove the container immediately afterwards, you can add
--rm to tell Podman to remove it automatically once it finishes execution.
podman run --rm alpine:latest cat /etc/os-release
If you want to pass a more complicated command, it is better to do so via sh -c.
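For example, the following runs two chained commands inside a single container (the commands themselves are arbitrary):
podman run --rm alpine:latest /bin/sh -c 'uname -a && id'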
Change the above command to first cd
to etc
and then call cat os-release
.
Why does the following not work: podman run --rm alpine:latest cd /etc && cat os-release?
Solution.
Managing container life cycle
Starting a container
After we terminated the interactive session, the container exited. We
can call podman start CONTAINER to start it again.
Each container has a so-called entry point that is executed when the container is started. For a service-style container (e.g., with a web server), the service would be started again.
For our Alpine example, the entry point is /bin/sh
(shell), so nothing
interesting will happen.
Check that the container is running with podman ps
.
Attaching to a running container
When the container is running, we can attach to it. podman attach
basically connects the stdout of the entry point to your terminal. With our
Alpine container, we can then run commands inside the container again.
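For example (replace the name with the one of your container):
podman attach practical_bohr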
We can also call podman exec -it CONTAINER CMD, which connects to the
running container in a new terminal (like a new tab). For us, running the
following would work (replace the name with your container name).
podman exec -it practical_bohr /bin/sh
Run ps -ef inside the container again.
Which processes do you see?
Solution.
Terminating the exec
-ed shell returns us back to the host. Terminating
the attach
-ed shell terminates the whole container.
Containers in background (with names)
For service-style containers (e.g., nginx, which provides the web server), we
often want to run them in daemon mode – in the background.
That is possible with the --detach option of the run command.
We will also give the container the name webserver so that we can refer to it easily.
podman run --detach --name webserver --publish 8080:80/tcp -v ./web:/usr/share/nginx/html:ro nginx:1.20.0
We will explain the -v
and --publish
later on.
This command starts the container and terminates. The webserver is running in the background. Check that you can again access http://localhost:8080/ in your browser.
You can stop such a container with podman stop webserver. This is quite
similar to systemctl stop ... – not a coincidence.
Check that after stopping the webserver, http://localhost:8080/ no longer works.
Starting the container again is possible with podman start webserver
.
start
and stop
and stdout
Note that both start and stop print the name of the container that was
started (stopped) on stdout. That is useful when they are executed from scripts;
for interactive use, we can simply ignore the output.
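A minimal sketch of using this from a script (the variable name and the message are, of course, arbitrary):
# Stop the container and remember which one was actually stopped
stopped_name="$(podman stop webserver)"
echo "Stopped container: $stopped_name"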
Clean-up actions
When we are done with a container, we can remove it (but first, we need to
stop
it).
Executing the following command would remove webserver
container
completely.
podman rm webserver
You can also remove pull-ed images using the rmi subcommand.
For example, to remove the nginx:1.20.0
, you can execute the following
command.
podman rmi nginx:1.20.0
Note that Podman will refuse to remove an image if it is used by an existing container. Recall that the images are stacked and hence Podman cannot remove the underlying layers.
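Thus, a complete clean-up of the webserver example above could look like this:
podman stop webserver
podman rm webserver
podman rmi nginx:1.20.0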
Limiting the isolation
By default, a container is an isolated world. If you want to access it from
the outside, you have to exec into it (for terminal-style work) or publish
its services to the outside.
Port forwarding (a.k.a. port publishing)
For server-style containers (e.g., the Nginx one we used above), that means
exposing some of their ports to the host computer. That is done with the
--publish argument, where you specify which port on the host (e.g., 8080)
shall be forwarded into the container: to which port and over which protocol
(e.g., 80 and tcp).
Therefore, the argument --publish 8080:80/tcp
means that we expect that
the container itself offers a service on its port 80
and we want to make
this (container’s) port available as 8080
. It is similar to SSH port
forwarding with -L
.
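For example, the same Nginx container can be published on a different host port (9000 here is an arbitrary choice):
podman run --rm --publish 9000:80/tcp -v ./web:/usr/share/nginx/html:ro nginx:1.20.0
The page is then reachable at http://localhost:9000/ while the container itself still listens on its port 80.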
We can start the nginx
container without --publish
, but it does not
make much sense. Why?
Solution.
Volume mounts
Another way to break the container isolation is to bind a certain
directory into the container. There are several options for doing that; we
will show the --volume (or -v) parameter.
It takes three (again colon-separated) arguments: the source directory on the host, the mount point inside the container, and options.
Our example ./web:/usr/share/nginx/html:ro thus specifies that the local
(host) directory web shall be visible under /usr/share/nginx/html inside
the container in read-only mode. It is very similar to the normal mounts you
already know.
If you specify rw
instead of ro
, you can modify the files inside the
container.
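For example, the following (hypothetical) invocation mounts the local directory data read-write, so files created under /data inside the container appear in ./data on the host (add ,Z on SELinux systems, as discussed above):
podman run --rm -it -v ./data:/data:rw alpine:latest /bin/sh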
Volume mounting is useful for any service-style container. A typical example is a database server. You start the container and give it a mounted volume. It will store the actual database (the data files) into this volume (directory). Thus, when the container terminates, your data persist, as they were stored outside of the container.
This has a huge advantage for testing service updates. You stop the container, make a backup of the data directory and start a new container (with a newer version) on the top of the same data directory. If everything works fine, you are good to go. Otherwise, you can stop the new container, restore from the backup and return to the old version.
Very simple and effective.
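As a sketch, a PostgreSQL container with persistent data could be started like this (create the ./pgdata directory first; the password is illustrative only, while the image name, environment variable, and data directory are those used by the official postgres image):
podman run --detach --name db \
    -v ./pgdata:/var/lib/postgresql/data:Z \
    -e POSTGRES_PASSWORD=devel-only-password \
    docker.io/library/postgres:14
When the container is stopped or removed, the data stay in ./pgdata on the host.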
Exercises
Apache web server
Start the Apache web server on top of the 14/web directory.
Use this httpd image.
Verify that you are really using the Apache web server.
Solution.
Python applications
Install the timestamp2iso command system-wide.
We recommend using python:3.9-alpine.
Note that you will not need to set up any virtual environment in this case: the whole machine (container) is yours. You can install things system-wide. Hint. Solution.
GitLab CI
We will now see how to actually configure CI on your GitLab repositories.
In this course we will focus on the simplest configuration, where we want to execute tests after each commit. GitLab can be configured for more complex tasks, where software can even be deployed to a virtual cloud machine, but that is unfortunately out of scope.
If you are interested in this topic, GitLab has an extensive documentation for continuous integration and continuous deployment (CI/CD). The documentation is often densely packed with a lot of information, but it is a great source of knowledge not only about GitLab, but about many software engineering principles in general.
.gitlab-ci.yml
The configuration of the GitLab CI is stored in the file .gitlab-ci.yml,
which has to be placed in the root directory of the project.
Your submission repository contains a slightly more complex setup where we fetch the actual configuration online, so that only active tasks and quizzes are evaluated (without you needing to keep the repository up-to-date).
But the timestamp2iso project now contains a very simple GitLab CI configuration.
base-tests:
  image: python:3.9-alpine
  script:
    - apk add bats
    - pip install .
    - ./tests/base.bats
It specifies a pipeline job base-tests (you will see this name in the web UI) that runs in the python:3.9-alpine image and executes three commands. The first one installs a dependency, the second one installs the actual package (the project), and the last one executes simple BATS tests.
Note that GitLab will mount the Git repository into the container first and
then execute the commands inside the clone. The commands are executed with
set -e
: the first failing command terminates the whole pipeline.
Emulate the run locally. Hint. Solution.
Note that the command you created for running the script locally on top of the given image is virtually identical to the one executed by GitLab. GitLab does some extra caching and other performance-related tweaks, but conceptually, there is nothing more. And your code is tested in a reproducible way in a clean container (that is, in a sense, indistinguishable from a full virtual machine).
Exercises
Add your own pipeline to GitLab that checks that you never use
/usr/bin/python in a shebang.
Hint.
Solution.
Other bits
Notice how easy it is to use a GitLab pipeline. You find the right image, specify your script, and GitLab takes care of the rest.
From now on, every project you create on GitLab should have a pipeline that runs the tests (this includes Shellcheck, Pylint etc.). Set it up NOW for your assignments in other courses. Set it up for your Individual Software Project (NPRG045) next year. Use the chance to have your code tested regularly. It will save you time in the long run.
If you are unsure about which image to choose, official images are a good start. The script can have several steps where you install missing dependencies before running your program.
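For example, a job that needs Pylint could install it as its first step (the job name and the checked path are hypothetical; adapt them to your project):
pylint:
  image: python:3.9-alpine
  script:
    - pip install pylint
    - pylint timestamp2iso/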
Recall that you do not need to create a virtual environment: the whole
machine is yours (and will be removed afterwards), so you can install
things globally. Recall the example above where we executed pip install
without starting a virtual environment.
There can be multiple jobs defined that are run in parallel (actually, there can be quite complex dependencies between them, but in the following example, all jobs are started at once).
The example below shows a fragment of .gitlab-ci.yml
that tests the
project on multiple Python versions.
# Default image if no other is specified
image: python:3.10

stages:
  - test

# Commands executed before each "script" section (for any job)
before_script:
  # To have a quick check that the version is correct
  - python --version
  # Install the project
  - python -m pip install ...

# Run unit tests under different versions
unittests3.7:
  stage: test
  image: "python:3.7"
  script:
    - pytest --log-level debug tests/

unittests3.8:
  stage: test
  image: "python:3.8"
  script:
    - pytest --log-level debug tests/

unittests3.9:
  stage: test
  image: "python:3.9"
  script:
    - pytest --log-level debug tests/

unittests3.10:
  stage: test
  image: "python:3.10"
  script:
    - pytest --log-level debug tests/
Graded tasks (deadline: May 29)
14/shellcheck.sh (+ .gitlab-ci.yml) (60 points)
Write a script that runs ShellCheck over all scripts in your repository.
Modify your .gitlab-ci.yml so that it runs this script on every commit (push).
The pipeline must fail if any script contains any ShellCheck issue.
Name the pipeline shellcheck so that we can find it easily.
You can take inspiration from (or reuse parts of) the assert_is_shellchecked function from our tests.
Also consider reusing parts of the shebang-testing example above.
UPDATE: you can safely add your pipeline definition at the end of the existing .gitlab-ci.yml (so that the current pipelines remain active).
You will have to add stage: tests to the pipeline definition (otherwise you may run into the error shellcheck job: chosen stage does not exist; available stages are .pre, tests, .post).
See the definition of the unittests3.10 pipeline above for a concrete example.
14/command.txt (15 points)
The image registry.gitlab.com/mffd3s/nswi177/labs-2022-command:latest contains the command nswi177-task-command.
Run this command with your GitLab login and copy its output into 14/command.txt.
14/volume.txt (25 points)
The image registry.gitlab.com/mffd3s/nswi177/labs-2022-volume:latest contains the command nswi177-task-volume.
Mount your submission repository as /srv/nswi177/ inside the container and run this command in it.
If everything is okay, the command prints two hexadecimal strings.
Copy them into 14/volume.txt.
Your repository must be cloned via SSH.
Learning outcomes
Conceptual knowledge
Conceptual knowledge means understanding the meaning and context of a given topic and being able to place the topic in a broader picture. So, you are able to …
- explain what a container is (compare it with a virtual machine and with a process)
- explain where the isolation offered by containers is useful
- explain the container life cycle
- explain the principles of continuous integration (and the reasons why it exists)
- explain why further sandboxing (e.g., virtualenv) is not needed inside a container
Practical skills
Practical skills usually mean using the given programs to solve various tasks. So, you can …
- run an interactive container in Podman
- run a Podman container with a service
- expose container ports
- mount a volume inside a container
- remove unused images and containers
- set up a GitLab CI configuration that builds and tests a Python program