
There are several topics in this lab: we will look at asynchronous communication with processes (signals) and at several interesting networking utilities. We will also briefly touch on repairing corrupted disk drives.

But the main topic is how to set up continuous integration in GitLab so that you can keep your software in a healthy state with as little effort as possible.

There is no running example and the topics can be read and tried in any order.

Preflight checklist

  • You understand how automated (unit) tests are designed and used.
  • You know what TCP ports are.
  • You know what a disk image is.
  • You remember how the kill utility was used.
  • You know what Podman/Docker containers and images are.

GitLab CI

If you have never heard the term continuous integration (CI), here it is in a nutshell.

About continuous integration

To ensure that the software you build is in a healthy state, you should run tests on it often and fix broken things as soon as possible (because the cost of fixing a bug rises dramatically with each day it remains undiscovered).

The answer is that the developer should run the tests on each commit. Since that is difficult to enforce, it is better to do it automatically. CI in its simplest form refers to a setup where automated tests (e.g., BATS-based or Python-based) are executed automatically after each push, e.g., after pushing changes to any branch on GitLab.

But CI can do much more: if the tests are passing, the pipeline of jobs can package the software and publish it as an artifact (e.g., as an installer). Or it can trigger a job to deploy to a production environment and make it available to the customers. And so on.
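
Jumping slightly ahead: in GitLab's .gitlab-ci.yml format (introduced below), publishing the build output as a downloadable artifact might look like the following sketch (the out/ directory is just an illustrative path).

build:
  script:
    - make
  artifacts:
    paths:
      - out/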

This is often called CI/CD, meaning continuous integration and continuous delivery (or deployment).

Setting up CI on GitLab

In this text, you will see how to set up GitLab CI to your needs.

The important thing to know is that GitLab CI can run on top of Podman containers. Hence, to set up a GitLab pipeline, you choose a Podman image and the commands which should be executed inside the container. GitLab will start the container for you and run your commands in it.

Depending on the outcome of the whole script (i.e., its exit code), it will mark the pipeline as passing or failing.

In this course we will focus on the simplest configuration where we want to execute tests after each commit. GitLab can be configured for more complex tasks where software can even be deployed to a virtual cloud machine, but that is unfortunately out of scope here.

If you are interested in this topic, GitLab has extensive documentation. The documentation is often densely packed with a lot of information, but it is a great source of knowledge not only about GitLab, but about many software engineering principles in general.

.gitlab-ci.yml

The configuration of GitLab CI is stored in the file .gitlab-ci.yml, which has to be placed in the root directory of the project.

We expect that you have your own fork of the web repository and that you have extended the original Makefile.

We will now setup a CI job that only builds the web. It will be the most basic CI one can imagine. But at least it will ensure that the web is always in a buildable state.

However, to speed things up, we will remove the generation of the PDF from our Makefile, as the OpenOffice installation requires downloading about 400 MB, which is quite a lot to do for each commit.

image: fedora:37

build:
  script:
    - dnf install -y make pandoc
    - make

It specifies a pipeline job named build (you will see this name in the web UI) that is executed using the fedora:37 image and runs two commands. The first one installs the dependencies and the second one runs make.

We are not using Alpine because installing Pandoc on Alpine is a bit more complicated. It requires that we either install it via the package management tools of the Haskell programming language or fetch a prebuilt static binary.

Add the .gitlab-ci.yml to your Git repository (i.e., your fork), commit, and push it.

If you open the project page in GitLab, you should see the pipeline icon next to it and it should eventually turn green.

The log of the job would probably look like this.

Running with gitlab-runner 15.11.0 (436955cb)
  on gitlab.mff docker Mtt-jvRo, system ID: s_7f0691b32461
Preparing the "docker" executor 00:03
Using Docker executor with image fedora:37 ...
Pulling docker image fedora:37 ...
Using docker image sha256:34354ac2c458e89615b558a15cefe1441dd6cb0fc92401e3a39a7b7012519123 for fedora:37 with digest fedora@sha256:e3012fe03ccee2d37a7940c4c105fb240cbb566bf228c609d9b510c9582061e0 ...
Preparing environment 00:00
Running on runner-mtt-jvro-project-11023-concurrent-0 via gitlab-runner...
Getting source from Git repository 00:01
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/horkv6am/nswi-177-web/.git/
Checking out 58653aa3 as detached HEAD (ref is master)...
Removing out/index.html
Removing out/main.css
Removing out/rules.html
Removing out/score.html
Removing tmp/
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:33
Using docker image sha256:34354ac2c458e89615b558a15cefe1441dd6cb0fc92401e3a39a7b7012519123 for fedora:37 with digest fedora@sha256:e3012fe03ccee2d37a7940c4c105fb240cbb566bf228c609d9b510c9582061e0 ...
$ dnf install -y make pandoc
Fedora 37 - x86_64                               43 MB/s |  82 MB     00:01
Fedora 37 openh264 (From Cisco) - x86_64        4.3 kB/s | 2.5 kB     00:00
Fedora Modular 37 - x86_64                       17 MB/s | 3.8 MB     00:00
Fedora 37 - x86_64 - Updates                     24 MB/s |  29 MB     00:01
Fedora Modular 37 - x86_64 - Updates            4.8 MB/s | 2.9 MB     00:00
Dependencies resolved.
================================================================================
 Package              Architecture  Version                 Repository     Size
================================================================================
Installing:
 make                 x86_64        1:4.3-11.fc37           fedora        542 k
 pandoc               x86_64        2.14.0.3-18.fc37        fedora         21 M
Installing dependencies:
 gc                   x86_64        8.0.6-4.fc37            fedora        103 k
 guile22              x86_64        2.2.7-6.fc37            fedora        6.5 M
 libtool-ltdl         x86_64        2.4.7-2.fc37            fedora         37 k
 pandoc-common        noarch        2.14.0.3-18.fc37        fedora        472 k
Transaction Summary
================================================================================
Install  6 Packages
Total download size: 29 M
Installed size: 204 M
Downloading Packages:
(1/6): libtool-ltdl-2.4.7-2.fc37.x86_64.rpm     846 kB/s |  37 kB     00:00
(2/6): make-4.3-11.fc37.x86_64.rpm              9.4 MB/s | 542 kB     00:00
(3/6): gc-8.0.6-4.fc37.x86_64.rpm               595 kB/s | 103 kB     00:00
(4/6): pandoc-common-2.14.0.3-18.fc37.noarch.rp 8.4 MB/s | 472 kB     00:00
(5/6): guile22-2.2.7-6.fc37.x86_64.rpm           18 MB/s | 6.5 MB     00:00
(6/6): pandoc-2.14.0.3-18.fc37.x86_64.rpm        56 MB/s |  21 MB     00:00
--------------------------------------------------------------------------------
Total                                            51 MB/s |  29 MB     00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1
  Installing       : pandoc-common-2.14.0.3-18.fc37.noarch                  1/6
  Installing       : libtool-ltdl-2.4.7-2.fc37.x86_64                       2/6
  Installing       : gc-8.0.6-4.fc37.x86_64                                 3/6
  Installing       : guile22-2.2.7-6.fc37.x86_64                            4/6
  Installing       : make-1:4.3-11.fc37.x86_64                              5/6
  Installing       : pandoc-2.14.0.3-18.fc37.x86_64                         6/6
  Running scriptlet: pandoc-2.14.0.3-18.fc37.x86_64                         6/6
  Verifying        : gc-8.0.6-4.fc37.x86_64                                 1/6
  Verifying        : guile22-2.2.7-6.fc37.x86_64                            2/6
  Verifying        : libtool-ltdl-2.4.7-2.fc37.x86_64                       3/6
  Verifying        : make-1:4.3-11.fc37.x86_64                              4/6
  Verifying        : pandoc-2.14.0.3-18.fc37.x86_64                         5/6
  Verifying        : pandoc-common-2.14.0.3-18.fc37.noarch                  6/6
Installed:
  gc-8.0.6-4.fc37.x86_64               guile22-2.2.7-6.fc37.x86_64
  libtool-ltdl-2.4.7-2.fc37.x86_64     make-1:4.3-11.fc37.x86_64
  pandoc-2.14.0.3-18.fc37.x86_64       pandoc-common-2.14.0.3-18.fc37.noarch
Complete!
$ make
pandoc --template template.html -o out/index.html index.md
pandoc --template template.html -o out/rules.html rules.md
./table.py <score.csv | pandoc --template template.html --metadata title="Score" - >out/score.html
cp main.css out/
Cleaning up project directory and file based variables 00:01
Job succeeded

Note that GitLab will mount the Git repository into the container first and then execute the commands inside the clone. The commands are executed with set -e: the first failing command terminates the whole pipeline.

Try to emulate the above run locally. Hint. Solution.

Other bits

Notice how easy it is to use the GitLab pipeline. You find the right image, specify your script, and GitLab takes care of the rest.

From now on, every project you create on GitLab should have a pipeline that runs the tests (this includes Shellcheck, Pylint, etc.). Set it up NOW for your assignments in other courses. Set it up for your Individual Software Project (NPRG045) next year. Use the chance to have your code tested regularly. It will save you time in the long run.

If you are unsure about which image to choose, official images are a good start. The script can have several steps where you install missing dependencies before running your program.

Recall that you do not need to create a virtual environment: the whole machine is yours (and will be discarded afterwards), so you can install things globally.

There can be multiple jobs defined that are run in parallel (actually, there can be quite complex dependencies between them, but in the following example, all jobs are started at once).

The example below shows a fragment of .gitlab-ci.yml that tests the project on multiple Python versions.

# Default image if no other is specified
image: python:3.10

stages:
  - test

# Commands executed before each "script" section (for any job)
before_script:
    # To have a quick check that the version is correct
    - python3 --version
    # Install the project
    - python3 -m pip install ...

# Run unit tests under different versions
unittests3.7:
  stage: test
  image: "python:3.7"
  script:
    - pytest --log-level debug tests/

unittests3.8:
  stage: test
  image: "python:3.8"
  script:
    - pytest --log-level debug tests/

unittests3.9:
  stage: test
  image: "python:3.9"
  script:
    - pytest --log-level debug tests/

unittests3.10:
  stage: test
  image: "python:3.10"
  script:
    - pytest --log-level debug tests/

Signals

Linux systems use the concept of signals to communicate asynchronously with a running program (process). The word asynchronously means that the signal can be sent (and delivered) to the process regardless of its state. Compare this with communication via standard input (for example), where the program controls when it will read from it (by calling an appropriate I/O read function).

However, signals do not provide a very rich communication channel: the only information available (apart from the fact that the signal was sent) is the signal number. Most signal numbers are defined by the kernel, which also handles some signals by itself. Otherwise, signals can be received by the application and acted upon. If the application does not handle the signal, it is processed in the default way. For some signals, the default is terminating the application; other signals are ignored by default.

This is actually reflected in the name of the utility used to send signals: kill. We have already seen it earlier and used it to terminate processes.

By default, the kill utility sends signal 15 (also called TERM or SIGTERM) that instructs the application to terminate. An application may decide to catch this signal, flush its data to the disk etc., and then terminate. But it can do virtually anything and it may even ignore the signal completely. Apart from TERM, we can instruct kill to send the KILL signal (number 9) which is handled by the kernel itself. It immediately and forcefully terminates the application (even if the application decides to mask or handle the signal).

Many other signals are sent to the process in reaction to a specific event. For example, the signal PIPE is sent when a process tries to write to a pipe whose reading end was already closed – the “Broken pipe” message you already saw is printed by the shell if the command was terminated by this signal. Terminating a program by pressing Ctrl-C in the terminal actually sends the INT (interrupt) signal to it. If you are curious about the other signals, see signal(7).
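
You can also list the signal names and numbers known to your system directly from the shell:

kill -l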

For example, when the system is shutting down, it sends TERM to all its processes. This gives them a chance to terminate cleanly. Processes which are still alive after some time are killed forcefully with KILL.

Use of kill and pkill

Recall from lab 5 how we can use kill to terminate processes.

As a quick recap, open two terminals now.

Run sleep in the first one, look up its PID in the second one, and kill it with TERM (the default signal); then redo the exercise with KILL (-9).

Solution.

Similarly, you can use pkill to kill processes by name (but be careful as with great power comes great responsibility). Consult the manual pages for more details.
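
As a small sketch, the following sends TERM to all sleep processes owned by the current user (the -u filter limits the blast radius a bit):

pkill -TERM -u "$USER" sleep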

There is also the killall command that behaves similarly. On some Unix systems (e.g., Solaris), this command has completely different semantics and is used to shut down the whole machine.

Reacting to signals in Python

Your program would typically react to TERM (the default “soft” termination), INT (Ctrl-C from the keyboard) and perhaps to USR1 or USR2 (the only user-defined signals). System daemons often react to HUP by reloading their configuration.

The following Python program reacts to Ctrl-C by terminating. Store it as show_signals.py and hit Ctrl-C while running ./show_signals.py.

Avoid a common trap when trying the code below: do not store it in a file named signal.py, as the name would collide with the standard library module.
#!/usr/bin/env python3

import sys
import time
import signal

# Actual signal callback
def on_signal(signal_number, frame_info):
    print("")
    print("Caught signal {} ({})".format(signal_number, frame_info))
    sys.exit()

def main():
    # Setting signal callback
    signal.signal(signal.SIGINT, on_signal)
    while True:
        time.sleep(0.5)
        print("Hit Ctrl-C...")

if __name__ == '__main__':
    main()

Exercise

Write a program that tries to print all prime numbers. When terminating, it stores the highest number found so far and on the next invocation, it continues from there.

Solution.

Reacting to signals in shell

Reacting to signals in shell is done through the trap command.

Note that a typical action for a signal handler in a shell script is clean-up of temporary files.

#!/bin/bash

set -ueo pipefail

on_interrupt() {
    echo "Interrupted, terminating ..." >&2
    exit 17
}

on_exit() {
    echo "Cleaning up..." >&2
    rm -f "$MY_TEMP"
}

MY_TEMP="$( mktemp )"

trap on_interrupt INT TERM
trap on_exit EXIT

echo "Running with PID $$"

counter=1
while [ "$counter" -lt 10 ]; do
    date "+%Y-%m-%d %H:%M:%S | Waiting for Ctrl-C (loop $counter) ..."
    echo "$counter" >"$MY_TEMP"
    sleep 1
    counter=$(( counter + 1 ))
done

The command trap receives as the first argument the command to execute on the signal. Other arguments list the signals to react to. Note that a special signal EXIT means normal script termination. Hence, we do not need to call on_exit after the loop terminates.

We use exit 17 to report termination through the Ctrl-C handler (the value is arbitrary by itself).

Feel free to check the return value with echo $? after the command terminates. The special variable $? contains the exit code of the last command.

If your shell script starts with set -e, you will rarely need $? as any non-zero value will cause script termination.

However, the following construct prevents the termination and allows you to branch your code based on the exit value if needed.

set -e

...
# Prevent termination through set -e
rc=0
some_command_with_interesting_exit_code || rc=$?
if [ $rc -eq 0 ]; then
    ...
elif [ $rc -eq 1 ]; then
    ...
else
    ...
fi

Using - (dash) instead of the handler resets the respective signal handler to its default behaviour.
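
For example, the following resets the handlers for INT and TERM back to their defaults:

trap - INT TERM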

Your shell scripts shall always include a signal handler for clean-up of temporary files.

Note the use of $$ which prints the current PID.

Run the above script, note its PID and run the following in a new terminal.

kill THE_PID_PRINTED_BY_THE_ABOVE_SCRIPT

The script was terminated and the clean-up routine was called. Compare with the situation when you comment out the trap command.

Run the script again but pass -9 to kill to specify that you want to send signal nine (i.e., KILL).

What happened? Answer.

While signals are a rudimentary mechanism that carries nothing but the event itself (no additional data), they are the primary way of process control in Linux.

If you need a richer communication channel, you can use D-Bus instead.

Reasonable reaction to basic signals is a must for server-style applications (e.g., a web server should react to TERM by completing outstanding requests without accepting new connections, and terminating afterwards). In shell scripts, it is considered good manners to always clean up temporary files.

Deficiencies in signal design and implementation

Signals are a rudimentary mechanism for interprocess communication on Unix systems. Unfortunately, their design has several flaws that complicate their safe usage.

We will not dive into details but you should bear in mind that signal handling can be tricky in situations where you cannot afford to lose any signal or when signals can come quickly one after another. And there is a whole can of worms when using signals in multithreaded programs.

On the other hand, for simple shell scripts where we want to clean up on forceful termination, the pattern we have shown above is sufficient. It guards our script when the user hits Ctrl-C, e.g., because they realized that it is working on the wrong data or something similar.

But note that it contains a bug for the case when the user hits Ctrl-C very early during script execution.

MY_TEMP="$( mktemp )"
# User hits Ctrl-C here
trap on_interrupt INT TERM
trap on_exit EXIT

The temporary file was already created but the handler was not yet registered and thus the file will not be removed. But changing the order complicates the signal handler as we need to test that $MY_TEMP was already initialized.

But the fact that signals can be tricky does not mean that we should abandon the basic means of ensuring that our scripts clean up after themselves even when they are forcefully terminated.

In other programming languages the clean-up is somewhat simpler, because it is possible to create a temporary file that is automatically removed once the process terminates.

It relies on a neat trick: we can open (create) a file and immediately remove it. However, as long as we keep the file descriptor (i.e., the object returned by Python's open) open, the system keeps the file contents intact. But the file name is already gone, and closing the descriptor removes the contents completely.
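
In Python, this is what the standard tempfile.TemporaryFile does for you; a minimal sketch:

#!/usr/bin/env python3

import tempfile

# On POSIX systems the file has no name at all: the directory entry is
# removed immediately, but the content stays reachable through the handle.
with tempfile.TemporaryFile(mode="w+") as tmp:
    tmp.write("intermediate results\n")
    tmp.seek(0)
    print(tmp.read(), end="")
# Closing the handle (or terminating the process) discards the content.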

Because shell is based on running multiple processes, the above trick does not work for shell scripts.

Check your understanding

Select all correct statements about signals and processes. You need to have JavaScript enabled for the quiz to work.

Network Manager

There are several ways to configure networking in Linux. Server admins often prefer to use the bare ip command; on desktops, most distributions today use NetworkManager, so we will show it here too. Note that the ArchLinux Wiki page about NetworkManager contains a lot of information, too.

NetworkManager has a GUI (you probably used its applet without knowing about it), a TUI (which can be run with nmtui), and finally a CLI.

We will (for obvious reasons) focus on the command-line interface here. Without parameters, nmcli will display information about current connections:

wlp58s0: connected to TP-Link_1CE4
        "Intel 8265 / 8275"
        wifi (iwlwifi), 44:03:2C:7F:0F:76, hw, mtu 1500
        ip4 default
        inet4 192.168.0.105/24
        route4 0.0.0.0/0
        route4 192.168.0.0/24
        inet6 fe80::9ba5:fc4b:96e1:f281/64
        route6 fe80::/64
        route6 ff00::/8

p2p-dev-wlp58s0: disconnected
        "p2p-dev-wlp58s0"
        wifi-p2p, hw

enp0s31f6: unavailable
        "Intel Ethernet"
        ethernet (e1000e), 54:E1:AD:9F:DB:36, hw, mtu 1500

vboxnet0: unmanaged
        "vboxnet0"
        ethernet (vboxnet), 0A:00:27:00:00:00, hw, mtu 1500

lo: unmanaged
        "lo"
        loopback (unknown), 00:00:00:00:00:00, sw, mtu 65536

DNS configuration:
        servers: 192.168.0.1 8.8.8.8
        interface: wlp58s0

...

Compare the above with the output of ip addr. Notice that NetworkManager explicitly states the routes by default and also informs you that some interfaces are not controlled by it (here, lo or vboxnet0).

Changing IP configuration

While most networks offer DHCP (at least those you will connect to with your desktop), sometimes you need to set up IP addresses manually.

A typical case is when you need to connect two machines temporarily, e.g., to transfer a large file over a wired connection.

The only thing you need to decide on is which network you will create. Do not use the same one as your home router uses; our favourite selection is 192.168.177.0/24.

Assuming the interface name from above, the following command adds a connection named wired-static-temp on enp0s31f6:

sudo nmcli connection add \
    con-name wired-static-temp \
    ifname enp0s31f6 \
    type ethernet \
    ip4 192.168.177.201/24

It is often necessary to bring this connection up with the following command:

sudo nmcli connection up wired-static-temp

Follow the same procedure on the second host, but use a different address (e.g., .202). You should be able to ping the other machine now:

ping 192.168.177.201

To demonstrate how ping behaves when the connection goes down, you can try unplugging the wire, or doing the same in software:

sudo nmcli connection down wired-static-temp

Other networking utilities

We will not substitute for the networking courses here, but we will mention some basic commands that can help you debug common network-related problems.

You already know ping: the basic tool to determine whether a machine with a given IP address is up (and responding to network traffic).

ping is the basic tool if you suddenly lose connection to some server. Ping the destination server and also some other well-known server. If the packets are going through, you know that the problem is on a different layer. If only the packets to the well-known server get through, the problem is probably with the server in question. If both fail, your network is probably down.
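
For example, the following sends four probes to each destination and then terminates (the first hostname is just a placeholder for the server you cannot reach):

ping -c 4 the-unreachable-server.example.com
ping -c 4 1.1.1.1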

But there are more advanced tools available too.

traceroute (a.k.a. the path is the goal)

Sometimes, it can be handy to know the precise path the packets travel. For this kind of task, we can use traceroute.

Similarly to ping, we just need to specify the destination.

traceroute 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  _gateway (10.16.2.1)  2.043 ms  1.975 ms  1.948 ms
 2  10.17.0.1 (10.17.0.1)  1.956 ms  1.971 ms  1.961 ms
 3  gw.sh.cvut.cz (147.32.30.1)  1.947 ms  1.973 ms  1.977 ms
 4  r1sh-sush.net.cvut.cz (147.32.252.238)  2.087 ms  2.262 ms  2.527 ms
 5  r1kn-konv.net.cvut.cz (147.32.252.65)  1.856 ms  1.849 ms  1.847 ms
 6  kn-de.net.cvut.cz (147.32.252.57)  1.840 ms  1.029 ms  0.983 ms
 7  195.113.144.172 (195.113.144.172)  1.894 ms  1.900 ms  1.885 ms
 8  195.113.235.99 (195.113.235.99)  4.793 ms  4.748 ms  4.723 ms
 9  nix4.cloudflare.com (91.210.16.171)  2.264 ms  2.807 ms  2.814 ms
10  one.one.one.one (1.1.1.1)  1.883 ms  1.800 ms  1.834 ms

The first column corresponds to the hop count. The second column shows the address of that hop, and after that you see three space-separated times in milliseconds. The traceroute command sends three packets to each hop, and each of the times is the time it took the packet to reach that hop. So from the above output we can see that the packets visited 10 hops on their way between the local computer and the destination.

This tool is especially useful when you have network troubles and you are not sure where the issue is.

traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  10.21.20.2 (10.21.20.2)  0.798 ms  0.588 ms  0.699 ms
 2  10.21.5.1 (10.21.5.1)  0.593 ms  0.506 ms  0.611 ms
 3  192.168.88.1 (192.168.88.1)  0.742 ms  0.637 ms  0.534 ms
 4  10.180.2.113 (10.180.2.113)  1.696 ms  4.106 ms  1.483 ms
 5  46.29.224.17 (46.29.224.17)  14.343 ms  13.749 ms  13.806 ms
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

From this log we can see that the last hop that responded was 46.29.224.17, so we can focus our attention on this network element.

nmap (a.k.a. let me scan your network)

nmap is a very powerful tool. Unfortunately, even an innocent – but repeated – use could easily be misinterpreted as a malicious scan for vulnerabilities. Use this tool with care and experiment in your home network. Reckless scanning of the university network can actually get your machine banned from connecting at all for quite some time.

nmap is the basic network scanning tool. If you want to know which network services are running on a machine, you can try connecting to all of its ports to check which ones are open. Nmap does that and much more.

Try first scanning your loopback device for internal services running on your machine:

nmap localhost

The result could look like this (the machine has a print server and a proxy HTTP server):

Starting Nmap 7.91 ( https://nmap.org ) at 2021-05-04 16:38 CEST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00011s latency).
Other addresses for localhost (not scanned): ::1
rDNS record for 127.0.0.1: localhost.localdomain
Not shown: 998 closed ports
PORT     STATE SERVICE
631/tcp  open  ipp
3128/tcp open  squid-http

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds

If you want to see more information, you can try adding the -A switch.

nmap -A localhost

And if you run it under root (i.e., sudo nmap -A localhost), nmap can try to detect the remote operating system, too.

By default, nmap scans only ports frequently used by network services. You can specify a different range with the -p option:

nmap -p1-65535 localhost

This instructs nmap to scan all TCP ports (-p1-65535) on localhost.

Again: do not scan all TCP ports on machines in the university network!

As an exercise, which web server is used on our GitLab? And which one is on our University website? Solution.

nc (netcat)

Let us examine how to create network connections from the shell. This is essential for debugging network services, but it is also useful for using the network in scripts.

The Swiss-army knife of network scripting is called netcat or nc.

Unfortunately, there exist multiple implementations of netcat, which differ in options and capabilities. We will show ncat, which is installed by default in Fedora. Your system might have a different one installed.

Trivial things first: if you want to connect to a given TCP port on a remote machine, you can run nc machine port. This establishes the connection and wires the stdin and stdout to this connection. You can therefore interact with the remote server.

Netcat is often connected to other commands using pipes. Let us write a rudimentary HTTP client:

echo -en "GET / HTTP/1.1\r\nHost: www.kernel.org\r\n\r\n" | nc www.kernel.org 80

We are using \r\n, since the HTTP protocol wants lines terminated by CR+LF. The Host: header is mandatory, because HTTP supports multiple web sites running on the same combination of IP address and port.

We see that http://www.kernel.org/ redirects us to https://www.kernel.org/, so we try again using HTTPS. Fortunately, our version of netcat knows how to handle the TLS (transport-layer security) protocol used for encryption:

echo -en "GET / HTTP/1.1\r\nHost: www.kernel.org\r\n\r\n" | nc --ssl www.kernel.org 443

Now, let us build a simple server. It will listen on TCP port 8888 and when somebody connects to it, the server will send contents of a given file to the connection:

nc --listen 8888 <path-to-file

We can open a new shell and try receiving the file:

nc localhost 8888

We receive the file, but netcat does not terminate – it still waits for input from stdin. Pressing Ctrl-D works, but it is easier to tell netcat to work in one direction only:

nc localhost 8888 --recv-only

OK, this works for transferring a single file over the network. (But please keep in mind that the transfer is not encrypted, so it is not wise to use it over a public network.)

When the file is transferred, the server terminates. What if we want to run a server, which can handle multiple connections? Here, redirection is not enough since we need to read the file multiple times. Instead, we can ask netcat to run a shell command for every connection and wire the connection to its stdin and stdout:

nc --listen 8888 --keep-open --sh-exec 'cat path-to-file'

Of course, this can be used for much more interesting things than sending a file. You can take any program which interacts over stdin and stdout and make it into a network service.
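
As a toy example, the following turns cat into a trivial echo service: every line a client sends comes straight back (cat is a safe choice here, as many stdio-based filters would buffer their output when it is not a terminal):

nc --listen 8888 --keep-open --sh-exec 'cat'

Connect from another terminal with nc localhost 8888 and type a few lines to see the echo.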

Periodically running tasks with Cron

There are many tasks in your system that need to be executed periodically. Many of them are related to system maintenance, such as log rotation (removal of outdated logs), but even normal users may want to perform regular tasks.

A typical example might be backing up your $HOME or a day-to-day change of your desktop wallpaper.

From the administrator’s point of view, you need to install the cron daemon and start it. On Fedora, the actual package is called cronie, but the service is still named crond.

System-wide jobs (tasks) are defined in /etc/cron.*/, where you can directly place your scripts. For example, to perform a daily backup of your machine, you could place your backup.sh script directly into /etc/cron.daily/. Of course, there are specialized backup tools (e.g., duplicity), but your solution from Lab 06 is a pretty good start for a homebrew approach.

If you want more fine-grained specification than the one offered by the cron.daily or cron.hourly directories, you can specify it in a custom file inside /etc/cron.d/.

There, each line specifies a single job: a request to run a specified command at specified time under the specified user (typically root). The time is given as a minute (0-59), hour (0-23), day of month (1-31), month (1-12), and day of week (0-6, 0 is Sunday). You can use * for “any” in every field. For more details, see crontab(5).

Therefore, the following will execute /usr/local/bin/backup.sh every day 85 minutes after midnight (i.e., at 1:25 am). The second line will call big-backup.sh on every Sunday morning.

25 1 * * * root /usr/local/bin/backup.sh
0  8 * * 0 root /usr/local/bin/big-backup.sh

Note that cron.d will typically contain a special entry of the following form, which ensures that the cron.hourly scripts are executed (i.e., the cronie daemon itself looks only inside /etc/cron.d/; the cron.daily or cron.monthly directories are handled by special jobs).

01 * * * * root run-parts /etc/cron.hourly

Running as a normal user

Normal (i.e., non-root) users cannot edit files under /etc/cron.d/. Instead, they have a command called crontab that can be used to edit their personal cron table (i.e., their list of cron jobs).

Calling crontab -l will list the current content of your cron table. It will probably print nothing.

To edit the cron table, run crontab -e. It will launch your favourite editor where you can add lines in the above-mentioned format, this time without the user specification.

For example, adding the following entry will change your desktop background every day:

1 1 * * * /home/intro/bin/change_desktop_background.sh

Of course, this assumes you have such a script in the given location. If you really want to try it, the following script works for Xfce and uses Lorem Picsum.

#!/bin/bash

# Update to your hardware configuration
screen_width=1920
screen_height=1080

wallpaper_path="$HOME/.wallpaper.jpg"

curl -L --silent "https://picsum.photos/$screen_width/$screen_height" >"$wallpaper_path"

# Xfce
# Select the right path from xfconf-query -lvc xfce4-desktop
xfconf-query -c xfce4-desktop -p /backdrop/screen0/monitor0/workspace0/last-image -s "$wallpaper_path"

# LXDE
pcmanfm -w "$wallpaper_path"

# Sway
# For more details see `man 5 sway-output`
# You can also set a different wallpaper for each output (display)
# Run `swaymsg -t get_outputs` for getting specific output name
swaymsg output '*' bg "$wallpaper_path" fill

Repairing corrupted disks

The primary Linux tool for fixing broken volumes is called fsck (filesystem check). Actually, the fsck command is a simple wrapper, which selects the right implementation according to file system type. For the ext2/ext3/ext4 family of Linux file systems, the implementation is called e2fsck. It can be more useful to call e2fsck directly, since the more specialized options are not passed through the general fsck.

As we already briefly mentioned in Lab 10, it is safer to work on a copy of the volume, especially if you suspect that the volume is seriously broken. This way, you do not risk breaking it even more. This can be quite demanding in terms of disk space: in the end it all comes down to money – are the data worth more than buying an extra disk, or even handing the whole job over to a professional company focusing on this sort of work?

Alternatively, you can run e2fsck -n first, which only checks for errors, and judge their seriousness yourself.
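
A hedged sketch of the work-on-a-copy approach might look as follows (the device and file names are illustrative only):

# Copy the suspect partition into an image file first
sudo dd if=/dev/sdb1 of=broken.img bs=4M status=progress

# Check only, without modifying anything
e2fsck -n broken.img

# If the damage looks manageable, attempt the repair on the copy
e2fsck broken.img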

Sometimes, the disk is too broken for fsck to repair it. (In fact, this happens rarely with ext filesystems – we have witnessed successful repairs of disks whose first 10 GB were completely rewritten. But on DOS/Windows filesystems like vfat and ntfs, automated repairs are less successful.)

Even if this happens, there is still a good chance of recovering many files. Fortunately, if the disk was not too full, most files were stored contiguously. So we can use a simple program scanning the whole image for signatures of common file formats (recall, for example, what the GIF format looks like). Of course, this does not recover file names or the directory hierarchy.
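
Just to illustrate the idea (dedicated tools are far smarter): GIF files start with the ASCII signature GIF87a or GIF89a, so even grep can reveal whether and at which byte offsets such signatures occur in an image (broken.img is again just an illustrative name):

# -a: treat the binary image as text, -b: print byte offsets, -o: print only the matches
grep -a -b -o 'GIF8[79]a' broken.img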

The first program we will show is photorec (sudo dnf install testdisk). Before starting it, prepare an empty directory to store the results in.

It takes a single argument: the disk image to scan. It then starts an interactive mode where you select where to store the recovered files and also your guess of the file system type (in most cases, it will be FAT or NTFS). Then it tries to recover the files. Nothing more, nothing less.

photorec is able to recover plenty of file formats including JPEG, MP3, ZIP (which also covers ODT and DOCX), or even RTF files.

Another tool is recoverjpeg that focuses on photo recovery. Unlike photorec, recoverjpeg runs completely non-interactively and offers some extra parameters that allow you to fine-tune the recovery process.

recoverjpeg is not packaged for Fedora: you can try installing it manually or play with photorec only (and hope you will never need it).

Tasks to check your understanding

We expect you will solve the following tasks before attending the labs so that we can discuss your solutions during the lab.

Add your own pipeline to GitLab that would check that you never use /usr/bin/python in a shebang.

Hint.

Solution.

Learning outcomes

Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).

Conceptual knowledge

Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …

  • explain why use of nmap is often prohibited/limited by network administrators

  • explain what a process signal is

  • explain principles of continuous integration

  • explain advantages of using continuous integration

  • explain in broad sense how GitLab CI works

Practical skills

Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …

  • use nc for basic operations

  • use pgrep to find specific processes

  • send a signal to a running process

  • use nmap for basic network scanning

  • use ip command to query current network configuration

  • use ping and traceroute as basic tools for debugging networking issues

  • setup GitLab CI for simple projects

  • optional: use NetworkManager to set up static IP addressing

  • optional: fix corrupted file systems using the family of fsck programs

  • optional: use PhotoRec to restore files from a broken file system