
This lab covers several topics: asynchronous process communication (signals), a few interesting networking utilities, and SSH port forwarding. We will also briefly look at tools for printing and scanning in Linux and see how to repair corrupted disk drives.

There is no running example and the topics can be read and tried in any order.

Signals

Linux systems use the concept of signals to communicate asynchronously with a running program (process). The word asynchronously means that the signal can be sent (and delivered) to the process regardless of its state. Compare this with communication via standard input (for example), where the program controls when it reads from it (by calling an appropriate I/O read function).

However, signals do not provide a very rich communication channel: the only information available (apart from the fact that the signal was sent) is the signal number. Most signal numbers are defined by the kernel, which also handles some signals by itself. Otherwise, signals can be received by the application and acted upon. If the application does not handle the signal, it is processed in the default way. For some signals, the default is terminating the application; other signals are ignored by default.

This is reflected in the name of the utility used to send signals: kill. We have already seen it earlier and used it to terminate processes.

By default, the kill utility sends signal 15 (also called TERM or SIGTERM), which instructs the application to terminate. An application may decide to catch this signal, flush its data to the disk, etc., and then terminate. But it can do virtually anything and may even ignore the signal completely. Apart from TERM, we can instruct kill to send the KILL signal (number 9), which is handled by the kernel itself. It immediately and forcefully terminates the application (even if the application decides to mask or handle the signal).

Many other signals are sent to the process in reaction to a specific event. For example, the signal PIPE is sent when a process tries to write to a pipe whose reading end was already closed – the “Broken pipe” message you already saw is printed by the shell if the command was terminated by this signal. Terminating a program by pressing Ctrl-C in the terminal actually sends the INT (interrupt) signal to it. If you are curious about the other signals, see signal(7).
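You can provoke a broken pipe yourself. In the sketch below, head closes the reading end of the pipe after the first line, so the endlessly writing yes receives PIPE (signal 13) and is terminated by it:

```shell
#!/bin/bash
set -o pipefail

# head closes the pipe after one line; yes then receives SIGPIPE
yes | head -n 1

# With pipefail, the pipeline reports the status of the killed command:
# 128 + 13 (SIGPIPE) = 141
echo "Pipeline exit code: $?"
```

Without set -o pipefail, the pipeline would report the exit code of head (0) and the death of yes would go unnoticed.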

For example, when the system is shutting down, it sends TERM to all its processes. This gives them a chance to terminate cleanly. Processes which are still alive after some time are killed forcefully with KILL.

Use of kill and pkill

Recall from lab 5 how we can use kill to terminate processes.

As a quick recap, open two terminals now.

Run sleep in the first, look up its PID in the second, and kill it with TERM (the default signal); then redo the exercise with KILL (-9).

Solution.

Similarly, you can use pkill to kill processes by name (but be careful as with great power comes great responsibility). Consult the manual pages for more details.

There is also the killall command that behaves similarly. On some Unix systems (e.g., Solaris), this command has completely different semantics and is used to shut down the whole machine.
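The recap exercise can also be condensed into a single terminal; a sketch using a background job instead of a second terminal:

```shell
#!/bin/bash

# Start a disposable long-running process in the background
sleep 600 &
pid=$!
echo "sleep runs with PID $pid"

# Send the default TERM signal (equivalent to: kill -15 "$pid")
kill "$pid"

# The shell reports death by signal as 128 + signal number,
# so TERM (15) yields exit status 143
wait "$pid"
echo "Exit status: $?"
```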

Reacting to signals in Python

Your program would typically react to TERM (the default “soft” termination), INT (Ctrl-C from the keyboard) and perhaps to USR1 or USR2 (the only user-defined signals). System daemons often react to HUP by reloading their configuration.

The following Python program reacts to Ctrl-C by terminating. Store it as show_signals.py and hit Ctrl-C while running ./show_signals.py.

Avoid a common trap when trying the code below: do not store it in a file named signal.py, as the name would collide with the standard library module.
#!/usr/bin/env python3

import sys
import time
import signal

# Actual signal callback
def on_signal(signal_number, frame_info):
    print("")
    print("Caught signal {} ({})".format(signal_number, frame_info))
    sys.exit()

def main():
    # Setting signal callback
    signal.signal(signal.SIGINT, on_signal)
    while True:
        time.sleep(0.5)
        print("Hit Ctrl-C...")

if __name__ == '__main__':
    main()

Exercise

Write a program that tries to print all prime numbers. When terminating, it stores the highest number found so far and on the next invocation, it continues from there.

Solution.

Reacting to signals in shell

Reacting to signals in shell is done through the trap command.

Note that a typical action for a signal handler in a shell script is clean-up of temporary files.

#!/bin/bash

set -ueo pipefail

on_interrupt() {
    echo "Interrupted, terminating ..." >&2
    exit 17
}

on_exit() {
    echo "Cleaning up..." >&2
    rm -f "$MY_TEMP"
}

MY_TEMP="$( mktemp )"

trap on_interrupt INT TERM
trap on_exit EXIT

echo "Running with PID $$"

counter=1
while [ "$counter" -lt 10 ]; do
    date "+%Y-%m-%d %H:%M:%S | Waiting for Ctrl-C (loop $counter) ..."
    echo "$counter" >"$MY_TEMP"
    sleep 1
    counter=$(( counter + 1 ))
done

The command trap receives as the first argument the command to execute on the signal. Other arguments list the signals to react to. Note that a special signal EXIT means normal script termination. Hence, we do not need to call on_exit after the loop terminates.

We use exit 17 to report termination through the Ctrl-C handler (the value is arbitrary by itself).

Feel free to check the return value with echo $? after the command terminates. The special variable $? contains the exit code of the last command.
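A tiny sketch of this behaviour; note that $? is overwritten by every command, so capture it immediately if you need it later:

```shell
#!/bin/bash

ls /nonexistent-directory 2>/dev/null   # this command fails
echo "ls returned: $?"                  # a non-zero exit code

echo "now: $?"                          # 0 -- the previous echo succeeded!

# Capture the code right away if you need it later
false
rc=$?
echo "saved code: $rc"                  # 1
```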

If your shell script starts with set -e, you will rarely need $? as any non-zero value will cause script termination.

However, the following construct prevents the termination and allows you to branch your code based on the exit value if needed.

set -e

...
# Prevent termination through set -e
rc=0
some_command_with_interesting_exit_code || rc=$?
if [ $rc -eq 0 ]; then
    ...
elif [ $rc -eq 1 ]; then
    ...
else
    ...
fi

Using - (dash) instead of the handler resets the respective handler to its default.
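A sketch of installing and then resetting a handler (trap -p prints the currently installed handlers):

```shell
#!/bin/bash

on_interrupt() {
    echo "Interrupted" >&2
}

trap on_interrupt INT   # install the custom handler
trap -p INT             # shows the currently installed handler

trap - INT              # the dash resets INT to its default action
trap -p INT             # prints nothing: INT is back to the default
```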

Your shell scripts shall always include a signal handler for clean-up of temporary files.

Note the use of $$ which prints the current PID.

Run the above script, note its PID and run the following in a new terminal.

kill THE_PID_PRINTED_BY_THE_ABOVE_SCRIPT

The script was terminated and the clean-up routine was called. Compare this with the situation when you comment out the trap command.

Run the script again but pass -9 to kill to specify that you want to send signal nine (i.e., KILL).

What happened? Answer.

While signals are a rudimentary mechanism, which passes binary events with no additional data, they are the primary way of process control in Linux.

If you need a richer communication channel, you can use D-Bus instead.

Reasonable reaction to basic signals is a must for server-style applications (e.g., a web server should react to TERM by completing outstanding requests without accepting new connections, and terminating afterwards). In shell scripts, it is considered good manners to always clean up temporary files.

Deficiencies in signal design and implementation

Signals are a rudimentary mechanism for interprocess communication on Unix systems. Unfortunately, their design has several flaws that complicate their safe usage.

We will not dive into details but you should bear in mind that signal handling can be tricky in situations where you cannot afford to lose any signal or when signals can come quickly one after another. And there is a whole can of worms when using signals in multithreaded programs.

On the other hand, for simple shell scripts where we want to clean up on forceful termination, the pattern we have shown above is sufficient. It guards our script when the user hits Ctrl-C because they realized it is working on the wrong data, or something similar.

But note that it contains a bug for the case when the user hits Ctrl-C very early during script execution.

MY_TEMP="$( mktemp )"
# User hits Ctrl-C here
trap on_interrupt INT TERM
trap on_exit EXIT

The temporary file was already created but the handler was not yet registered and thus the file will not be removed. But changing the order complicates the signal handler as we need to test that $MY_TEMP was already initialized.
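One possible fix (a sketch) is to register the handlers first and let the clean-up routine check whether the variable was already set:

```shell
#!/bin/bash

MY_TEMP=""

on_interrupt() {
    echo "Interrupted, terminating ..." >&2
    exit 17
}

on_exit() {
    # Guard against running before mktemp has been called
    if [ -n "$MY_TEMP" ]; then
        rm -f "$MY_TEMP"
    fi
}

# Register the handlers first ...
trap on_interrupt INT TERM
trap on_exit EXIT

# ... and only then create the file: a Ctrl-C in between is now safe
MY_TEMP="$( mktemp )"
```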

But the fact that signals can be tricky does not mean that we should abandon the basic means of ensuring that our scripts clean-up after themselves even when they are forcefully terminated.

In other programming languages the clean-up is somewhat simpler because it is possible to create a temporary file that is always automatically removed once a process terminates.

It relies on a neat trick: we can open (create) a file and immediately remove it. As long as we keep the file descriptor (i.e., the result of Python's open), the system keeps the file contents intact. But the file name – the label pointing at the contents – is already gone, and closing the file removes the contents completely.

Because shell is based on running multiple processes, the above trick does not work for shell scripts.
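The trick can be sketched in Python like this (tempfile.TemporaryFile does essentially the same thing for you on Linux):

```python
#!/usr/bin/env python3

import os
import tempfile

# Create a temporary file: we get an open descriptor and the file name
fd, path = tempfile.mkstemp()

# Remove the name immediately; the contents survive as long as
# the descriptor stays open
os.unlink(path)
print(os.path.exists(path))   # False: the name is gone

with os.fdopen(fd, "w+") as f:
    f.write("temporary data")
    f.seek(0)
    print(f.read())           # temporary data

# Closing the descriptor (leaving the with-block) released the storage:
# no clean-up handler is needed, even if the process is killed
```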

Check your understanding

Select all correct statements about signals and processes.

SSH port forwarding

Generally, services provided by a machine should not be exposed over the network for random “security researchers” to play with. Therefore, a firewall is usually configured to control access to your machine from the network.

If a service should be provided only locally, it is even easier to let it listen on the loopback device only. This way, only local users (including users connected to the machine via SSH) can access it.

As an example, you will find that there is a web server listening on port 8080 of linux.ms.mff.cuni.cz. This web server is not reachable over the network from your machine, but accessing it locally (when logged in to linux.ms.mff.cuni.cz) works.

you@laptop$ curl http://linux.ms.mff.cuni.cz:8080                # Fails
you@laptop$ ssh linux.ms.mff.cuni.cz curl --silent http://localhost:8080  # Works

While using cURL to access this web server is possible, it is not the most user-friendly way to browse a web page.

Local Port Forwarding

SSH can be used to create a secure tunnel, through which a local port is forwarded to a port accessible from the remote machine. In essence, you will connect to a loopback device on your machine and SSH will forward that communication to the remote server, effectively making the remote port accessible.

The following command will make local port 8888 behave as port 8080 on the remote machine. The 127.0.0.1 part refers to the loopback on the remote server (you can write localhost there, too).

ssh -L 8888:127.0.0.1:8080 -N linux.ms.mff.cuni.cz

You always first specify which local port to forward (8888) and then the destination as if you were connecting from the remote machine (127.0.0.1:8080).

The -N makes this connection usable only for forwarding – use Ctrl-C to terminate it (without it, you will log in to the remote machine, too).

Open http://localhost:8888 in your browser to check that you can see the same content as with the ssh linux.ms.mff.cuni.cz curl http://localhost:8080 command above.

You will often forward (local) port N to the same (remote) port N, hence it is very easy to forget the proper order. However, the ordering of the -L parameters is important and switching the numbers (e.g., 8888:127.0.0.1:9090 instead of 9090:127.0.0.1:8888) will forward different ports (usually, you will learn about it pretty quickly, though).

But do not worry if you are unable to remember it. That is why you have manual pages and even every-day users of Linux use them. It is not something to be ashamed or afraid of :-).

Remote/Reverse Port Forwarding

SSH can also create a so-called remote port forward.

It basically allows you to open a connection from the remote server to your local machine (in the reverse direction of the SSH connection).

Practically, you can set up a remote port forward by connecting from the desktop you have at home to a machine in IMPAKT/Rotunda, for example, and then use it to connect from IMPAKT/Rotunda back to your desktop.

This feature will work even if your machine is behind NAT, which makes direct connections from the outside impossible.

The following command sets the remote port forwarding such that connecting to port 2222 on the remote machine will be translated to connection to port 22 (ssh) on the local machine:

ssh -N -R 2222:127.0.0.1:22 u-plN.ms.mff.cuni.cz

You first specify the remote port to forward (2222) and then the destination as if you were connecting from the local machine (127.0.0.1:22).

When trying this, ensure that your sshd daemon is running (recall lab 10 and systemctl command) and use a different port than 2222 to prevent collisions.

In order to connect to your desktop via this port forward, you have to do so from IMPAKT/Rotunda lab via the following command.

ssh -p 2222 your-desktop-login@localhost

We use localhost as the connection is only bound to the loopback interface, not to the actual network adapter available on lab computers. (Actually, ssh allows binding the port forward to the public IP address, but this is often disabled by the administrator for security reasons.)

Network Manager

There are several ways to configure networking in Linux. Server admins often prefer the bare ip command; on desktops, most distributions today use NetworkManager, so we will show it here too. Note that the ArchLinux Wiki page about NetworkManager contains a lot of information, too.

NetworkManager has a GUI (you probably used its applet without knowing about it), a TUI (which can be run with nmtui), and finally a CLI.

We will (for obvious reasons) focus on the command-line interface here. Without parameters, nmcli will display information about current connections:

wlp58s0: connected to TP-Link_1CE4
        "Intel 8265 / 8275"
        wifi (iwlwifi), 44:03:2C:7F:0F:76, hw, mtu 1500
        ip4 default
        inet4 192.168.0.105/24
        route4 0.0.0.0/0
        route4 192.168.0.0/24
        inet6 fe80::9ba5:fc4b:96e1:f281/64
        route6 fe80::/64
        route6 ff00::/8

p2p-dev-wlp58s0: disconnected
        "p2p-dev-wlp58s0"
        wifi-p2p, hw

enp0s31f6: unavailable
        "Intel Ethernet"
        ethernet (e1000e), 54:E1:AD:9F:DB:36, hw, mtu 1500

vboxnet0: unmanaged
        "vboxnet0"
        ethernet (vboxnet), 0A:00:27:00:00:00, hw, mtu 1500

lo: unmanaged
        "lo"
        loopback (unknown), 00:00:00:00:00:00, sw, mtu 65536

DNS configuration:
        servers: 192.168.0.1 8.8.8.8
        interface: wlp58s0

...

Compare the above with the output of ip addr. Notice that NetworkManager explicitly states the routes by default and also informs you that some interfaces are not controlled by it (here, lo or vboxnet0).

Changing IP configuration

While most networks offer DHCP (at least those you will connect to with your desktop), sometimes you need to set up IP addresses manually.

A typical case is when you need to connect two machines temporarily, e.g., to transfer a large file over a wired connection.

The only thing you need to decide on is which network you will create. Do not use the same one as your home router uses; our favourite selection is 192.168.177.0/24.

Assuming the name from above, the following command adds a connection named wired-static-temp on enp0s31f6:

sudo nmcli connection add \
    con-name wired-static-temp \
    ifname enp0s31f6 \
    type ethernet \
    ip4 192.168.177.201/24

It is often necessary to bring this connection up with the following command:

sudo nmcli connection up wired-static-temp

Follow the same procedure on the second host, but use a different address (e.g., .202). You should be able to ping the other machine now:

ping 192.168.177.201

To demonstrate how ping behaves when the connection goes down, you can try unplugging the wire, or doing the same in software:

sudo nmcli connection down wired-static-temp

Other networking utilities

We will not substitute for the networking courses here, but we will mention some basic commands that can help you debug basic network-related problems.

You already know ping: the basic tool to determine whether a machine with a given IP address is up (and responding to network traffic).

ping is the first tool to reach for if you suddenly lose connection to some server. Ping the destination server and also some other well-known server. If the packets go through, you know that the problem is on a different layer. If only the packets to the well-known server get through, the problem is probably with the server in question. If both fail, your network is probably down.

But there are more advanced tools available, too.

traceroute (a.k.a. the path is the goal)

Sometimes, it can be handy to know the precise path that the packets travel. For this kind of task, we can use traceroute.

Similarly to ping, we just need to specify the destination.

traceroute 1.1.1.1
traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  _gateway (10.16.2.1)  2.043 ms  1.975 ms  1.948 ms
 2  10.17.0.1 (10.17.0.1)  1.956 ms  1.971 ms  1.961 ms
 3  gw.sh.cvut.cz (147.32.30.1)  1.947 ms  1.973 ms  1.977 ms
 4  r1sh-sush.net.cvut.cz (147.32.252.238)  2.087 ms  2.262 ms  2.527 ms
 5  r1kn-konv.net.cvut.cz (147.32.252.65)  1.856 ms  1.849 ms  1.847 ms
 6  kn-de.net.cvut.cz (147.32.252.57)  1.840 ms  1.029 ms  0.983 ms
 7  195.113.144.172 (195.113.144.172)  1.894 ms  1.900 ms  1.885 ms
 8  195.113.235.99 (195.113.235.99)  4.793 ms  4.748 ms  4.723 ms
 9  nix4.cloudflare.com (91.210.16.171)  2.264 ms  2.807 ms  2.814 ms
10  one.one.one.one (1.1.1.1)  1.883 ms  1.800 ms  1.834 ms

The first column corresponds to the hop count and the second column shows the address of that hop; after that, you see three space-separated times in milliseconds. traceroute sends three packets to each hop and each of the times refers to the time taken by one packet to reach that hop. From the foregoing output we can see that the packets visited 10 hops on their way between the local computer and the destination.

This tool is especially useful when you have network troubles and you are not sure where the issue is.

traceroute to 1.1.1.1 (1.1.1.1), 30 hops max, 60 byte packets
 1  10.21.20.2 (10.21.20.2)  0.798 ms  0.588 ms  0.699 ms
 2  10.21.5.1 (10.21.5.1)  0.593 ms  0.506 ms  0.611 ms
 3  192.168.88.1 (192.168.88.1)  0.742 ms  0.637 ms  0.534 ms
 4  10.180.2.113 (10.180.2.113)  1.696 ms  4.106 ms  1.483 ms
 5  46.29.224.17 (46.29.224.17)  14.343 ms  13.749 ms  13.806 ms
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  * * *
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

From this log we can see that the last visited hop was 46.29.224.17, so we can focus our attention on this network element.

nmap (a.k.a. let me scan your network)

nmap is a very powerful tool. Unfortunately, even an innocent – but repeated – usage can easily be misinterpreted as a malicious scan for vulnerabilities. Use this tool with care and experiment in your home network. Reckless scanning of the university network can actually get your machine banned from connecting at all for quite some time.

nmap is the basic network scanning tool. If you want to know which network services are running on a machine, you can try connecting to all of its ports to check which are open. nmap does that and much more.

Try first scanning your loopback device for internal services running on your machine:

nmap localhost

The result could look like this (the machine has a print server and a proxy HTTP server):

Starting Nmap 7.91 ( https://nmap.org ) at 2021-05-04 16:38 CEST
Nmap scan report for localhost (127.0.0.1)
Host is up (0.00011s latency).
Other addresses for localhost (not scanned): ::1
rDNS record for 127.0.0.1: localhost.localdomain
Not shown: 998 closed ports
PORT     STATE SERVICE
631/tcp  open  ipp
3128/tcp open  squid-http

Nmap done: 1 IP address (1 host up) scanned in 0.11 seconds

If you want to see more information, you can try adding the -A switch.

nmap -A localhost

And if you run it under root (i.e. sudo nmap -A localhost) nmap can try to detect the remote operating system, too.

By default, nmap scans only ports frequently used by network services. You can specify a different range with the -p option:

nmap -p1-65535 localhost

This instructs nmap to scan all TCP ports (-p1-65535) on localhost.

Again: do not scan all TCP ports on machines in the university network!

As an exercise, which web server is used on our GitLab? And which one is on our University website? Solution.

nc (netcat)

Let us examine how to create network connections from the shell. This is essential for debugging network services, but it is also useful for using the network in scripts.

The Swiss-army knife of network scripting is called netcat or nc.

Unfortunately, there exist multiple implementations of netcat, which differ in options and capabilities. We will show ncat, which is installed by default in Fedora. Your system might have a different one installed.

Trivial things first: if you want to connect to a given TCP port on a remote machine, you can run nc machine port. This establishes the connection and wires the stdin and stdout to this connection. You can therefore interact with the remote server.

Netcat is often connected to other commands using pipes. Let us write a rudimentary HTTP client:

echo -en "GET / HTTP/1.1\r\nHost: www.kernel.org\r\n\r\n" | nc www.kernel.org 80

We are using \r\n, since the HTTP protocol wants lines terminated by CR+LF. The Host: header is mandatory, because HTTP supports multiple web sites running on the same combination of IP address and port.

We see that http://www.kernel.org/ redirects us to https://www.kernel.org/, so we try again using HTTPS. Fortunately, our version of netcat knows how to handle the TLS (transport-layer security) protocol used for encryption:

echo -en "GET / HTTP/1.1\r\nHost: www.kernel.org\r\n\r\n" | nc --ssl www.kernel.org 443

Now, let us build a simple server. It will listen on TCP port 8888 and when somebody connects to it, the server will send contents of a given file to the connection:

nc --listen 8888 <path-to-file

We can open a new shell and try receiving the file:

nc localhost 8888

We receive the file, but netcat does not terminate – it still waits for input from the stdin. Pressing Ctrl-D works, but it is easier to tell netcat to work in one direction only:

nc localhost 8888 --recv-only

OK, this works for transferring a single file over the network. (But please keep in mind that the transfer is not encrypted, so it is not wise to use it over a public network.)

When the file is transferred, the server terminates. What if we want to run a server, which can handle multiple connections? Here, redirection is not enough since we need to read the file multiple times. Instead, we can ask netcat to run a shell command for every connection and wire the connection to its stdin and stdout:

nc --listen 8888 --keep-open --sh-exec 'cat path-to-file'

Of course, this can be used for much more interesting things than sending a file. You can take any program which interacts over stdin and stdout and make it into a network service.

Printing with CUPS

Printing in Linux is handled by the CUPS subsystem, which works out of the box with virtually every printer supporting IPP (Internet Printing Protocol) and also supports many legacy printers.

A simple sudo dnf install cups installs the basic subsystem; extra drivers might be needed for specific models. OpenPrinting.org contains a searchable database to determine which (if any) drivers are needed. For example, for most HP printers, you would need to install the hplip package.

You typically want CUPS up and running on your system all the time, hence you need to enable it:

sudo systemctl enable --now cups

CUPS has a nice web interface that you can use to configure your printers. For many modern network-connected printers, even that is often unnecessary as they will be auto-discovered correctly.

If you have started CUPS already, try visiting http://localhost:631/. Under the Administration tab, you can add new printers. Selecting the right model helps CUPS decide which options to show in the printing dialog and enables proper functioning of grayscale printing and similar features.

Scanning images and documents with Sane

Scanner support on Linux is handled with SANE (Scanner Access Now Easy). As with printing, most scanners will be autodetected and if you already know GIMP, it has SANE support.

Add it with sudo dnf install xsane-gimp.

Actual scanning of the image can be done from the File -> Create -> XSane dialog, where you select your device and scanning properties (e.g., resolution or colors), and then start the actual scan.

Periodically running tasks with Cron

There are many tasks in your system that need to be executed periodically. Many of them are related to system maintenance, such as log rotation (removal of outdated logs), but even normal users may want to perform regular tasks.

A typical example might be backing up your $HOME or a day-to-day change of your desktop wallpaper.

From the administrator’s point of view, you need to install the cron daemon and start it. On Fedora, the actual package is called cronie, but the service is still named crond.

System-wide jobs (tasks) are defined in /etc/cron.*/, where you can directly place your scripts. For example, to perform a daily backup of your machine, you could place your backup.sh script directly into /etc/cron.daily/. Of course, there are specialized backup tools (e.g., duplicity), but your solution from Lab 06 is a pretty good start for a homebrew approach.

If you want more fine-grained specification than the one offered by the cron.daily or cron.hourly directories, you can specify it in a custom file inside /etc/cron.d/.

There, each line specifies a single job: a request to run a specified command at specified time under the specified user (typically root). The time is given as a minute (0-59), hour (0-23), day of month (1-31), month (1-12), and day of week (0-6, 0 is Sunday). You can use * for “any” in every field. For more details, see crontab(5).

Therefore, the following will execute /usr/local/bin/backup.sh every day 85 minutes after midnight (i.e., at 1:25 am). The second line will call big-backup.sh on every Sunday morning.

25 1 * * * root /usr/local/bin/backup.sh
0  8 * * 0 root /usr/local/bin/big-backup.sh

Note that cron.d will typically contain a special call of the following form which ensures that the cron.hourly scripts are executed (i.e., the cronie daemon itself looks only inside /etc/cron.d/; the use of cron.daily or cron.monthly is handled by special jobs).

01 * * * * root run-parts /etc/cron.hourly

Running as a normal user

Normal (i.e., non-root) users cannot edit files under /etc/cron.d/. Instead, they have a command called crontab that can be used to edit their personal cron table (i.e., their list of cron jobs).

Calling crontab -l will list the current content of your cron table. It will probably print nothing.

To edit the cron table, run crontab -e. It will launch your favourite editor where you can add lines in the above-mentioned format, this time without the user specification.

For example, adding the following entry will change your desktop background every day:

1 1 * * * /home/intro/bin/change_desktop_background.sh

Of course, this assumes you have such a script in the given location. If you really want to try it, the following script works for Xfce and uses Lorem Picsum.

#!/bin/bash

# Update to your hardware configuration
screen_width=1920
screen_height=1080

wallpaper_path="$HOME/.wallpaper.jpg"

curl -L --silent "https://picsum.photos/$screen_width/$screen_height" >"$wallpaper_path"

# Xfce
# Select the right path from xfconf-query -lvc xfce4-desktop
xfconf-query -c xfce4-desktop -p /backdrop/screen0/monitor0/workspace0/last-image -s "$wallpaper_path"

# LXDE
pcmanfm -w "$wallpaper_path"

# Sway
# For more details see `man 5 sway-output`
# You can also set a different wallpaper for each output (display)
# Run `swaymsg -t get_outputs` for getting specific output name
swaymsg output '*' bg "$wallpaper_path" fill

Repairing corrupted disks

The primary Linux tool for fixing broken volumes is called fsck (filesystem check). Actually, the fsck command is a simple wrapper, which selects the right implementation according to file system type. For the ext2/ext3/ext4 family of Linux file systems, the implementation is called e2fsck. It can be more useful to call e2fsck directly, since the more specialized options are not passed through the general fsck.

As we already briefly mentioned in Lab 11, it is safer to work on a copy of the volume, especially if you suspect that the volume is seriously broken. This way, you do not risk breaking it even more. This can be quite demanding in terms of disk space; in the end, it all comes down to money – are the data worth more than buying an extra disk, or even handing the whole job over to a professional company focusing on this sort of work?

Alternatively, you can run e2fsck -n first, which only checks for errors, and judge their seriousness yourself.

Sometimes, the disk is too broken for fsck to repair it. (In fact, this happens rarely with ext filesystems – we have witnessed successful repairs of disks whose first 10 GB were completely rewritten. But on DOS/Windows filesystems like vfat and ntfs, automated repairs are less successful.)

Even if this happens, there is still a good chance of recovering many files. Fortunately, if the disk was not too full, most files were stored contiguously. So we can use a simple program that scans the whole image for signatures of common file formats (recall, for example, what the GIF format looks like). Of course, this does not recover file names or the directory hierarchy.
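The core idea can be sketched in a few lines of Python: scan the raw image data for well-known magic numbers. Real tools such as photorec know many more formats and use additional heuristics, so treat this as an illustration only.

```python
#!/usr/bin/env python3

# Magic numbers (signatures) of a few common file formats
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\x89PNG\r\n\x1a\n": "png",
}

def scan_image(data):
    """Return (offset, format) pairs for every signature found in data."""
    hits = []
    for magic, kind in SIGNATURES.items():
        start = 0
        while (pos := data.find(magic, start)) != -1:
            hits.append((pos, kind))
            start = pos + 1
    return sorted(hits)

# Example on a tiny synthetic "disk image"
image = b"garbage" + b"GIF89a..." + b"\x00" * 16 + b"\xff\xd8\xff rest of a JPEG"
print(scan_image(image))   # [(7, 'gif'), (32, 'jpeg')]
```

A real recovery tool would then carve out the data following each signature up to the format's end marker (or a size limit) and store it as a new file.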

The first program we will show is photorec (sudo dnf install testdisk). Before starting it, prepare an empty directory where to store the results.

It takes a single argument: the disk image to scan. It then starts an interactive mode where you select where to store the recovered files and your guess of the file system type (in most cases, it will be FAT or NTFS). Then it tries to recover the files. Nothing more, nothing less.

photorec is able to recover plenty of file formats including JPEG, MP3, ZIP files (which also covers ODT and DOCX) or even RTF files.

Another tool is recoverjpeg that focuses on photo recovery. Unlike photorec, recoverjpeg runs completely non-interactively and offers some extra parameters that allow you to fine-tune the recovery process.

recoverjpeg is not packaged for Fedora: you can try installing it manually or play with photorec only (and hope you will never need it).

Before-class tasks (deadline: start of your lab, week May 8 - May 12)

The following tasks must be solved and submitted before attending your lab. If you have lab on Wednesday at 10:40, the files must be pushed to your repository (project) at GitLab on Wednesday at 10:39 latest.

For the virtual lab, the deadline is Tuesday 9:00 AM every week (regardless of vacation days).

All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).

13/word.txt (50 points, group net)

The solution is the word displayed on the following page of the server running on linux.ms.mff.cuni.cz (replace LOGIN with your GitLab login, as usual).

http://localhost:8080/13/LOGIN

You will probably need to use a graphical browser to access the page.
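Since the page is reachable through the server's loopback only, an SSH local port forward is one way to get it into your graphical browser. A sketch, assuming the port numbers above: the -G flag makes ssh print the options it resolved without connecting, so you can sanity-check the command first; drop -G to open the real tunnel.

```shell
# Dry run: show the resolved local-forward configuration without connecting.
# Replace LOGIN with your GitLab login as usual.
ssh -G -L 8080:localhost:8080 LOGIN@linux.ms.mff.cuni.cz | grep localforward

# Without -G this opens the tunnel; then browse http://localhost:8080/13/LOGIN
# on your own machine.
```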

The word you are supposed to see is a hexadecimal string.

This task is not fully checked by the automated tests.

13/server.txt (50 points, group net)

Find out on which (non-standard) port a web server is listening on linux.ms.mff.cuni.cz, and what kind (manufacturer) of web server it is.

The server is listening on the ens18 interface only.

Store the answer in a plain text file in the following format.

PORT:MANUFACTURER

We expect you would use nmap here. Use it carefully (recall the warnings above).

In this case, using nmap is fine: you would run it against a specific interface, it would produce no actual network traffic, and we explicitly allow it for that particular setup.

For example, the above-mentioned web server on port 8080 would result in the following file.

8080:nginx

None of the ports 8080, 9090, or 80 (the usual web server ports) is the correct answer.

Do not forget to query the server to verify it is the right one. You will recognize the right one immediately.

Do not forget to check that there really is a web server on the given port (i.e., you should see an HTML page in your browser).
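Once you have a candidate port, the Server: HTTP response header is a quick way to learn the manufacturer. A sketch against a throwaway local server (python3's http.server stands in for the real one; port 8099 is an arbitrary choice):

```shell
# Start a disposable web server bound to loopback, in the background.
python3 -m http.server 8099 --bind 127.0.0.1 >/dev/null 2>&1 &
srv=$!
sleep 1

# -I: HEAD request, show headers only; the Server: line names the software.
curl -sI http://127.0.0.1:8099/ | grep -i '^server:'

kill "$srv"
```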

As a hint, the correct server is not Nginx and not Apache.

This task is not fully checked by the automated tests.

Post-class tasks (deadline: May 28)

We expect you will solve the following tasks after attending the labs and hearing the feedback on your before-class solutions.

All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).

Run own SSH server (60 points, group net)

In this task, we want you to try running an SSH server on your own machine (it can be a virtualized installation, too). Because most of your machines are behind NAT, we will check it via a reverse port forward on linux.ms.mff.cuni.cz.

It is not necessary to run the SSH server for a long time (i.e., no need for systemctl enable; simply starting it and stopping it later is sufficient). We will also not actually log in to your machine but only check that your SSH server is listening (i.e., accepting connections).

Recall that you really do not have to worry about the security of your machine. The connection is hidden inside the SSH tunnel, so it is rather difficult to even learn about its presence at all. The reverse connection will be active only while your session is active, so it can last literally a few seconds only. It will thus be visible only to other users on linux.ms.mff.cuni.cz.

We strongly warn you against any attempt to access the machines of your colleagues or any other form of malicious action with respect to this task.

To prevent collisions of port numbers, run the following on linux.ms.mff.cuni.cz to learn the port number you are supposed to use.

nswi177-reverse-port id
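The reverse forward itself is a single ssh invocation; a sketch with a made-up port 5022 (use the number printed by the command above instead). The -G flag makes ssh only print the options it resolved, so nothing connects; drop it for the real tunnel.

```shell
# Dry run: show the resolved remote-forward configuration for forwarding
# remote port 5022 back to port 22 (the SSH server) on this machine.
ssh -G -R 5022:localhost:22 LOGIN@linux.ms.mff.cuni.cz | grep remoteforward
```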

To actually complete the task, set up the reverse port forward on that port to your own SSH server. Then execute the script with the parameter test to check the connection (it will try to connect to the server, but without trying any form of authentication). Internally, it calls the ssh-keyscan program for that.

nswi177-reverse-port test

If the script prints the following message, you are good to go.

Looks good, there is SSH daemon at the other side.

Then execute it with submit and follow the instructions on the screen. Note that for the actual submission you will have only one try (so make sure test works first).

nswi177-reverse-port submit

Note that you can check the SSH access yourself by actually logging back in to your machine prior to running the above commands.

This task is not fully checked by the automated tests.

13/signal.txt (40 points, group admin)

Run the program nswi177-signals on linux.ms.mff.cuni.cz.

You will need to send specific signals in a given order to this program to complete the task.

The program will guide you: it will print which signals you are supposed to send.

Copy the last line of this program's output (there will be two numbers) to 13/signal.txt.
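The mechanics — get the PID, send the signal — can be rehearsed with any throwaway process; a sketch using sleep as a stand-in (the signal names the real program asks for may differ):

```shell
# Start a long-running process in the background and remember its PID.
sleep 100 &
pid=$!

# Send SIGTERM by PID; kill -l lists all signal names.
kill -TERM "$pid"

# The shell reports exit status 128 + signal number for a signal-killed
# process.
wait "$pid" || status=$?
echo "exit status: $status"    # prints: exit status: 143 (128 + 15, SIGTERM)
```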

This task is not fully checked by the automated tests.

Learning outcomes

Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).

Conceptual knowledge

Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …

  • explain why the use of nmap is often prohibited/limited by network administrators

  • explain the difference between a normal SSH port forward and a reverse port forward

  • explain what a process signal is

Practical skills

Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …

  • use nc for basic operations

  • use pgrep to find specific processes

  • send a signal to a running process

  • use an SSH port forward to access a service available on the loopback device

  • use reverse SSH port forward to connect to a machine behind a NAT

  • use nmap for basic network scanning

  • use the ip command to query the current network configuration

  • use ping and traceroute as basic tools for debugging networking issues

  • optional: use SSH agent and passphrase-protected SSH keys

  • optional: use NetworkManager to set up static IP addressing

  • optional: use printing and scanning in Linux

  • optional: fix corrupted file systems using the family of fsck programs

  • optional: use PhotoRec to restore files from a broken file system

This page changelog

  • 2023-05-02: Added graded tasks

  • 2023-05-19: Reformulated SSH port forwarding