Labs: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.

There will be an on-site test at the beginning of this lab; the topic is the Git CLI (including branching and merging).

The test will be held during the first half of the lab, so please make sure you arrive on time (other details are on this page).

Make sure you can clone repositories from gitolite3@linux.ms.mff.cuni.cz (they are used in Lab 07, for example) from your school machines.

This lab spans several smaller topics. They are also only loosely connected with each other, so you can read the big sections in virtually any order you prefer.

We will learn about regular expressions, which are used to find text patterns, and have a look at how to test our programs automatically with BATS. We will also have a look at user accounts in Linux, see how software is installed, and how services are executed.

Preflight checklist

  • You understand the concept of automated tests
  • You are ready for the second on-site test :-).

Regular expressions (a.k.a. regexes)

We already mentioned that systems from the Unix family are built on top of text files. The utilities we have seen so far offered basic operations, but none of them was really powerful. Use of regular expressions will change that.

We will not cover the theoretical details – see the course on Automata and grammars for that. We will view regular expressions as simple tools for matching patterns in text.

For example, we might be interested in:

  • lines starting with date and containing HTTP code 404,
  • files containing our login,
  • or a line preceding a line with a valid filename.

While regular expressions are very powerful, their use is complicated by the unfortunate fact that different tools use slightly different syntax. Keep this in mind when using grep and sed, for example. Libraries for matching regular expressions are also available in most programming languages, but again beware of variations in their syntax.

The most basic tool for matching files against regular expressions is called grep. If you run grep regex file, it prints all lines of file which match the given regex (with -F, the pattern is considered a fixed string, not a regular expression).

There is a legend that the g in the name stands for “globally”, meaning the whole file, while re is regex, and p is print.
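As a quick illustration of both modes (a small sketch with made-up input):

```shell
# Print lines containing "error" (regex mode):
printf 'all ok\nerror: disk full\n' | grep 'error'

# With -F the pattern is a literal string: "a.b" no longer matches "axb".
printf 'a.b\naxb\n' | grep -F 'a.b'
```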

Regex syntax

In its simplest form, a regex searches for the given string (usually in a case-sensitive manner).

system

This matches all substrings system in the text. In grep, this means that all lines containing system will be printed.

If we want to search for lines starting with this word, we need to add the anchor ^.

^system

If the line is supposed to end with a pattern, we need to use the $ anchor. Note that it is safer to use single quotes in the shell to prevent any variable expansion.

system$

Moreover, we can find all lines starting with either r, s or t using the [...] list.

^[rst]

This looks like a wildcard, but regexes are more powerful and the syntax differs a bit.

Let us find all three-digit numbers:

[0-9][0-9][0-9]

This matches all three-digit numbers, but it also matches inside four-digit (and longer) numbers: regular expressions without anchors do not care about the surrounding characters at all.
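We can verify this with grep on a made-up input; anchoring the pattern restricts it to whole lines:

```shell
# Unanchored: matches every line that contains three consecutive digits.
printf '123\n1234\nab\n' | grep '[0-9][0-9][0-9]'

# Anchored: matches only the lines consisting of exactly three digits.
printf '123\n1234\nab\n' | grep '^[0-9][0-9][0-9]$'
```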

We can also find lines not starting with any letter between r and z. (The first ^ is an anchor, while the second one negates the set in [].)

^[^r-z]

The quantifier * denotes that the preceding part of the regex can appear several times or not at all. For example, this finds all lines which consist of digits only:

^[0-9]*$

Note that this does not require that all digits are the same.

A dot . matches any single character (except newline). So the following regex matches lines starting with super and ending with ious:

^super.*ious$

When we want to apply the * to a more complex subexpression, we can surround it with (...). The following regex matches bana, banana, bananana, and so on:

ba(na)*na

If we use + instead of *, at least one occurrence is required. So this matches all decimal numbers:

[0-9]+

The vertical bar ("|" a.k.a. the pipe) can separate alternatives. For example, we can match lines composed of Meow and Quork:

^(Meow|Quork)*$

The [abc] construct is therefore just an abbreviation for (a|b|c).

Another useful shortcut is the {N} quantifier: it specifies that the preceding regex is to be repeated N times. We can also use {N,M} for a range. For example, we can match lines which contain 4 to 10 lower-case letters enclosed in quotation marks:

^"[a-z]{4,10}"$

Finally, the backslash character changes whether the next character is considered special. The \. matches a literal dot, \* a literal asterisk. Beware that many regex dialects (including grep without further options) require +, (, |, and { to be escaped to make them recognized as regex operators. (You can run grep -E or egrep to activate extended regular expressions, which have all special characters recognized as operators without backslashes.)
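The following sketch shows the difference between the two dialects:

```shell
# Extended regexes (-E): + is a quantifier, so this matches a run of a's.
printf 'aaa\n' | grep -E 'a+'

# Basic regexes: an unescaped + is an ordinary character, matching a literal "a+".
printf 'a+b\n' | grep 'a+'
```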

grep terminates with a zero exit code only if it matched at least one line.

Therefore, it can be used like this:

if ! echo "$input" | grep 'regex'; then
    echo "Input is not in correct format." >&2
    ...
fi
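When only the exit code matters, the -q option suppresses the output (a small sketch):

```shell
# -q makes grep quiet: nothing is printed, only the exit code is set.
if printf 'hello\n' | grep -q '^h'; then
    echo "input starts with h"
fi
```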

While regular expressions share some similarities with shell wildcards, they are different beasts. Regular expressions are much more powerful and also much more complicated.

Shell uses only wildcard matching (unless you are using Bash extensions).

Text substitution

The full power of regular expressions is unleashed when we use them to substitute patterns. We will show this on sed (a stream editor) which can perform regular expression-based text transformations.

sed and grep use slightly different regex syntaxes. Always check the man page if you are not sure. Generally, the biggest differences across tools/languages are in the handling of the special characters for repetition and grouping ((), {}).

In its simplest form, sed replaces one word by another. The command reads: substitute (s), then a single-character delimiter, then the text to be replaced (the left-hand side of the substitution), the same delimiter again, then the replacement (the right-hand side), and one final occurrence of the delimiter. (The delimiter is typically :, /, or #, but generally it can be any character that does not appear unescaped in the rest of the command.)

sed 's:magna:angam:' lorem.txt

Note that this replaces only the first occurrence on each line. Adding a g modifier (for global) at the end of the command causes it to replace all occurrences:

sed 's:magna:angam:g' lorem.txt

The text to be replaced can be any regular expression, for example:

sed 's:[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]:DATE-REDACTED-OUT:g' lorem.txt

The right-hand side can refer to the text matched by the left-hand side. We can use & for the whole left-hand side or \n for the n-th group (...) in the left-hand side.

The following example transforms the date into the Czech form (DD. MM. YYYY). We have to escape the ( and ) characters to make them act as grouping operators instead of literal ( and ).

sed 's:\([0-9][0-9][0-9][0-9]\)-\([0-9][0-9]\)-\([0-9][0-9]\):\3. \2. \1:g'
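The & shorthand can be illustrated on a made-up example (note the portable [0-9][0-9]* instead of the extended [0-9]+):

```shell
# Wrap every run of digits in brackets; & stands for the whole matched text.
echo 'version 42, build 7' | sed 's:[0-9][0-9]*:[&]:g'
# → version [42], build [7]
```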

Testing with BATS

In this section we will briefly describe BATS – the testing system that we use for automated tests that are run on every push to GitLab.

Generally, automated tests are the only reasonable way to ensure that your software is not slowly rotting and decaying. Good tests will catch regressions, ensure that bugs do not reappear, and often serve as documentation of the expected behavior.

The motto “write tests first” may often seem exaggerated and difficult to follow, but it contains a lot of truth (several reasons are listed for example in this article).

BATS is a system written in shell that targets shell scripts or any programs with a CLI interface. If you are familiar with other testing frameworks (e.g., Python's unittest, pytest, or Nose), you will probably find BATS very similar and easy to use.

Generally, every test case is one shell function and BATS offers several helper functions to structure your tests.

Let us look at the example from BATS homepage:

#!/usr/bin/env bats

@test "addition using bc" {
  result="$(echo 2+2 | bc)"
  [ "$result" -eq 4 ]
}

The @test "addition using bc" construct is a test definition. Internally, BATS translates this into a function (indeed, you can imagine it as running a simple sed script over the input and piping it to sh), and the body is normal shell code.

BATS uses set -e to terminate the code whenever any program terminates with a non-zero exit code. Hence, if [ terminates with a non-zero exit code, the test fails.

Apart from this, there is nothing more to it in its basic form. Even with this basic knowledge, you can start using BATS to test your CLI programs.

Executing the tests is simple: make the file executable and run it. You can choose from several output formats, and with -f you can filter which tests to run. Look at bats --help or here for more details.

Commented example

Let’s write a test for our factorial function from Lab 08 (the function was created as part of one of the examples).

For testing purposes, we will assume that we have our implementation in factorial.sh and we put our tests into test_factorial.bats.

For now, we will have a bad implementation of factorial.sh so that we can see how tests should be structured.

#!/bin/bash

num="$1"
echo $(( num * (num - 1 ) ))

Our first version of test can look like this.

#!/usr/bin/env bats

@test "factorial 2" {
    run ./factorial.sh 2
    test "$output" = "2"
}

@test "factorial 3" {
    run ./factorial.sh 3
    test "$output" = "6"
}

We use the special BATS command run to execute our program; it also captures the program's stdout into a variable named $output.

And then we simply verify the correctness.

Executing the test file will probably print something like this (maybe even in colors).

test_factorial.bats
 ✓ factorial 2
 ✓ factorial 3

2 tests, 0 failures

Let’s add another test case:

@test "factorial 4" {
    run ./factorial.sh 4
    test "$output" = "20"
}

This will fail, but the error message is not very helpful.

test_factorial.bats
 ✓ factorial 2
 ✓ factorial 3
 ✗ factorial 4
   (in test file test_factorial.bats, line 15)
     `test "$output" = "20"' failed

3 tests, 1 failure

This is because BATS is a very thin framework that basically checks only the exit codes and not much more.

But we can improve that.

#!/usr/bin/env bats

check_it() {
    local num="$1"
    local expected="$2"

    run ./factorial.sh "$num"
    test "$output" = "$expected"
}

@test "factorial 2" {
    check_it 2 2
}

@test "factorial 3" {
    check_it 3 6
}

@test "factorial 4" {
    check_it 4 24
}

The error message is not much better but the test is much more readable this way.

Of course, run the above version yourself.

Let’s improve the check_it function a bit more.

check_it() {
    local num="$1"
    local expected="$2"

    run ./factorial.sh "$num"

    if [ "$output" != "$expected" ]; then
        echo "Factorial of $num computed as $output but expecting $expected." >&2
        return 1
    fi
}

Let’s run the test again:

test_factorial.bats
 ✓ factorial 2
 ✓ factorial 3
 ✗ factorial 4
   (from function `check_it' in file test_factorial.bats, line 11,
    in test file test_factorial.bats, line 24)
     `check_it 4 24' failed
   Factorial of 4 computed as 12 but expecting 24.

3 tests, 1 failure

This provides output that is good enough for debugging.

Adding more test cases is now a piece of cake. After this trivial update, our test suite will actually start making sense.

And it will be useful to us.
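For completeness, a minimal correct factorial.sh that would make the suite above pass might look like this (a sketch only; the lab's reference solution may differ):

```shell
#!/bin/bash
# Hypothetical loop-based factorial: prints n! for the number given in $1.

factorial() {
    local n="${1:-0}" result=1 i=2
    while [ "$i" -le "$n" ]; do
        result=$(( result * i ))
        i=$(( i + 1 ))
    done
    echo "$result"
}

factorial "$1"
```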

Better assertions

BATS offers extensions for writing more readable tests.

Thus, instead of calling test directly, we can use assert_equal, which produces a nicer message.

assert_equal "expected-value" "$actual"

NSWI177 tests

Our tests are built on the assert extension plus several helpers of our own. All of them are part of the repository that is downloaded by run_tests.sh in your repositories.

Feel free to execute the *.bats files directly if you want to run just certain tests locally (i.e., not on GitLab).

User accounts

We have already touched on this topic a few times: take this also as a refresher of things you already know.

User accounts on Linux come in two basic types. Normal user accounts are for end-users, i.e., accounts to which you log in via SSH or a graphical interface and under which you work. There are also system accounts that exist solely so that processes can run under different users for better isolation. One usually does not log in under these accounts at all.

Your accounts on linux.ms.mff.cuni.cz are of the first type, and if you run ps -ef --forest you will see what other users are running. System accounts include, for example, chrony or nginx, which are used to run special services of the system.

Each user account has a numerical id (which is how the operating system identifies the user) and a username that is usually mapped via /etc/passwd.
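The mapping can be inspected from the command line (a small sketch; root is used here only because it exists on every system):

```shell
# id -u prints the numerical id of a user; for root it is always 0.
id -u root

# getent queries the account database (usually backed by /etc/passwd);
# the fields are colon-separated: name:password:UID:GID:comment:home:shell.
getent passwd root | cut -d: -f1,3
```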

Among the user accounts on a Linux system, one user has special privileges. This user is called root (or the superuser), has the numerical id 0, and has virtually unlimited power over the running machine. For example, access rights are actually ignored for the root user (i.e., a process running under root ignores the r/w access bits and can read/write any file).

To switch to the superuser account, you can either use sudo (see below) or use su. Often it is executed like this to ensure you start a login shell (among other things this also ensures that $HOME points to /root and not to the home directory of the normal user):

su -

Unlike on some other systems, Linux is designed in such a way that end-user programs are always executed under normal users and never require root privileges. As a matter of fact, some programs (historically, this was a very common behaviour of IRC chat programs) would not even start under root.

root is needed for actions that modify the whole system. This includes a system upgrade, formatting a hard drive, or modifying system-wide configuration files.

The strict separation of normal (work) accounts and the superuser comes from the fact that Linux was designed as a multi-user system. The philosophy dates back 50+ years to a time when a system was shared by many users and only one of them – root – was the administrator of the machine. Today, when a typical notebook installation contains just a single account, the separation is often more artificial, but it still exists.

The truth is that contemporary users are threatened more by a malicious webpage than by an unauthorized system software update. The superuser account was designed to prevent the latter rather than the former. However, the idea of separate user accounts is still valid today, and a careful user can use different accounts for different activities (e.g., browsing social media vs. working with their bank account).

User account management

We consider this an advanced topic, and in this course we will limit ourselves to pointing you to the documentation of the useradd, userdel and usermod commands, which create, delete and modify user accounts respectively.

Recall that there is also getent for retrieving information about existing accounts.

You should be also familiar with passwd that can be used to change the user’s password.

Running passwd modifies the password of the current user. When executed under root, you can specify a username and modify the password of a different user. On a typical installation, that is the easy way to reset a password.

sudo

Some programs need privilege escalation, i.e., to run with higher privileges and wider permissions than other programs. Some need this by design: we already mentioned the set-uid bit on executables, which is used when the application always needs the elevated rights (for any user actually launching the program).

However, some commands require higher privileges only once in a while, so running them as set-uid would broaden the possible attack vectors unnecessarily.

For these situations, one option is sudo (homepage). As the name suggests, it executes (does) one command with superuser privileges. The advantage of sudo is that the system admin can specify who can run which commands with elevated permissions. Thus it does not give the allowed user unlimited power over the machine, but only over a selected subset of commands.

For example, it is possible to give a user the option to restart a specific service (e.g., we want to allow a tester to restart a web server) without giving them control over the whole machine.

Note that the granularity of sudo stops at the level of programs; it does not restrict what the program does internally. For example, it is possible to impose a restriction that alice can execute dangerous_command only with --safe-option. However, if dangerous_command also reads options from ~/.dangerousrc, alice can provide --unsafe-option there and sudo cannot prevent that. In other words, once the initial check is completed, the program runs as if it was launched under root.

This is extremely important on shared machines, where the administrator typically wants to restrict all users as much as possible. On the other hand, on desktop installations the typical default is that the first user created (usually during installation) can sudo anything. The reasoning is that it is the only (physical) user, who knows the root password anyway. This is why most tutorials on the web provide the commands for system maintenance including the sudo prefix.

However, you should always understand why you need to run sudo. Never get into the habit of “if it does not work, let’s try prepending sudo”. Also note that there are multiple options for gaining a root shell (e.g., sudo bash).

As a safe example, you can try executing fdisk -l to list the partitions on your system. When executed without root privileges, it will probably fail with several messages about denied access. Running it with sudo should work.

Note that you enter your own password, not the superuser's (after all, if it were the superuser's password, you would not need sudo, because you could execute su - and get a root shell directly).

sudo fdisk -l

Note that sudo is not the only security mechanism present. We will not discuss other mechanisms in great detail, but to give you pointers to documentation: there is also SELinux or AppArmor and a high-level overview on this Wikipedia page.

User accounts overview: check you understand the basics

Select all true statements. You need to have JavaScript enabled for the quiz to work.

Software installation (a.k.a. package management)

Software on Linux is usually installed by means of a package manager. The package manager is a special program that takes care of installing, upgrading, and removing packages. A package can be anything that can be installed; this includes:

  • a program (for example, package ranger installs the program ranger),
  • data files or configuration (e.g., libreoffice-langpack-cs for Czech support inside LibreOffice),
  • a library (e.g., gmp or gmp-devel providing the GNU arbitrary-precision arithmetic library),
  • or a meta package (e.g., xfce that covers xfce4-terminal, xfwm4-themes etc.).

In this sense, Linux is very similar to the shopping-center-style management of applications that you know from your smartphone. It is very unusual to install software on Linux using a graphical install wizard.

The advantage of using centralized package management is the ability to upgrade the whole system at once without the need to check updates of individual applications.

Individual packages often have dependencies: installing one package results in the transitive installation of other packages that the first one depends on (for example, a web browser will require basic graphical support, etc.). This makes the upgrading process a bit more complicated (for the package manager, though, not for the user). But it can save some disk space. And the most important advantage is that different applications share the same libraries (on Linux, they have the .so extension and are somewhat similar to DLLs on Windows), so it is possible to upgrade a library even for an otherwise abandoned application. That is essential when patching security vulnerabilities.

Note that it is possible to install software manually too. From the file-system point of view, there is no difference – the package manager also just copies files to the right directories. However, manually installed software has to be upgraded manually too and generally complicates the setup. So avoid it when possible.

A typical package manager works with several software repositories. You can think of it as your cell phone having multiple marketplaces to choose applications from. Typically, you will encounter the following types of repositories; it is up to each user (administrator) to decide which to use.

  • Stable and testing, where the latter provides newer versions of software with slight possibility of bugs (usually, there is a third repository, often called unstable, for bleeding-edge software).
  • Free and non-free, where the former contains only software without any legal surprises. Non-free software can be encumbered by patent or royalty issues (usually based on US law), or by a license which restricts use or redistribution.

It is also possible to set up your own repository. This can be useful if you want to distribute your software to multiple machines (and you cannot publish the packages in the normal repositories because it is, for example, proprietary).

Most distributions also offer some kind of user-repository support where virtually anyone can publish their software. For Fedora, this is done via Copr.

Note that both official and unofficial repositories offer no guarantees in the legal sense. However, using the official repositories of a given distribution is considered safe, the number of attacks on software repositories is low, and – unlike many commercial organizations – distribution maintainers are very open in informing about security incidents. It is probably much easier to encounter a malicious application in your smartphone marketplace than in an official repository of a Linux distribution.

dnf (a.k.a. package manager in Fedora)

The package manager for Fedora is called DNF.

Fedora used to have yum as its package manager, and you can still find it in many tutorials on the Internet (even in quite recent ones). It is considered obsolete and you should avoid it.

If you are used to yum from older versions of Fedora or from other RPM-based distributions, you will find dnf very similar and in many situations faster than yum.

If you decided to use a different distribution, you will need to edit the commands to match your system. Generally, the operations would be rather similar but we cannot provide a tutorial for every package manager here.

You can use the search command to get a list of packages which match the given name. Note that searching is not a privileged operation, hence it does not require sudo.

dnf search arduino
dnf search atool

Note that searching for a very generic term can yield hundreds of results.

The output is in the following format:

atool.noarch : A perl script for managing file archives of various types
ratools.x86_64

The .noarch and .x86_64 strings describe the nature of the package: noarch usually refers to a data package or a package written in an interpreted language, while .x86_64 denotes a package with binaries for the x86-64 architecture (e.g., written in C or Rust and compiled to machine code).

To install a software package, run dnf with the install subcommand, giving it the name of the package to install. Here, sudo is needed as we are modifying the system.

sudo dnf install atool

Some applications are not part of any software repository, but you can still download them in a format understandable by your package manager. That is a better situation than installing the files manually, because your package manager knows about the files (although it cannot upgrade the package automatically). One such example is the Zoom client, which has to be installed like this:

sudo dnf install "https://zoom.us/client/latest/zoom_x86_64.rpm"

To upgrade the whole system, simply run the following. DNF will ask for confirmation and then upgrade all available packages.

sudo dnf upgrade

Note that, unlike on some other systems, you can always choose when to upgrade. The system will never reboot the machine for you or display a message about a needed restart, unless you explicitly ask for it.

If you want to install a whole group of packages, you can use dnf grouplist to view the list of groups and sudo dnf install @GROUP_NAME to install one.

The commands above contain the basics for maintaining your Fedora installation with respect to package management. The following links provide more information. The official Wiki page is a good source of information if you already know the system a bit.

For beginners, this guide about DNF and this tutorial are probably a better starting point.

Alternatives to classic package managers

The existence of various package managers has its disadvantages – when using multiple distributions, the user has to know how to operate different package managers. Furthermore, different distributions need to create different packages (compatible with their package managers), which results in more work.

Therefore, an effort has been made to unify package management. Snap was created to install packages uniformly across distributions. While for some users it is a way to get the software they want as quickly as possible, for others the proprietary nature of Snap and the need for an account at the package store present potential dangers and a shift away from the Linux open-source ideology.

To demonstrate a problematic example, let’s attempt to install PyCharm. PyCharm is an IDE for Python, which is (unfortunately) mostly directed at Windows users and also offers a paid professional version. No PyCharm package is offered in Fedora.

This is rather an exception – you won’t encounter problems with most open-source software. Actually, even companies that were traditionally oriented towards different OSes offer DNF-based repositories for their products these days. Note that in this case, providing a full repository is the ideal choice. Users can choose whether to enable this repository or not, distribution maintainers can focus on other tools and the company keeps full control over the release cycle and the distribution process.

There are two options to install PyCharm:

  1. Use Snap.
  2. Use the ad-hoc installation script that is downloaded with PyCharm.

Note that the second option is usually frowned upon in general. It requires running a shell script that the user downloads, which is generally considered dangerous – you should always examine such scripts. (Obviously, using a package manager also involves downloading and running scripts, but the attack surface is a bit smaller.)

Another issue is that any application downloaded in this way will not be automatically updated.

Which one to use

Snap is not the only alternative to the classic package managers.

Among others, there are Flatpak and AppImage. They can co-exist, and it is up to the user to decide which one to choose.

The decision which one to use is influenced by many factors. Generally, using pre-packaged software distributed with your system (distribution) should be preferred.

As a last note: even if the software you want to install does not provide packages for your distribution, you can always create them yourself. The process is out of scope for this course, but it is actually not very difficult.

Package management: check you understand the basics

Select all true statements. You need to have JavaScript enabled for the quiz to work.

Services (and daemons too)

In the context of an operating system, the term service usually refers to a program that runs in the background (typically with no GUI and stdin redirected from /dev/null) and provides some kind of service to other programs.

A typical example is a printing service that takes care of printer discovery and provides end-user applications with a list of printers (i.e., the end-user applications do not need to perform the discovery themselves). Another example is a web server: it provides files over the HTTP protocol to web browsers.

In the world of Unix systems, such programs are often called daemons (the term probably comes from ancient Greek mythology, where a daemon is a being working in the background); traditionally, the names of such programs end with the letter d. For example, the popular Apache web server is actually launched as a program called httpd, and the SSH server runs as sshd.

Daemons operate differently from normal programs. When started, they read their configuration (typically from a file under /etc/), start and listen for requests (imagine a web server listening on port 80). Changing their behavior is usually done by changing their configuration file and restarting them. Because they are started in background, they do not have access to an interactive stdin and the restart (or shutdown) is performed via signals.

Recall that we saw the kill utility earlier for stopping programs. The utility is actually more versatile, as it can also send a signal that the program can intercept and react to (we will see the details later on). One such example is reacting to a special signal that instructs the program to reload its configuration.

Because the need to restart a running daemon is quite common (and sending signals is not very straightforward, as you need to know the PID), there are special programs that find the PID for you and send the right signal. We can call them control scripts; for some services you will find a pair of programs such as serviced (with the actual daemon code) and servicectl (for controlling it).

Unified daemon control

As the principles stated above are essentially the same for all daemons, there usually exists a set of scripts unifying this behavior. So, instead of calling a specific servicectl, the distribution will typically offer a special command with which you can control any daemon. Usually, one will use something along the following lines:

service [start|stop|restart] name-of-daemon

Currently, the most often used program for this task is called systemctl.

About logging

Most services provide so-called logs, in which they record every significant action they perform.

For example, a web server typically logs which pages it served together with information about the client.

Usually, for each service you can specify how detailed the logging shall be. Debugging a configuration issue requires a more detailed level; on a production server, you usually limit the amount of logged information to a minimum for performance reasons.

Systemd

Systemd is one of the most widely used system service management tools in today’s Linux world.

We will not go into detail and just review the two most important commands: systemctl and journalctl.

Notice that systemd is a daemon, while systemctl is a command for controlling this daemon.

Starting and stopping a service

Starting a service with systemd is very simple. The following command starts sshd, the SSH server:

sudo systemctl start sshd

If the service was already running, nothing happens.

Check that you can now connect to your machine via the following command:

ssh your-login@localhost

To check the state of the service, the status subcommand is used (note that status can be run without sudo, but may display less information):

sudo systemctl status sshd
● sshd.service - OpenSSH Daemon
     Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: disabled)
     Active: active (running) since Mon 2021-03-01 14:31:40 CET; 2 months 3 days ago
   Main PID: 560 (sshd)
      Tasks: 1 (limit: 9230)
     Memory: 900.0K
        CPU: 16ms
     CGroup: /system.slice/sshd.service
             └─560 sshd: /usr/bin/sshd -D [listener] 0 of 10-100 startups

Warning: journal has been rotated since unit was started, output may be incomplete.

We see that the service is running; most items should be self-explanatory. The /usr/lib/systemd/system/sshd.service file contains the configuration of the service itself (e.g., how to start/stop/restart it), not the actual configuration of the SSH daemon, which resides in /etc/ssh.

It is safer to stop the SSH daemon on your laptop if you are not going to use it:

sudo systemctl stop sshd

Enabling and disabling a service

If you wish to start the service with each boot, you can enable the service:

sudo systemctl enable sshd

Systemd will take care of the proper ordering of the individual services (so that the SSH server is started only after the network is initialized, etc.).

If you no longer wish to have the SSH daemon started by default, call the command with disable instead.

Note that neither enable nor disable changes the current state of the service: you still need to start/stop it if you do not want to wait for a reboot. (For convenience, there is systemctl enable --now sshd, which also starts the service.)

Logs

Most system services keep logs of their work. The logs are usually stored under /var/log/. Some services produce logs on their own. Such logs are simple textual files, but their format is specific to the individual services and their configuration.

Many services use a central logging service, which keeps all its logs in a unified format and which can be configured for sorting logs, sending them over the network, removing old records, and so on.

On Fedora, the logging service is called journald. It keeps the log files in cryptographically signed binary files, which are not directly readable. But you can read the logs using the journalctl command.

For example, the following command shows logs for the SSH daemon:

journalctl -u sshd

More …

If you are interested in this topic, please consult the relevant manual pages. Take these few paragraphs as a very brief introduction that allows you to perform basic management of your system.

Tasks to check your understanding

We expect you will solve the following tasks before attending the labs so that we can discuss your solutions during the lab.

Find all lines in /etc/passwd that contain the digit 9.

Accounts with /sbin/nologin in /etc/passwd are generally system accounts not used by a human user. Print the list of these accounts.

Solution.
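
One possible approach (among several correct ones); the sample passwd lines below are made up for illustration, on a real system run the pipeline against /etc/passwd:

```shell
# The trailing $ anchors the match to the end of the line;
# cut then extracts the first colon-separated field (the login name).
printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin\n' \
    | grep '/sbin/nologin$' | cut -d: -f1
```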

Find all lines in /etc/passwd that start with any of the letters A, B, C or D (case-insensitive).

Solution.
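
A sketch of one possible answer, shown on made-up sample lines (use /etc/passwd for the real task):

```shell
# -i makes the match case-insensitive, ^ anchors it to the start of the line.
printf 'adm:x:3:4\nroot:x:0:0\ndaemon:x:2:2\nBob:x:1000:1000\n' \
    | grep -i '^[abcd]'
```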

Find all lines which contain an even number of characters.

Solution.
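
One way to express "even length" as a regex is to match pairs of characters; a small demonstration on made-up input:

```shell
# ^(..)*$ matches zero or more pairs of characters,
# i.e. a line of even length (the empty line counts too).
printf 'ab\nabc\nabcd\n' | grep -E '^(..)*$'
```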

Find all e-mail addresses. Assume that a valid e-mail address has a format <s1>@<s2>.<s3>, where each sequence <sN> is a non-empty string of characters from English alphabet and sequences <s1> and <s2> may also contain digits or a dot ..

Solution.
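
One possible extended regular expression following the definition above (the sample address is made up; add -o if you want to print only the matched addresses instead of whole lines):

```shell
# Letters, digits and dots before the @ and before the last dot;
# letters only after the last dot.
printf 'write to jan.novak@mff.cuni.cz please\nno address here\n' \
    | grep -E '[A-Za-z0-9.]+@[A-Za-z0-9.]+\.[A-Za-z]+'
```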

Print all lines containing a word (in English alphabet) which begins with capital letter and all other letters are lowercase. Test that the word TeX will not be matched.

Solution.
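
A possible solution relying on word-boundary anchors (a GNU grep extension); the sample lines are made up:

```shell
# \< and \> require word boundaries; in TeX the capital X follows the
# lowercase run without a boundary, so the word is not matched.
printf 'TeX typesetting\nHello world\n' | grep -E '\<[A-Z][a-z]*\>'
```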

Remove all trailing spaces and tabulators.

Solution.
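
A possible sed one-liner, assuming GNU sed (which understands \t inside a bracket expression); the input is a made-up sample:

```shell
# [ \t]*$ matches any run of spaces and tabs at the end of the line.
printf 'text   \nmore\t\n' | sed 's/[ \t]*$//'
```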

Put every word (non-empty sequence of characters of the English alphabet) in parentheses.

Solution.
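
One way to do it with sed; & in the replacement stands for the whole matched text:

```shell
# Every maximal run of English letters becomes (run).
printf 'one two3 four\n' | sed -E 's/[A-Za-z]+/(&)/g'
```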

Replace “Name Surname” by “Surname, N.”.

Solution.
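
A sketch using capture groups (assuming exactly one name-surname pair per line; the sample name is made up):

```shell
# \1 captures the initial of the first name, \2 the whole surname.
printf 'John Smith\n' | sed -E 's/([A-Z])[a-z]+ ([A-Z][a-z]+)/\2, \1./'
```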

Delete all empty lines. Hint.

Solution.
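
One possible answer among several (grep . would work just as well):

```shell
# The address /^$/ selects empty lines, the d command deletes them.
printf 'one\n\ntwo\n\n\nthree\n' | sed '/^$/d'
```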

Reformat input to contain each sentence on a separate line. Assume that each sentence begins with a capital English letter and ends with ., !, or ?; there may be any number of spaces between sentences. Hint.

Solution.
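
A possible sketch, assuming GNU sed (which allows \n in the replacement); the sample text is made up:

```shell
# Break the line after ., ! or ? whenever spaces and a capital letter follow.
printf 'First one. Second one!  Third?\n' \
    | sed -E 's/([.!?]) +([A-Z])/\1\n\2/g'
```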

Write a filter for the output of ip addr that prints device name followed by its IPv4 address and network prefix length.

For an interface which has no IPv4 address assigned, print a special address 0.0.0.0/0 instead.

The example from Lab 08 would be processed into the following output:

lo 127.0.0.1/8
enp0s31f6 0.0.0.0/0
wlp58s0 192.168.0.105/24
vboxnet0 0.0.0.0/0

The case for a missing IP address will probably complicate the control flow of your script a lot. Start with a version that assumes all interfaces have the address assigned.
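
As a starting sketch for the simplified version, one option (an assumption, not the required approach) is awk; the fragment of ip addr output below is a shortened, made-up sample, for the real task pipe ip addr into the same program:

```shell
# Print "device address/prefix" for interfaces that do have an IPv4
# address; the 0.0.0.0/0 fallback is deliberately left out.
printf '1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536\n    inet 127.0.0.1/8 scope host lo\n' \
    | awk '
        /^[0-9]+:/   { iface = $2; sub(/:$/, "", iface) }  # remember device name
        $1 == "inet" { print iface, $2 }                   # IPv4 address line
    '
```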

This example can be checked via GitLab automated tests. Store your solution as 09/netcfg.sh and commit it (push it) to GitLab.

Write a script that normalizes a given path.

The script will accept a single argument: the path to normalize. You can safely assume that the argument will always be provided.

The script will normalize the provided path in the following way:

  • references to the current directory ./ will be removed, as they are redundant
  • references to the parent directory .. will be removed (possibly repeatedly) in such a way that the actual meaning of the path does not change
  • the script will not convert a relative path to an absolute one or vice versa
  • the script will not check whether the file actually exists

The following examples illustrate the expected behaviour.

  • /etc/passwd → /etc/passwd
  • a/b/././c/d → a/b/c/d
  • /a/b/../c → /a/c
  • /usr/../etc/ → /etc/

You can assume that components of the path will not contain newlines or other special characters such as :, ", ' or any kind of escape sequences.

Hint: sed ':x; s/abb/ba/; tx' makes sed apply s/abb/ba/ repeatedly for as long as the substitution succeeds (:x defines a label, while tx is a conditional jump to that label, taken if the previous substitution changed the input). Try it with echo 'abbbb' | sed ....
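
Trying it out shows how the loop behaves:

```shell
# abbbb -> babb -> bba; the loop stops once s/abb/ba/ no longer matches.
printf 'abbbb\n' | sed ':x; s/abb/ba/; tx'
```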

The point of these exercises is to check your regex skills, not the use of realpath or anything similar.

This example can be checked via GitLab automated tests. Store your solution as 09/normalize.sh and commit it (push it) to GitLab.

The purpose of this task is that you practice writing tests. You are supposed to write tests for a factorial implementation in shell.

The factorial will be computed by a factorial function that takes one argument and prints the computed factorial. As a well-behaved shell function, it will return a non-zero exit code on error.

Your tests should be written against the specification above.

Our tests will inject various bugs into the implementation and we expect that your tests will catch these bugs. However, you provide only one battery of tests, which we execute against the different implementations.

Technically, your tests must source factorial.sh from the current directory, which will contain the actual implementation of the factorial function in shell.

Thus factorial.sh can contain the following as a starting point (your tests should detect a lot of issues in this one).

factorial() {
    local n="$1"
    echo $(( n * ( n - 1 ) ))
}

The following is a good starting point for this task.

#!/usr/bin/env bats

source "factorial.sh"

check_it() {
    local num="$1"
    local expected="$2"

    run factorial "$num"
    if [ "$status" -ne 0 ]; then
        echo "Function not terminated with zero exit code." >&2
        false
    fi
    if [ "$output" != "$expected" ]; then
        echo "Wrong output for $num: got '$output', expecting '$expected'." >&2
        false
    fi
}

@test "factorial 2" {
    check_it 2 2
}

@test "factorial 3" {
    check_it 3 6
}

@test "factorial 4" {
    check_it 4 24
}

Our tests will not check for exact error messages but will check that some tests of the suite fail (using the BATS return code).

Look at the test implementation to better understand what we are after.

Do not test for factorial of numbers greater than 10.

Note that the automated tests are rather crude: adding the following test would actually allow you to pass most of them without any effort. However, that is not the purpose of this task.

@test "gaming the tests" {
    false
}

This example can be checked via GitLab automated tests. Store your solution as 09/factorial.bats and commit it (push it) to GitLab.

Rewrite the netcfg.sh task into Python to learn how regular expressions are used in Python.

This example can be checked via GitLab automated tests. Store your solution as 09/netcfg.py and commit it (push it) to GitLab.

Learning outcomes

Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).

Conceptual knowledge

Conceptual knowledge is about understanding the meaning of given terms and putting them into context. Therefore, you should be able to …

  • explain how and why software is distributed in the form of packages

  • explain what a regular expression (regex) is

  • explain the difference between the root account and other accounts

  • explain why doing non-administrative tasks with the root account is generally discouraged

  • explain in broad terms how sudo can be used for system administration

  • understand the dangers of using sudo

  • explain what a service (daemon) is

  • explain the life cycle and possible states of a service

  • explain what a program log is and how it can be managed

  • explain the advantages of using automated functional tests

Practical skills

Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …

  • create and use simple regular expressions to filter text with grep

  • perform pattern substitution using sed

  • use getent to retrieve information about user accounts

  • use sudo to elevate privileges of an executed program

  • use a package manager to install or uninstall packages

  • use a package manager to perform a system update

  • use systemctl to start and stop services

  • use systemctl to ensure a service is automatically started at machine boot

  • optional: use journalctl to view service logs

  • optional: use useradd to create a new user account

  • execute BATS tests

  • understand the basic structure of BATS tests

  • optional: create simple BATS tests