Please, see latest news in issue #112 (from April 15).
There are several unrelated topics in this lab. There is no running example and the topics can be read and tried in any order.
Preflight checklist
- You understand how automated (unit) tests are designed and used.
xargs (and parallel) utilities
xargs in its simplest form reads standard input and converts it to program arguments for a user-specified program.
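As a minimal sketch of that behavior (the file names are made up and echo is used so that nothing is actually removed), three lines on standard input end up as three arguments of a single command:

```shell
# Three lines on stdin become three arguments of one command.
# echo stands in front of rm so the example only prints the command line.
command_line="$(printf '%s\n' a.txt b.txt c.txt | xargs echo rm)"
echo "$command_line"
# prints: rm a.txt b.txt c.txt
```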
Assume we have the following files in a directory:
2025-04-16.txt 2025-04-24.txt 2025-05-02.txt 2025-05-10.txt
2025-04-17.txt 2025-04-25.txt 2025-05-03.txt 2025-05-11.txt
2025-04-18.txt 2025-04-26.txt 2025-05-04.txt 2025-05-12.txt
2025-04-19.txt 2025-04-27.txt 2025-05-05.txt 2025-05-13.txt
2025-04-20.txt 2025-04-28.txt 2025-05-06.txt 2025-05-14.txt
2025-04-21.txt 2025-04-29.txt 2025-05-07.txt 2025-05-15.txt
2025-04-22.txt 2025-04-30.txt 2025-05-08.txt
2025-04-23.txt 2025-05-01.txt 2025-05-09.txt
As a mini-task, write a shell one-liner to create these files.
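One possible way to create the files, as a sketch only: it assumes GNU date (the same -d option is used later in this lab) and puts the files into a scratch directory created by mktemp, which is our choice and not part of the task. The -u switch keeps the date arithmetic in UTC so that daylight saving changes cannot interfere.

```shell
# Scratch directory so that we do not litter the current one (an assumption).
scratch_dir="$(mktemp -d)"

# The one-liner itself: 30 consecutive days starting on 2025-04-16.
for offset in $(seq 0 29); do touch "$scratch_dir/$(date -u -d "2025-04-16 + $offset days" '+%Y-%m-%d').txt"; done

ls "$scratch_dir"
```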
Our task is to remove files that are older than 20 days. In this version, we only echo the command so that we do not need to recreate the files when debugging our solution.
cutoff_date="$( date -d "20 days ago" '+%Y%m%d' )"
for filename in 202[0-9]-[01][0-9]-[0-3][0-9].txt; do
    date_num="$( basename "$filename" .txt | tr -d '-' )"
    if [ "$date_num" -lt "$cutoff_date" ]; then
        echo rm "$filename"
    fi
done
This means that the program rm would be called several times, always removing just one file.
The overhead of starting a new process could become a serious bottleneck for larger scripts (think about thousands of files, for example).
It would be much better if we called rm just once, giving it a list of files to remove (i.e., as multiple arguments).
xargs is the solution here. Let’s modify the program a little bit:
cutoff_date="$( date -d "20 days ago" '+%Y%m%d' )"
for filename in 202[0-9]-[01][0-9]-[0-3][0-9].txt; do
    date_num="$( basename "$filename" .txt | tr -d '-' )"
    if [ "$date_num" -lt "$cutoff_date" ]; then
        echo "$filename"
    fi
done | xargs echo rm
Instead of removing the file right away, we just print its name and pipe the output of the whole loop to xargs, where any normal arguments refer to the program to be launched.
Instead of many lines with rm ... we will see just one long line with a single invocation of rm.
Another situation where xargs can come in handy is when you are building a complex command line or when using command substitution ($( ... )) would make the script unreadable.
Of course, tricky filenames can still cause issues as xargs assumes that arguments are delimited by whitespace.
(Note that in the example above we were safe because the filenames were reasonable.)
The delimiter can be changed with --delimiter.
If you are piping input to xargs from your program, consider delimiting items with a zero byte (i.e., the C string terminator, \0).
Recall what you have heard about C strings – and how they are terminated – in your Arduino course.
That is the safest option as this character cannot appear anywhere inside any argument.
Then tell xargs about it via -0 or --null.
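The difference between the two delimiters can be seen in a small experiment. This is only a sketch: the directory and file names are invented, and it relies on find supporting -print0, which we assume here (GNU and BSD find both do).

```shell
# Hypothetical demo directory with one tricky filename (it contains a space).
demo_dir="$(mktemp -d)"
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt"

# Whitespace-delimited input: the tricky name is split into two arguments,
# so xargs sees three arguments in total (-n 1 runs echo once per argument).
count_whitespace="$(find "$demo_dir" -name '*.txt' | xargs -n 1 echo | wc -l)"

# Zero-delimited input: every filename stays a single argument.
count_zero="$(find "$demo_dir" -name '*.txt' -print0 | xargs -0 -n 1 echo | wc -l)"

echo "whitespace: $count_whitespace arguments, zero byte: $count_zero arguments"
rm -r "$demo_dir"
```

With two files in the directory, only the zero-delimited variant should report two arguments.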
Note that xargs is smart enough to realize when the command line would be too long and splits it automatically (see the manual for details).
It is also good to remember that xargs can execute the command in parallel (i.e., split the stdin into multiple chunks and call the program multiple times with different chunks) via -P.
If your shell scripts are getting slow but you have plenty of CPU power, this may speed things up quite a lot for you.
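A rough way to observe the effect of -P is to time a few artificial jobs. In the sketch below, sleep merely stands in for real work; -n 1 passes one argument per invocation and -P 4 allows up to four invocations at once.

```shell
start="$(date +%s)"

# Four one-second jobs; run concurrently they should finish in about
# one second instead of four.
printf '%s\n' 1 1 1 1 | xargs -n 1 -P 4 sleep

end="$(date +%s)"
echo "Elapsed: $(( end - start )) s"
```

Dropping -P 4 from the command should bring the elapsed time back to roughly four seconds.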
parallel
This program can be used to execute multiple commands in parallel, hence speeding up the execution.
parallel behaves almost exactly as xargs but has much better support for concurrent execution of individual jobs (not mixing their output, execution on a remote machine, etc.).
The differences are rather well described in the parallel documentation.
Please also refer to parallel_tutorial(1) (yes, that is a man page) and to parallel(1) for more details.
Storage management II
We will continue here what we started in lab 11.
Advanced disk mounting
Mounting disks is not limited to physical drives only. We will talk about disk images in the next section but there are other options, too. It is possible to mount a network drive (e.g., NFS or AFS used in MFF labs) or even create a network block device and then mount it.
Working with disk images
Linux has built-in support for working with disk images. That is, with files with content mirroring a real disk drive. As a matter of fact, you probably already worked with them when you set up Linux in a virtual machine or when you downloaded the USB disk image at the beginning of the semester.
Linux allows you to mount such an image as if it were a real physical drive and modify the files on it. That is essential for the following areas:
- Developing and debugging file systems (rare)
- Extracting files from virtual machine hard drives
- Recovering data from damaged drives (rare, but priceless)
In all cases, to mount the disk image we need to tell the system to access the file in the same way as it accesses other block devices (recall /dev/sda1 from the example above).
Mounting disk images
Disk images can be mounted in almost the same way as block devices; you only have to add the -o loop option to mount.
Recall that mount requires root (sudo) privileges, hence you need to execute the following example on your own machine, not on any of the shared ones.
To try that, you can download this FAT image and mount it.
sudo mkdir /mnt/photos-fat
sudo mount -o loop photos.fat.img /mnt/photos-fat
... (work with files in /mnt/photos-fat)
sudo umount /mnt/photos-fat
Alternatively, you can run udisksctl loop-setup to add the disk image as a removable drive that could be automatically mounted in your desktop:
# Using udisksctl and auto-mounting in GUI
udisksctl loop-setup -f fat.img
# This will probably print /dev/loop0 but it can have a different number
# Now mount it in GUI (might happen completely automatically)
... (work with files in /run/media/$(whoami)/07C5-2DF8/)
udisksctl loop-delete -b /dev/loop0
Repairing corrupted disks
If you cannot mount a disk but you can copy its content (usually, one would use the dd(1) utility but cat /dev/sdX >image.raw works as well), you can try to repair it yourself.
Inspecting and modifying volumes (partitions)
We will leave this topic to a more advanced course. If you wish to learn by yourself, you can start with the following utilities:
fdisk(8)
btrfs(8)
mdadm(8)
lvm(8)
Testing with BATS
In this section we will briefly describe BATS – the testing system that we use for automated tests that are run on every push to GitLab.
Generally, automated tests are the only reasonable way to ensure that your software is not slowly rotting and decaying. Good tests will capture regressions, ensure that bugs are not reappearing and often serve as documentation of the expected behavior.
The motto write tests first may often seem exaggerated and difficult, but it contains a lot of truth (several reasons are listed for example in this article).
BATS is a system written in shell that targets shell scripts or any programs with CLI interface. If you are familiar with other testing frameworks (e.g., Pythonic unittest, pytest or Nose), you will find BATS probably very similar and easy to use.
Generally, every test case is one shell function and BATS offers several helper functions to structure your tests.
Let us look at the example from BATS homepage:
#!/usr/bin/env bats
@test "addition using bc" {
    result="$(echo 2+2 | bc)"
    [ "$result" -eq 4 ]
}
The @test "addition using bc" is a test definition. Internally, BATS translates this into a function (indeed, you can imagine it as running a simple sed script over the input and piping it to sh) and the body is normal shell code.
BATS uses set -e to terminate the code whenever any program terminates with a non-zero exit code.
Hence, if [ terminates with non-zero, the test fails.
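The effect of set -e on a failing check can be tried without BATS at all. The following is only an illustration of the principle and not how BATS is implemented internally: a separate sh process runs a body whose [ check fails, so the echo after it is never reached and the process ends with a non-zero exit code.

```shell
status=0

# The body runs in its own sh process with set -e enabled.
# The check 2 + 2 = 5 fails, therefore sh exits before reaching echo.
output="$(sh -c 'set -e; [ "$(( 2 + 2 ))" -eq 5 ]; echo "never printed"')" || status=$?

echo "exit code: $status, output: '$output'"
```

A test framework then only needs to look at the exit code to decide whether the test passed.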
Apart from this, there is nothing more about it in its basic form. Even with this basic knowledge, you can start using BATS to test your CLI programs.
Executing the tests is simple – make the file executable and run it.
You can choose from several output formats and with -f you can filter which tests to run.
Look at bats --help or here for more details.
Commented example
Let’s write a test for our factorial function from Lab 09 (the function was created as part of one of the examples).
For testing purposes, we will assume that we have our implementation in factorial.sh and we put our tests into test_factorial.bats.
For now, we will have a bad implementation of factorial.sh so that we can see how tests should be structured.
#!/bin/bash
num="$1"
echo $(( num * (num - 1 ) ))
Our first version of test can look like this.
#!/usr/bin/env bats
@test "factorial 2" {
    run ./factorial.sh 2
    test "$output" = "2"
}
@test "factorial 3" {
    run ./factorial.sh 3
    test "$output" = "6"
}
We use a special BATS command run to execute our program that also captures its stdout into a variable named $output.
And then we simply verify the correctness.
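To get a feeling for what such a helper does, here is a simplified imitation in plain shell. The name my_run is made up and the function is not the BATS implementation; it only captures stdout into output and, as the real run does as well, stores the exit code in status.

```shell
# my_run is a hypothetical stand-in for the BATS run helper.
my_run() {
    # Capture stdout of the given command and remember its exit code.
    output="$("$@")" && status=0 || status=$?
}

my_run printf 'hello\n'
echo "status=$status output=$output"

my_run sh -c 'echo partial; exit 3'
echo "status=$status output=$output"
```

Because the helper itself never fails, the test body can inspect both the output and the exit code afterwards, even for programs that are expected to end with an error.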
Executing the command will probably print something like this (maybe even in colors).
test_factorial.bats
✓ factorial 2
✓ factorial 3
2 tests, 0 failures
Let’s add another test case:
@test "factorial 4" {
    run ./factorial.sh 4
    test "$output" = "24"
}
This will fail, but the error message is not very helpful.
test_factorial.bats
✓ factorial 2
✓ factorial 3
✗ factorial 4
(in test file test_factorial.bats, line 15)
`test "$output" = "24"' failed
3 tests, 1 failure
This is because BATS is a very thin framework that basically checks only the exit codes and not much more.
But we can improve that.
#!/usr/bin/env bats
check_it() {
    local num="$1"
    local expected="$2"
    run ./factorial.sh "$num"
    test "$output" = "$expected"
}
@test "factorial 2" {
    check_it 2 2
}
@test "factorial 3" {
    check_it 3 6
}
@test "factorial 4" {
    check_it 4 24
}
The error message is not much better but the test is much more readable this way.
Of course, run the above version yourself.
Let’s improve the check_it function a bit more.
check_it() {
    local num="$1"
    local expected="$2"
    run ./factorial.sh "$num"
    if [ "$output" != "$expected" ]; then
        echo "Factorial of $num computed as $output but expecting $expected." >&2
        return 1
    fi
}
Let’s run the test again:
test_factorial.bats
✓ factorial 2
✓ factorial 3
✗ factorial 4
(from function `check_it' in file test_factorial.bats, line 11,
in test file test_factorial.bats, line 24)
`check_it 4 24' failed
Factorial of 4 computed as 12 but expecting 24.
3 tests, 1 failure
This provides output that is good enough for debugging.
Adding more test cases is now a piece of cake. After this trivial update, our test suite will actually start making sense.
And it will be useful to us.
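The test for 4 fails because our factorial.sh is intentionally wrong. One possible corrected implementation is sketched below as a function; this is our own iterative version and not necessarily the one from Lab 09. In factorial.sh, the function would simply be called with "$1".

```shell
# Iterative factorial; assumes a non-negative integer argument.
factorial() {
    local num="$1"
    local result=1
    local i=2
    while [ "$i" -le "$num" ]; do
        result=$(( result * i ))
        i=$(( i + 1 ))
    done
    echo "$result"
}

factorial 4
# prints: 24
```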
Better assertions
BATS offers extensions for writing more readable tests.
Thus, instead of calling test directly, we can use assert_equal that produces a nicer message.
assert_equal "$actual" "expected-value"
NSWI177 tests
Our tests are packed with the assert extension plus several of our own.
All of them are part of the repository that is downloaded by run_tests.sh in your repositories.
Feel free to execute the *.bats file directly if you want to run only certain tests locally (i.e., not on GitLab).
Tasks to check your understanding
We expect you will solve the following tasks before attending the labs so that we can discuss your solutions during the lab.
Learning outcomes and after class checklist
This section offers a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).
Conceptual knowledge
Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …
- use xargs program
- explain advantages of using automated functional tests
- explain what is a disk image
- explain why no special tools are required for working with disk images
Practical skills
Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …
- execute BATS tests
- understand basic structure of BATS tests
- optional: use lsblk to view available block (storage) devices
- optional: create simple BATS tests
- optional: fix corrupted file systems using the family of fsck programs
- optional: use PhotoRec to restore files from a broken file system