In this lab we will look at two big topics. First we will look at utilities related to storage management and then we will explore how to develop Python projects in a sandboxed environment that is easily distributed among individual developers in a big software team. In between, we will squeeze in a quick note about file compression (and archival) on Linux systems.
Storage management
Before proceeding, recall that files reside on file systems, which are the structures stored on the actual block devices (typically, disks).
Working with file systems and block devices is necessary when installing a new system, rescuing data from a broken device, or simply checking available free space.
You are already familiar with normal files and directories. But there are other types of files that you can find on a Linux system.
Symbolic links
Linux allows you to create a symbolic link to another file. This special file does not contain any content by itself and merely points to another file.
An interesting feature of a symbolic link is that it is transparent to the standard file I/O API. If you call the Pythonic open on a symbolic link, it will transparently open the file the symbolic link points to. That is the intended behavior.
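As a quick illustration (the paths below are only an example), a link created with ln -s behaves like the file it points to:

echo "hello" > /tmp/original.txt
ln -s /tmp/original.txt /tmp/link.txt
cat /tmp/link.txt    # prints "hello" – the link is followed transparently
ls -l /tmp/link.txt  # ls shows where the link points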
The purpose of symbolic links is to allow different perspectives on the same files without need for any copying and synchronization.
For example, a movie player is able to play only files in the directory Videos. However, you actually have the movies elsewhere, because they are on a shared hard drive. With the use of a symbolic link, you can make Videos a symbolic link to the actual storage and make the player happy.
(For the record, we do not know about any movie player with such behaviour,
but there are plenty of other programs where such magic can make them work
in a complex environment they were not originally designed for.)
Note that a symbolic link is something different from what you may know as a desktop shortcut or similar. Such shortcuts are actually normal files where you can specify which icon to use; they also contain information about the actual file. Symbolic links operate on a lower level.
Special files
There are also other special files that represent physical devices or files that serve as a spy-hole into the state of the system.
The reason is that it is much simpler for the developer that way. You do not need special utilities to work with a disk, you do not need a special program to read the amount of free memory. You simply read the contents of a well-known file and you have the data.
It is also much easier to test such programs because you can easily give them mock files by changing the file paths – a change that is unlikely to introduce a serious bug into the program.
Usually, Linux offers the files that reveal the state of the system in a textual format.
For example, the file /proc/meminfo
can look like this:
MemTotal: 7899128 kB
MemFree: 643052 kB
MemAvailable: 1441284 kB
Buffers: 140256 kB
Cached: 1868300 kB
SwapCached: 0 kB
Active: 509472 kB
Inactive: 5342572 kB
Active(anon): 5136 kB
Inactive(anon): 5015996 kB
Active(file): 504336 kB
Inactive(file): 326576 kB
...
This file is nowhere on the disk but when you open this path, Linux creates the contents on the fly.
Notice how the information is structured: it is a textual file, so reading it requires no special tools and the content is easily understood by a human. On the other hand, the structure is quite rigid: each line is a single record, keys and values are separated by a colon. Easy for machine parsing as well.
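For example, a one-liner such as the following (just a sketch) extracts a single record from the file:

grep '^MemAvailable:' /proc/meminfo
# or, keeping only the value and its unit
awk '/^MemAvailable:/ { print $2, $3 }' /proc/meminfo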
File system hierarchy
We will now briefly list some of the key files you can find on virtually any Linux machine.
Do not be afraid to actually display contents of the files we mention here.
hexdump -C
is really a great tool.
/boot
contains the bootloader and the kernel needed to load the operating system.
You would rarely touch this directory once the system is installed.
/dev
is a very special directory where hardware devices have their
file counterparts.
You will probably see there a file sda
or nvme0
that represents your
hard (or SSD) drive.
Unless you are running as the superuser, you will not have access to these files,
but if you hexdump them, you will see the bytes exactly as they are stored
on the actual drive.
It is important to note that these files are not physical files on your disk (after all, it would mean having a disk inside a disk). When you read from them, the kernel recognizes that and returns the right data.
This directory also contains several special but very useful files for software development.
/dev/urandom
returns random bytes indefinitely.
It is probably internally used inside your favorite programming language
to implement its random()
function.
Try to run hexdump
on this file (and recall that <Ctrl>-C
will
terminate the program once you are tired of the randomness).
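If you prefer a command that terminates on its own, you can read just a few bytes (the byte count below is arbitrary):

head -c 32 /dev/urandom | hexdump -C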
/etc/
contains system-wide configuration.
Typically, most programs in UNIX systems are configured via text files.
The reasoning is that an administrator needs to learn only one tool – a good
text editor – for system management.
Another advantage is that most configuration files support comments, so the
configuration itself can be annotated in place.
For an example of such a configuration file, you can have a look at
/etc/systemd/system.conf
to get the feeling.
Perhaps the most important file is /etc/passwd
that contains a list of user
accounts.
Note that it is a plain text file where each row represents one record and
individual attributes are simply separated by a colon :
.
Very simple to read, very simple to edit, and very simple to understand.
In other words, the KISS principle in practice.
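For example, the following commands (a small sketch) show your own record and a quick field extraction:

getent passwd "$(whoami)"    # print the record of the current user
cut -d: -f1,7 /etc/passwd    # print login names and shells only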
/home
contains home directories for normal user accounts (i.e., accounts
for real – human – users).
/lib
and /usr
contain dynamic libraries, applications, and system-wide
data files.
/var
is for variable data, i.e., files that programs modify as they run. If you installed
a database or a web server on your machine, its files would be stored here.
/tmp
is a generic location for temporary files.
This directory is automatically cleaned at each reboot, so do not use it for permanent
storage. Many systems also automatically remove files which were not modified in the
last few days.
/proc
is a virtual file system that allows controlling and reading of
kernel (operating system) settings.
For example, the file /proc/meminfo
contains quite detailed information about
RAM usage.
Again, /proc/*
are not normal files, but virtual ones.
Until you read them, their contents do not exist physically anywhere.
Mounts and mount-points
Each file system (that we want to access) is accessible as a directory somewhere (compare this with drive letters on other systems, for example).
When we can access /dev/sda3 under /home, we say that /dev/sda3 is mounted under /home; /home is then called the mount point and /dev/sda3 is often called a volume.
Most devices are mounted automatically during boot.
This includes /
(root) where the system is as well as /home
where your
data reside.
File systems under /dev
or /proc
are actually special file systems that are
mounted to these locations.
Hence, the file /proc/uptime
does not physically exist (i.e., there is no
disk block with its content anywhere on your hard drive) at all.
The file systems that are mounted during boot are listed in /etc/fstab
.
You will rarely need to change this file on your laptop and this file was
created for you during installation.
Note that it contains volume identification (such as path to the partition),
the mount point and some extra options.
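An /etc/fstab entry is a single line with the volume, the mount point, the file system type, mount options, and two bookkeeping numbers. The entries below are purely illustrative (the devices, UUIDs and options on your system will differ):

# <volume>       <mount point>  <type>  <options>  <dump>  <pass>
UUID=1234-ABCD   /boot          vfat    defaults   0       2
/dev/sda3        /              ext4    defaults   1       1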
When you plug-in a removable USB drive, your desktop environment will typically
mount it automatically.
Mounting it manually is also possible using the mount
utility.
However, mount has to be run as root to work
(this thread explains several reasons
why mounting a volume could be a security risk).
Therefore, you need to play with this on your own installation where you can become
root.
It will not work on any of the shared machines.
Mounting disks is not limited to physical drives only. We will talk about disk images in the next section but there are other options, too. It is possible to mount a network drive (e.g., NFS or AFS used in MFF labs) or even create a network block device and then mount it.
Working with disk images
Linux has built-in support for working with disk images. That is, with files with content mirroring a real disk drive. As a matter of fact, you probably already worked with them when you set up Linux in a virtual machine or when you downloaded the USB disk image at the beginning of the semester.
Linux allows you to mount such an image as if it were a real physical drive and modify the files on it. That is essential for the following areas:
- Developing and debugging file systems (rare)
- Extracting files from virtual machine hard drives
- Recovering data from damaged drives (rare, but priceless)
In all cases, to mount the disk image we need to tell the system to
access the file in the same way as it accesses other block devices
(recall /dev/sda1
from the example above).
Mounting disks manually
sudo mkdir /mnt/flash
sudo mount /dev/sdb1 /mnt/flash
Your data shall be visible under /mnt/flash
.
To unmount, run the following command:
sudo umount /mnt/flash
Note that running mount
without any arguments prints a list of currently active mounts.
For this, root privileges are not required.
Mounting disk images
Disk images can be mounted in almost the same way as block devices;
you only have to add the -o loop option to mount.
Recall that mount
requires root (sudo
) privileges hence you need to execute
the following example on your own machine, not on any of the shared ones.
To try that, you can download this FAT image and mount it.
sudo mkdir /mnt/photos-fat
sudo mount -o loop photos.fat.img /mnt/photos-fat
... (work with files in /mnt/photos-fat)
sudo umount /mnt/photos-fat
Alternatively, you can run udisksctl loop-setup
to add the disk image as
a removable drive that could be automatically mounted in your desktop:
# Using udisksctl and auto-mounting in GUI
udisksctl loop-setup -f fat.img
# This will probably print /dev/loop0 but it can have a different number
# Now mount it in GUI (might happen completely automatically)
... (work with files in /run/media/$(whoami)/07C5-2DF8/)
udisksctl loop-delete -b /dev/loop0
Disk space usage utilities
The basic utility for checking available disk space is df
(disk free).
Filesystem 1K-blocks Used Available Use% Mounted on
devtmpfs 8174828 0 8174828 0% /dev
tmpfs 8193016 0 8193016 0% /dev/shm
tmpfs 3277208 1060 3276148 1% /run
/dev/sda3 494006272 7202800 484986880 2% /
tmpfs 8193020 4 8193016 1% /tmp
/dev/sda1 1038336 243188 795148 24% /boot
In the default execution (above), it uses one-kilobyte blocks.
For a more readable output, run it with -BM or -BG (megabytes or gigabytes)
or with -h to let it select the most suitable unit.
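For example (the output will obviously differ on your machine):

df -h /
df -BM /home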
Do not confuse df with du, which can be used to estimate file space usage.
Typically, you would run du
as du -sh DIR
to print total space occupied
by all files in DIR
.
You could use du -sh ~/*
to print summaries for top-level directories in your
$HOME
.
But be careful as it can take quite some time to scan everything.
Also, you can observe that the space usage reported by du
is not equal
to the sum of all file sizes. This happens because files are organized in blocks,
so file sizes are typically rounded to a multiple of the block size. Besides that,
directories also consume some space.
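You can observe the rounding yourself with a tiny file (the reported block size, often 4K, depends on the file system):

printf 'x' >tiny.txt
ls -l tiny.txt   # reports a size of 1 byte
du -h tiny.txt   # reports a whole block, e.g. 4.0K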
To see how volumes (partitions) are nested and which block devices are recognized
by your kernel, you can use lsblk
.
On the shared machine, the following will appear:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 480G 0 disk
├─sda1 8:1 0 1G 0 part /boot
├─sda2 8:2 0 7.9G 0 part [SWAP]
└─sda3 8:3 0 471.1G 0 part /
This shows that the machine has a 480G disk divided into three partitions:
a tiny /boot for bootstrapping the system, an 8G swap partition, and finally
roughly 471G left for system and user data. We are not using a separate volume for /home.
You can find many other output formats in the man page.
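For example, the following invocations may be handy (both options are described in the man page):

lsblk -f                                # also show file system type, label and UUID
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT    # choose the columns explicitly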
Inspecting and modifying volumes (partitions)
We will leave this topic to a more advanced course. If you wish to learn by yourself, you can start with the following utilities:
fdisk(8)
btrfs(8)
mdadm(8)
lvm(8)
File archiving and compression
A somewhat related topic to the above is how Linux handles file archival and compression.
Archiving on Linux systems typically refers to merging multiple files into one (for easier transfer) and compressing this file (to save space). Sometimes, only the first step (i.e., merging) is considered archiving.
While these two actions are usually performed together, Linux keeps the distinction as it allows combination of the right tools and formats for each part of the job. Note that on other systems where the ZIP file is the preferred format, these actions are blended into one.
The most widely used program for archiving is tar
.
Originally, its primary purpose was archiving on tapes, hence the name: tape archiver.
It is always run with an option specifying the mode of operation:
- -c to create a new archive from existing files,
- -x to extract files from the archive,
- -t to print the table of files inside the archive.
The name of the archive is given via the -f
option; if no name is specified,
the archive is read from standard input or written to standard output.
As usual, the -v
option increases verbosity. For example, tar -cv
prints names
of files added to the archive, tar -cvv
prints also file attributes (like ls -l
).
(Everything is printed to stderr, so that stdout can be still used for the archive.)
Plain tar -t
prints only file names, tar -tv
prints also file attributes.
An uncompressed archive can be created this way:
tar -cf archive.tar dir_to_archive/
A compressed archive can be created by piping the output of tar
to gzip
:
tar -c dir_to_archive/ | gzip >archive.tar.gz
As this is very frequent, tar
supports a -z
switch, which automatically calls
gzip
, so that you can write:
tar -czf archive.tar.gz dir_to_archive/
tar
has further switches for other (de)compression programs: bzip2
, xz
, etc.
Most importantly, the -a
switch chooses the (de)compression program according
to the name of the archive file.
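For example, the following should create an xz-compressed archive just because of the file name suffix:

tar -caf archive.tar.xz dir_to_archive/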
If you want to compress a single file, plain gzip
without tar
is often used.
Some tools or APIs can even process gzip-compressed files transparently.
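A typical session with a single file could look like this:

gzip notes.txt        # creates notes.txt.gz and removes the original
zcat notes.txt.gz     # prints the decompressed content to stdout
gzip -d notes.txt.gz  # restores notes.txt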
To unpack an archive, you can again pipe gzip -d
(decompress) to tar
,
or use -z
as follows:
tar -xzf archive.tar.gz
tar
will
overwrite existing files without any warning.
We recommend to install atool
as a generic wrapper around tar
, gzip
,
unzip
and plenty of other utilities to simplify working with archives.
For example:
apack archive.tar.gz dir_to_archive/
aunpack archive.tar.gz
Note that atool
will not overwrite existing files by default
(which is another very good reason for using it).
It is a good practice to always archive a single directory. That way, the user who unpacks your archive will not have your files scattered in the current directory but neatly prepared in a single new directory.
To view the list of files inside an archive, you can execute als
.
Sandboxed software development
During the previous lab, we showed that the preferred way of installing applications (and libraries and data files) on Linux is via the package manager. It installs the application for all users, it allows system-wide upgrades, and it generally keeps your system in a much cleaner state.
However, system-wide installation may not always be suitable. One typical example is project-specific dependencies. These are often not installed system-wide, mainly for the following reasons:
- You need different versions of dependencies for different projects.
- You do not want to remember to uninstall them when you stop working on the project.
- You want to control when you upgrade them: an upgrade of the OS should not affect your project.
- The versions you need are different from those available through the package manager.
- Or they may not be packaged at all.
For the above reasons, it is much better to create a project-specific
installation that is better isolated from the system.
Note that installing the dependency per-user (i.e., somewhere into $HOME
)
may not provide the isolation you wish to achieve.
Such an approach is supported by most reasonable programming languages and can usually be found under names such as virtual environment, local repository, sandbox or similar (note that the concepts do not map 1:1 across languages and tools, but the general idea remains the same).
With a virtual environment, your dependencies are usually installed into a specific directory inside your project, kept outside version control. The compiler/interpreter is then told to use this location.
The directory-local installation then keeps your system clean. It also allows working on multiple projects with incompatible dependencies, because they are completely isolated.
Each developer can then recreate the environment without polluting the main repository with distribution-specific or even OS-dependent files. Yet the configuration file ensures that all developers will be working in the same environment (i.e., same versions of all the dependencies).
It also means that new members of software teams can easily set up their environment using the provided configuration file.
Dependency installation
Inside the virtual environment, the project usually does not rely on generic package managers (such as DNF). Instead, dependencies are installed using language-specific package managers.
These are usually cross-platform and use their own software repository. Such a repository then hosts only libraries for that particular language. Again, there can be multiple such repositories and it is up to the developers how they configure their projects.
In our scenario, the language-specific managers install only into the virtual environment directory without ever touching the system itself.
Python Package Index (PyPI)
The rest of the text will focus mostly on Python tools supporting the above-mentioned principles. Similar tools are available for other languages, but we believe that demonstrating them on Python is sufficient to understand the principles in practice.
Python has a repository called the Python Package Index (PyPI) where anyone can publish their Python programs and/or libraries.
The repository can be used through a web browser, but also through a command-line
client called pip
.
pip behaves rather similarly to DNF.
You can use it to install, upgrade, or uninstall Python modules.
Typical workflow practically
While the actual tools differ across programming languages, the general steps for developing a project in some kind of sandbox are the same.
- The developer clones the project (e.g., from a Git repository).
- The sandbox (virtual environment) is initialized. Usually this means that a new directory with a fresh language environment is created.
- The virtual environment must be activated. Often the virtual environment needs to modify $PATH (or rather some language-specific variant of such a path that is used to search for libraries or modules), so the developer must source (or .) an activation script that modifies the path.
- Then the developer can install the dependencies of the project. They are usually stored in a file that can be passed to the package manager (of the given programming language).
- Only now the developer can actually work on the project. The project is fully isolated, removing the virtual environment directory removes all traces of the installed packages.
The everyday job then often involves only step 3 (some kind of activation) and step 5 (the actual development).
Note that activation of the virtual environment typically removes access to libraries installed globally. That is, inside the virtual environment, the developer starts with a fresh and clean environment with a bare compiler. That is actually a very sane decision as it ensures that system-wide installation does not affect the project-specific environment.
In other words, it improves the reproducibility of the whole setup. It also means that the developer needs to specify every dependency in the configuration file, even dependencies that are usually present everywhere.
Virtual environment for Python (a.k.a. virtualenv
or venv
)
To try installing Python packages safely, we will first set up a virtual environment for our project. Fortunately, Python has built-in support for creating virtual environments.
We will demonstrate this on the following example:
#!/usr/bin/env python3

import argparse
import shutil
import sys

import fs


class FsCatException(Exception):
    pass


def fs_cat(filesystem, filename, target):
    try:
        with fs.open_fs(filesystem) as my_fs:
            try:
                with my_fs.open(filename, 'rb') as my_file:
                    shutil.copyfileobj(my_file, target)
            except fs.errors.FileExpected as e:
                raise FsCatException(f"{filename} on {filesystem} is not a regular file") from e
            except fs.errors.ResourceNotFound as e:
                raise FsCatException(f"{filename} does not exist on {filesystem}") from e
    except Exception as e:
        if isinstance(e, FsCatException):
            raise e
        raise FsCatException(f"unable to read {filesystem}, perhaps misspelled path or protocol ({e})?") from e


def main():
    args = argparse.ArgumentParser(description='Filesystem cat')
    args.add_argument(
        'filesystem',
        nargs=1,
        metavar='FILESYSTEM',
        help='Filesystem specification, e.g. tar://path/to/file.tar'
    )
    args.add_argument(
        'filename',
        nargs=1,
        metavar='FILENAME',
        help='File path on FILESYSTEM, e.g. /README.md'
    )
    config = args.parse_args()

    try:
        fs_cat(config.filesystem[0], config.filename[0], sys.stdout.buffer)
    except FsCatException as e:
        print(f"Fatal: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
Save this snippet into fscat.py
and set the executable bit.
Note that fs.open_fs is able to open various filesystems and access
files on them as if you used the built-in Python open.
In our program, we provide a path to a filesystem and to a file (residing on this filesystem)
to print to the screen (hence the name fscat: it simulates cat inside a different
filesystem).
Try running the fscat.py
program.
Unless you have already installed the python3-fs
package system-wide,
it should fail with ModuleNotFoundError: No module named 'fs'
.
The chances are that you do not have that module installed.
If you have python3-fs installed, uninstall it now and try again
(just for this demo).
But double-check that you will not remove some other program that may require it.
We could now install the python3-fs
with DNF but we already described
why that is a bad idea.
We could also install it with pip
globally but that is not the best course
of action either.
Instead, we will create a new virtual environment for it.
python3 -m venv my-venv
The above command creates a new directory my-venv
that contains a bare installation
of Python.
Feel free to investigate the contents of this directory.
We now need to activate the environment.
source my-venv/bin/activate
Your prompt should have changed: it is prefixed by (my-venv)
now.
Running fscat.py
will still terminate with ModuleNotFoundError
.
We will now install the dependency:
pip install fs
This will take some time as Python will also download transitive dependencies of this
library (and their dependencies etc.).
Once the installation finishes, run fscat.py
again.
This time, it should work.
./fscat.py
Okay, it printed an error message about required arguments. Download this tarball and run the script as follows:
./fscat.py tar://test.tar.gz testdir/test.txt
It should print Test string
as it is able to even handle tarballs as
filesystems and print files on them (verify that the file is really
there using atool, MC, or tar directly).
Once we are finished with the development, we can deactivate the
environment by calling deactivate
(this time, without sourcing anything).
Running fscat.py
outside the environment shall again terminate
with ModuleNotFoundError
.
Installing Python-specific packages with pip
We have already seen one usage of pip
in practice, but pip
can do much more.
A nice walkthrough of all pip capabilities can be found in
Using Python’s pip to Manage Your Projects’ Dependencies.
Here we provide a brief summary of the most important concepts and commands.
By default, pip install searches the PyPI package registry
in order to install the package specified on the command line. We would not be far from the truth
by saying that all packages inside this registry are just archived directories which
contain Python source code organized in a prescribed way.
If you would like to change this default package registry, you can use the --index-url
argument.
In a later section, we will learn how to turn a directory with code into a proper Python package.
Assuming that we have already done it, we can install that package directly (without archiving/packing)
by running pip install /path/to/python_package
.
For example, imagine a situation where you are interested in a third-party open-source package.
This package is available in a remote git repository (typically on GitHub or GitLab),
but it is NOT packed and published in PyPI. You can simply clone the repository
and run pip install .
. However, thanks to
pip VCS Support, you
can avoid the cloning phase and install the package directly with:
pip install git+https://git.example.com/MyProject
In order to upgrade a specific package, you run pip install --upgrade [packages]
.
Finally, to remove packages, you run pip uninstall [packages].
Dependency versioning
You might have heard about semantic versioning. Python uses a more or less compatible versioning, which is described in PEP 440 – Version Identification and Dependency Specification.
When you install dependencies from the package registry, you can specify this version.
pkgname # latest version
pkgname == 4.2 # specific version
pkgname >= 4.2 # minimal version
pkgname ~= 4.2 # equivalent to >= 4.2, == 4.*
In fact, a version specifier consists of a series of version clauses separated by commas. Therefore, you can type:
pkgname >= 1.0, != 1.3.4.*, < 2.0
Sometimes it is helpful to save a list of all currently installed packages (including transitive dependencies). For example, you have recently noticed a new bug in your project and you would like to keep a record of the precise versions of the currently installed dependencies, so that your co-worker can reproduce the bug.
In order to do that, it is possible to use pip freeze
and create a list
that sets specific versions, ensuring the same environment for every developer.
It is recommended to store this list in a requirements.txt file.
# Generating requirements file
pip freeze > requirements.txt
# Installing package from it
pip install -r requirements.txt
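The generated file is a plain list of pinned versions, one package per line. The content below is only illustrative (the actual packages and versions depend on what you have installed):

# illustrative content of requirements.txt
appdirs==1.4.4
fs==2.4.16
six==1.16.0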
Packaging Python Projects
Let’s say that you come up with a super cool algorithm and you want to enrich the world by sharing it. The official Python documentation offers a step-by-step tutorial on how to achieve that.
Python Package Directory Structure
The very first step, before you can publish it, is to
transform it into a proper Python package. We need to create files called pyproject.toml
and setup.cfg
. These files contain information about the project,
a list of dependencies, and also information for project installation.
In fscat, you can find a Python package with the same functionality as our previous
fscat.py script.
Pay particular attention to its setup.cfg.
Try to install this package using pip's VCS support with the following command:
pip install git+http://gitlab.mff.cuni.cz/teaching/nswi177/2023/common/fscat.git
You perhaps noticed that the setup.cfg
file contained the section
[options.entry_points]
.
This section specifies what the actual scripts of your project are.
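Such a section may look roughly like the following sketch (the exact module path here is illustrative; check the actual file in the repository):

[options.entry_points]
# the module path below is illustrative; see the real setup.cfg in the repository
console_scripts =
    fscat = matfyz.nswi177.fscat:main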
Note that after running the above command, you can execute the fscat
command directly.
Pip created a wrapper script for you and added it to the sandbox $PATH
.
fscat tar://tests/test.tar.gz testdir/test.txt
Now uninstall the package with:
pip uninstall matfyz-nswi177-fscat
Clone the repository to your local machine and change directory to it. Now run:
pip install -e .
pip install -e
produces an editable installation
for easy debugging. Instead of copying your code to the virtual environment,
it installs only a symlink-like thing (actually, an fscat.egg-link
file, which has a similar effect on Python’s mechanism for finding modules)
referring to the directory with your source files.
Building a Python package
Now that we have the proper directory structure, we are only two steps away from publishing the package to a package registry.
First, we prepare distribution packages for our code: install the build
package by invoking pip install build. Then we can run
python3 -m build
Two files are created in the dist
subdirectory:
- matfyz-nswi177-fscat-0.0.1.tar.gz – a source code archive
- matfyz_nswi177_fscat-0.0.1-py3-none-any.whl – a wheel file, which is the built package (py3 is the Python version required, none and any tell that this is a platform-independent package).
You can now switch to a different virtualenv and install the package
using pip install
package.whl.
Publishing a Python package
If you think that the package could be useful to other people, you can publish it in the Python Package Index. This is usually accomplished using the twine tool. The precise steps are described in Uploading the distribution archives.
Higher-level tools
We can think of pip
and virtualenv
as low-level tools. However, there
are also tools that combine both of them and bring more comfort to package
management. In Python, there are at least two popular choices, namely
Poetry and
Pipenv.
Internally, these tools use pip
and venv
, so you are still able to
have independent working spaces as well as the possibility to install a
specific package from the Python Package Index (PyPI).
The complete introduction of these tools is out of the scope for this course. Generally, they follow the same principles, but they add some extra functions that are nice to have. Briefly, the major differences are:
- They can freeze specific versions of dependencies, so that the project builds the same on all machines (using a poetry.lock file).
- Packages can be removed together with their dependencies.
- It is easier to initialize a new project.
Other languages
Other languages have their own tools with similar functions.
Before-class tasks (deadline: start of your lab, week April 24 - April 28)
The following tasks must be solved and submitted before attending your lab. If you have lab on Wednesday at 10:40, the files must be pushed to your repository (project) at GitLab on Wednesday at 10:39 latest.
For the virtual lab, the deadline is Tuesday 9:00 AM every week (regardless of vacation days).
All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).
11/romandate.py
(100 points, group devel
)
Write a Python program that uses the packages roman
and dateparser
to print the date specified by the user in Roman numerals.
The program is best described by examples provided below (assuming they were executed on April 24, 2023).
./romandate.py
XXIV.IV.MMXXIII
./romandate.py 2021-01-01
I.I.MMXXI
./romandate.py 40 years ago
XXIV.IV.MCMLXXXIII
The tests assume that they are already executed inside a virtual environment where the above-mentioned packages are installed (when executed on GitLab, the tests install these two packages for you automatically).
The provided tests only check for exact dates (when evaluating the task after the submission deadline, we will insert the current date into the tests).
Do not forget to check that your solution also works
- when executed without parameters (time now),
- when executed with relative dates such as 5 days ago.
Post-class tasks (deadline: May 14)
We expect that you will solve the following tasks after attending the labs and hearing the feedback on your before-class solutions.
All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).
11/project-name/
(70 points, group devel
)
Prepare a Python package that provides a project-name command that
tries to auto-detect the project name.
Similarly to one of your very first tasks in this course, the program
will look into README.md
and README
files for the first non-empty line
(again stripping extra whitespace and leading #
in *.md
files).
When neither README.md nor README is present, the program will try to
find the top directory of a Git project
(consider using the search_parent_directories=True
constructor parameter
and the working_tree_dir
property of Repo
from GitPython)
and print its basename.
If the current directory is not a part of a Git project, the program will print the basename of the current directory.
We expect that the following would work (probably best executed in a virtual environment).
project-name
# Prints 'NSWI177 Submission Repository'
cd 01
project-name
# Prints 'student-LOGIN'
cd ../../
project-name
# Prints directory name of the parent directory of your submission repository clone
We expect that you will set up a proper src
subdirectory and organize your
package properly using setup.cfg
etc.
Feel free to reuse parts of your solution of the 01 task.
The automated tests always create a new virtual environment for each test case.
That is good for a final check.
But it is also possible to execute the tests inside an activated virtual
environment where they expect that the project-name command is already
installed, by setting NSWI177_LAB11_NO_INSTALL=true
(i.e., they skip the pip install 11/project-name part, which makes them
much faster):
env NSWI177_LAB11_NO_INSTALL=true ./bin/run_tests.sh 11-post/project_name
11/fat.txt
(30 points, group admin
)
The file linux.ms.mff.cuni.cz:~/lab11.fat.img
is a disk image with a single file.
Paste its (decompressed) content into 11/fat.txt
(to your GitLab submission repository).
Note that we can create the source file ~/lab11.fat.img only after you
log in to the remote machine for the first time.
If the file is not there, wait for the next work day for the file to appear.
Do not leave this task for the last minute and contact us if the file has not appeared as explained in the previous paragraph.
Learning outcomes
Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).
Conceptual knowledge
Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …
- explain what a disk image is
- explain why no special tools are required for working with disk images
- explain the difference between normal files, directories, symbolic links, device files and system-state files (e.g. from the /proc filesystem)
- list the fundamental top-level directories on a typical Linux installation and describe their function
- explain in general terms how the directory tree is formed by mounting individual (file) subsystems
- explain what requirements (library dependencies) are
- explain the fundamentals of semantic versioning
- explain the pros and cons of installing dependencies system-wide vs installing them in a sandboxed environment
- provide a high-level overview of a sandbox environment
- explain the pros and cons of specifying transitive requirements vs specifying only top-level ones
- explain the pros and cons of using exact versions vs minimal requirements
- explain why Linux maintains the separation of archiving and compression programs (e.g. tar and gzip)
Practical skills
Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …
- mount disks using the mount command (both physical disks as well as images)
- get summary information about disk usage with the df command
- use either tar or atool to work with standard Linux archives
- create a new virtual environment for Python using python3 -m venv
- activate and deactivate a virtual environment
- install project dependencies in a virtual environment with pip
- develop a program inside a virtual environment (with projects using setup.cfg and pyproject.toml files)
- install a Python project from its setup.cfg
- optional: use lsblk to view available block (storage) devices
- optional: set up a Python project for installation
This page changelog
- 2023-04-17: Replace occurrences of python with python3 for better clarity.
- 2023-04-20: Automated tests for before-class tasks.
- 2023-04-28: Automated tests for post-class tasks.
- 2023-06-14: Note about mounting disks in VirtualBox.