- Pre-lab reading
- Virtual environment for Python (a.k.a. virtualenv or venv)
- How does it work?
- Installing Python-specific packages with pip
- Packaging Python Projects
- Building Python Package
- Publishing Python Package
- Creating distribution packages (e.g. for DNF)
- Higher-level tools
- Other languages
- Exercise
- Graded tasks (deadline: May 12)
- Learning outcomes
This lab covers the basics of reproducible and isolated development. You will see how we can ensure that working on a project that requires installing dependencies needs neither the installation of any system-wide package nor any other change to the system itself.
Do not forget that the pre-lab reading is mandatory and that there is a quiz on it which you must complete before the lab.
Virtual environment for Python (a.k.a. virtualenv or venv)
To try installing Python packages safely, we will first set up a virtual environment for our project. Fortunately, Python has built-in support for creating virtual environments.
We will demonstrate this on the following example:
#!/usr/bin/env python3

import sys
import dateparser


def main():
    input_date = ' '.join(sys.argv[1:])
    if input_date == '':
        input_date = 'now'
    date = dateparser.parse(input_date)
    if not date:
        print(f"Invalid date specification (`{input_date}').", file=sys.stderr)
        sys.exit(1)
    print(date.strftime('%Y-%m-%dT%H:%M:%S'))


if __name__ == '__main__':
    main()
Save this snippet into timestamp2iso.py and set the executable bit. Note that dateparser.parse() is able to parse various time specifications into the native Python date format. The time specification can even be text such as three days ago.
Make sure you understand the whole program before continuing.
Try running the timestamp2iso.py program.
Unless you have already installed the python3-dateparser package system-wide, it should fail with ModuleNotFoundError: No module named 'dateparser'. Chances are that you do not have that module installed.
If you have installed python3-dateparser, uninstall it now and try again (just for this demo). But double-check that you would not remove some other program that may require it.
We could now install python3-dateparser with DNF, but we have already described why that is a bad idea. We could also install it globally with pip, but that is not the best course of action either.
Instead, we will create a new virtual environment for it.
python -m venv my-venv
The above command creates a new directory my-venv that contains a bare installation of Python. Feel free to investigate the contents of this directory.
We now need to activate the environment.
source my-venv/bin/activate
Your prompt should have changed: it is now prefixed by (my-venv).
Running timestamp2iso.py will still terminate with ModuleNotFoundError.
We will now install the dependency:
pip install dateparser
This will take some time as pip will also download the transitive dependencies of this library (and their dependencies etc.). Once the installation finishes, run timestamp2iso.py again.
This time, it should work.
./timestamp2iso.py three days ago
Once we are finished with the development, we can deactivate the environment by calling deactivate (this time, without sourcing anything).
Running timestamp2iso.py outside the environment will again terminate with ModuleNotFoundError.
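Whether a script is currently running inside an activated virtual environment can also be checked from Python itself. The following is only a minimal sketch, assuming an environment created by the built-in venv module; the helper name in_virtualenv is hypothetical and not part of the lab materials.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points to the environment directory,
    # while sys.base_prefix still points to the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("prefix:     ", sys.prefix)
    print("base prefix:", sys.base_prefix)
    print("inside a virtual environment:", in_virtualenv())
```

Running this once with my-venv activated and once after deactivate should show the two prefixes differing only in the first case.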
Installing Python-specific packages with pip
We have already seen one usage of pip in practice, but pip can do much more. A nice walkthrough of pip's capabilities can be found in Using Python's pip to Manage Your Projects' Dependencies.
Here we provide a brief summary of the most important concepts and commands.
By default, pip install searches the package registry PyPI for the package specified on the command line. It is not far from the truth to say that all packages in this registry are just archived directories that contain Python source code organized in a prescribed way.
If you would like to change the default package registry, you can use the --index-url argument.
In a later section, we will learn how to turn a directory with code into a proper Python package. Assuming that we have already done so, we can install that package directly (without archiving/packing) by running pip install /path/to/python_package.
For example, imagine a situation where you are interested in a third-party open-source package. This package is available in a remote git repository (typically on GitHub or GitLab), but it is NOT packed and published on PyPI. You can simply clone the repository and run pip install . inside it. However, thanks to pip VCS Support you can avoid the cloning phase and install the package directly with:
pip install git+https://git.example.com/MyProject
To upgrade specific packages, run pip install --upgrade [packages].
Finally, to remove packages, run pip uninstall [packages].
Dependency versioning
We have already mentioned Semantic Versioning 2.0.0. Python uses more or less compatible versioning, which is described in PEP 440 – Version Identification and Dependency Specification.
When you install dependencies from a package registry, you can specify which version you want:
pkgname # latest version
pkgname == 4.2 # specific version
pkgname >= 4.2 # minimal version
pkgname ~= 4.2 # equivalent to >= 4.2, == 4.*
In fact, a version specifier consists of a series of version clauses separated by commas. Therefore you can write:
pkgname >= 1.0, != 1.3.4.*, < 2.0
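To make the meaning of the ~= clause more concrete, here is a simplified sketch that checks the "compatible release" rule for purely numeric versions. It is an illustration under stated assumptions rather than a PEP 440 implementation: it ignores pre-releases, post-releases, epochs and local versions, and the function names are hypothetical. For real projects, the packaging library is the usual choice for evaluating specifiers.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain numeric version such as '4.2.1' into (4, 2, 1)."""
    return tuple(int(part) for part in version.split("."))


def pad(release: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Pad with zeros so that 4.2 and 4.2.0 compare as equal."""
    return release + (0,) * (width - len(release))


def is_compatible_release(candidate: str, base: str) -> bool:
    """Approximate 'candidate ~= base' for purely numeric versions."""
    cand = parse_release(candidate)
    spec = parse_release(base)
    if len(spec) < 2:
        raise ValueError("~= needs at least two release segments")
    width = max(len(cand), len(spec))
    cand_padded, spec_padded = pad(cand, width), pad(spec, width)
    # '~= 4.2' pins the prefix '4', '~= 4.2.1' pins the prefix '4.2'.
    prefix = spec[:-1]
    return cand_padded >= spec_padded and cand_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    for candidate in ("4.1", "4.2", "4.9.3", "5.0"):
        print(candidate, "~= 4.2:", is_compatible_release(candidate, "4.2"))
```

The check combines the two clauses from the comment above: the candidate must be at least the given version and must share all but the last segment with it.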
Freezing dependency versions
Sometimes it is helpful to save a list of all currently installed packages (including transitive dependencies). For example, you have recently noticed a new bug in your project and you would like to keep a record of the precise versions of the currently installed dependencies, so that your co-worker can reproduce it.
To do that, you can use pip freeze to create a list that pins specific versions, ensuring the same environment for every developer.
It is recommended to store this list in a requirements.txt file.
# Generating the requirements file
pip freeze > requirements.txt
# Installing packages from it
pip install -r requirements.txt
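The kind of information that pip freeze records can also be inspected from Python. The sketch below uses the standard importlib.metadata module to list the distributions visible to the current interpreter in name==version form. It only approximates pip freeze (which, for example, treats editable and VCS installs specially and hides some packages by default), and the function name is hypothetical.

```python
from importlib import metadata


def frozen_requirements():
    """List installed distributions as 'name==version' strings."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in frozen_requirements():
        print(line)
```

Running it inside and outside an activated virtual environment is a quick way to see that each environment has its own set of installed packages.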
Packaging Python Projects
Let’s say that you come up with a super cool algorithm and you want to enrich the world by sharing it. The official Python documentation offers a step-by-step tutorial on how to achieve that.
Python Package Directory Structure
The very first step, before you can publish your code, is to transform it into a proper Python package. We need two files called pyproject.toml and setup.cfg. These files contain information about the project, a list of dependencies, and also information for project installation.
In timestamp2iso you can find a Python package with the same functionality as our previous timestamp2iso.py script.
Please study carefully the directory structure as well as the contents of setup.cfg.
Try to install this package using the VCS support with the following command:
pip install git+http://gitlab.mff.cuni.cz/teaching/nswi177/2022/common/timestamp2iso.git
You perhaps noticed that setup.cfg contains the section [options.entry_points]. This section specifies the actual scripts of your project. Note that after running the above command, you can execute the timestamp2iso command directly. Pip created a wrapper script for you and added it to the sandbox $PATH.
timestamp2iso three days ago
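Console script entry points are generally written as name = module:function, and the generated wrapper essentially imports the module and calls the function. The following sketch mimics that lookup. It is only an illustration: the spec string demo-join = os.path:join is a made-up example (not the contents of the timestamp2iso setup.cfg), the helper name is hypothetical, and the real wrapper is generated by the installer.

```python
import importlib
from typing import Callable, Tuple


def resolve_entry_point(spec: str) -> Tuple[str, Callable]:
    """Split 'command = module:function' and import the target function."""
    command, _, target = spec.partition("=")
    module_name, _, function_name = target.strip().partition(":")
    module = importlib.import_module(module_name)
    return command.strip(), getattr(module, function_name)


if __name__ == "__main__":
    # Hypothetical entry point that maps a command name to os.path.join.
    command, function = resolve_entry_point("demo-join = os.path:join")
    print(command, "->", function("my-venv", "bin"))
```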
Now uninstall the package with:
pip uninstall matfyz-nswi177-timestamp2iso
Clone the repository to your local machine and change directory to it. Now run:
pip install -e .
pip install -e produces an editable installation for easy debugging. Instead of copying your code to the virtual environment, it installs only a symlink-like thing (actually, a timestamp2iso.egg-link file which has a similar effect on Python’s mechanism for finding modules) referring to the directory with your source files.
Add some nice prefix just before the ISO print statement and run timestamp2iso three days ago again.
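To see where Python actually loads a module from (for example, to confirm that an editable installation points at your working copy rather than at a copy inside the virtual environment), you can ask the import machinery. This is a small sketch using the standard importlib.util.find_spec function; the helper name module_origin is hypothetical and the standard json module is used only as a stand-in.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, if any."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Replace 'json' with the name of your own package after installing it.
    print(module_origin("json"))
```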
Building Python Package
Now that we have the proper directory structure, we are only two steps from publishing it to a package registry.
Next, we prepare distribution packages for our code. First, we install the build package by invoking pip install build. Then we can run
python -m build
Two files are created in the dist subdirectory:
- matfyz-nswi177-timestamp2iso-0.0.1.tar.gz: a source code archive
- matfyz_nswi177_timestamp2iso-0.0.1-py3-none-any.whl: a wheel file, which is the built package (py3 is the Python version required; none and any indicate that this is a platform-independent package).
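The parts of a wheel file name can be pulled apart mechanically. The sketch below splits a name of the form distribution-version-python-abi-platform.whl (with an optional build tag after the version) into its components; the function name is hypothetical and the parsing is deliberately simplified.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of components")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        # The optional build tag sits between the version and the Python tag.
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    name = "matfyz_nswi177_timestamp2iso-0.0.1-py3-none-any.whl"
    for key, value in parse_wheel_filename(name).items():
        print(f"{key}: {value}")
```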
You can now switch to a different virtualenv and install the package using pip install package.whl.
Publishing Python Package
If you think that the package could be useful to other people, you can publish it in the Python Package Index. This is usually accomplished using the twine tool. The precise steps are described in Uploading the distribution archives.
Higher-level tools
We can think of pip and virtualenv as low-level tools. However, there are also tools that combine both of them and bring more comfort to package management. In Python there are at least two favorite choices, namely Poetry and Pipenv.
Internally, these tools use pip and venv, so you are still able to have independent working spaces as well as the possibility to install a specific package from the Python Package Index (PyPI).
A complete introduction to these tools is beyond the scope of this course. Generally, they follow the same principles, but they add some extra functions that are nice to have. Briefly, the major differences are:
- They can freeze specific versions of dependencies, so that the project builds the same on all machines (using the poetry.lock file).
- Packages can be removed together with their dependencies.
- It is easier to initialize a new project.
Other languages
Other languages have their own tools with similar functions:
Exercise
Set up the program from the examples repository (11/last_commit) as a proper Python project.
Graded tasks (deadline: May 12)
11/tapsum2json (100 points)
Write a program that produces a summary of TAP tests in JSON format.
TAP, the Test Anything Protocol, is a universal format for test results. It is used by BATS and also by GitLab pipelines.
1..4
ok 1 One
ok 2 Two
ok 3 Three
not ok 4 Four
#
# -- Report --
# filename:77:26: note: Something is wrong here.
# --
#
Your program will receive a list of arguments (file names) and read the files using a so-called TAP consumer.
Each of the files will be a separate TAP result (i.e., what BATS prints when run with the -t parameter).
Non-existent files will be skipped and counted as files without tests.
The program will then print a summary of the tests in the following format.
{
    "summary": [
        {
            "filename": "filename1.tap",
            "total": 12,
            "passed": 8,
            "skipped": 3,
            "failed": 1
        },
        {
            ...
        }
    ]
}
Your solution must use a library for reading TAP files: tap.py is certainly a reasonable choice, but you can try to find a better one.
Your solution must contain pyproject.toml, setup.cfg and requirements.txt with a list of the libraries that your program depends on and that can be passed to pip install.
Your solution must be installable using setup.cfg and must create an executable tapsum2json file in $PATH.
This is a mandatory part of the tests, because this is how we will test your solution (have a look at the tests).
Save your solution to the 11/tapsum2json subdirectory.
If you want to run the automated tests on your own machine, you will need the json_reformat program from the yajl DNF package (sudo dnf install yajl).
The tests reformat the JSON output so that it can be compared easily.
We do not require you to format the output in your program, but passing indent=True to the json.dump function will certainly make debugging easier.
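As an illustration of the output side only, the sketch below serializes an already computed summary with the standard json module. It does not read any TAP files; the counts are simply copied from the format example above, the function name is hypothetical, and indent=4 is an arbitrary choice made for readability.

```python
import json


def format_summary(entries):
    """Serialize a list of per-file summaries into the expected JSON shape."""
    return json.dumps({"summary": entries}, indent=4)


if __name__ == "__main__":
    # Hypothetical counts, taken from the format example in the assignment.
    example = [
        {
            "filename": "filename1.tap",
            "total": 12,
            "passed": 8,
            "skipped": 3,
            "failed": 1,
        }
    ]
    print(format_summary(example))
```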
Learning outcomes
Conceptual knowledge
Conceptual knowledge means that you understand the meaning and context of a given topic and are able to place it in a larger framework. So, you are able to …
- explain what dependencies are (in the sense of required libraries)
- explain why installing dependencies system-wide may not work well for multiple projects
- explain how virtual environments (sandboxing) work (main principles)
- explain the advantages and disadvantages of listing transitive dependencies compared with listing only direct ones; explain the advantages and disadvantages of specifying an exact version versus only a minimal one
Practical skills
Practical skills usually concern using given programs to solve various tasks. So, you can …
- create a new virtual environment (for Python)
- activate and deactivate an existing virtual environment
- run/test a Python project using virtualenv (with setup.cfg and pyproject.toml)
- install a project that uses setup.cfg and pyproject.toml
- install new dependencies for a project
- update the list of dependencies
- set up a project for installation (optional)