Lab #10 (Apr 27 – May 1)
Make sure you have completed all the exercises from previous labs. This lab picks up where we stopped last week.
Read this Wikipedia article about Continuous integration.
- Git and remote branches.
- Keeping forks up-to-date.
- GitLab: CI.
Last time we started with a fork of the teaching/nswi177/2020-summer/upstream/examples repository and worked in it. In this lab, we will simulate the situation where work in the upstream repository (i.e. the one you forked from) continues and you want to keep your repository up-to-date.
That is a common task, by the way. You work on a new feature but you do not want to miss important updates that are happening in master. As a matter of fact, failing to keep your branch up-to-date with master can complicate merging later on. Depending on the size and activity of the project, it might make sense to merge upstream changes every week or even every day!
In some cases, you may even pull changes from different forks. If you see that someone else is working on a new feature, you may want to try it out and test how it works with your changes.
With Git, all this is possible and (maybe surprisingly) there is very little difference between merging your own (local) branch and merging changes of someone else working in a completely different fork.
To merge changes from a different repository than the default one (e.g. a different project on GitLab), we need to set up so-called remotes.
A remote is Git's name for the fact that your local clone also knows about other forks and can tell you whether there are differences. This is an overly simplified way of looking at things, but it is sufficient for the how-do-you-do of Git remotes.
To see your remotes, run (inside your local clone of your fork of the examples repository)
git remote show
It would probably print only origin. That is the default remote: when you git pull or git push, it uses origin. Thus, you were using remotes even without knowing about it ;-).
git remote -v show
It will print the specific URLs where the remote is located. As a matter of fact, you will probably see two entries for the remote now: one for push, one for fetch (pull). You can even configure Git to pull from a different repository than the one you are pushing to. Not very useful for us at the moment, though.
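To see the fetch/push distinction in isolation, here is a minimal sketch in a throwaway directory. Both URLs are made up for illustration and are never contacted, so treat this as a demonstration of the commands rather than something to run in your real clone.

```shell
# Throwaway sandbox so that nothing touches your real clone.
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"

# A made-up URL; nothing is contacted until you fetch or push.
git remote add origin https://example.org/fetch-here.git
# Use a different (also made-up) URL for pushing only.
git remote set-url --push origin https://example.org/push-here.git

# Lists one (fetch) and one (push) entry for origin.
git remote -v show

fetch_url=$(git remote get-url origin)
push_url=$(git remote get-url --push origin)
echo "fetch: $fetch_url"
echo "push:  $push_url"
```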
We will now add a new remote to our repository. This will link it with a different project and we will be able to compare changes between them (again, a simplified view).
git remote add upstream firstname.lastname@example.org:teaching/nswi177/2020-summer/upstream/examples.git
We just added a remote named upstream that points to the given address (i.e. the original project). Note that Git is silent in this case.
Run git remote show again. How did its output change?
Adding the remote did not exchange any data yet. You have to tell Git to do everything; nothing happens automagically.
Let’s now fetch the changes from our new remote.
git fetch upstream
You should see the typical summary that Git prints when cloning or pulling changes; this time it refers to data from the upstream repository.
However, in your working tree (directory), nothing changed. That is fine, we only asked to fetch the changes, not apply them.
Run git branch and git branch --all to see which branches you now have access to.
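If you want to replay the add-remote-then-fetch steps without touching your real fork, the following sketch builds three throwaway repositories on your disk. The directory names (original, fork.git, work) and the branch name lab/demo-tests are invented for this illustration; they only stand in for the upstream project, your fork on GitLab, and your local clone.

```shell
# Everything happens in a throwaway directory.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

# "original" plays the upstream project, "fork.git" your fork on the
# server, and "work" your local clone of that fork.
git init -q original
git -C original commit -q --allow-empty -m "Initial commit"
git clone -q --bare original fork.git
git clone -q fork.git work

# Upstream moves on after you forked.
git -C original checkout -q -b lab/demo-tests
git -C original commit -q --allow-empty -m "Add tests"

cd work
git remote add upstream "$sandbox/original"   # silent, no data exchanged
git fetch -q upstream                         # now the data arrive

git branch          # local branches only
git branch --all    # also remotes/origin/... and remotes/upstream/...
upstream_refs=$(git for-each-ref --format='%(refname:short)' refs/remotes/upstream)
```

Notice that the working tree of work is untouched after the fetch; only the list of known branches grew.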
We will now investigate how the newly added remote differs.
Let’s start with showing commits on the remote:
git log remotes/upstream/lab/10/csv-calc-tests
git log can show commits on a certain branch only (yes, the remotes/... is actually a branch name: after all, you have seen it in git branch -a). And it also works on files (e.g. git log README.md). It is quite a powerful command.
Fine. What about how the code differs? That is actually even more important: you want to see which changes to the code were made and whether it would be possible to merge them at all.
git diff remotes/upstream/lab/10/csv-calc-tests
You ought to see a patch that displays that the newly added remote differs in one file only: automated tests were added.
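A shorter summary is sometimes easier to read than the full patch. The sketch below uses a local branch named their-work (an invented name) as a stand-in for the remote branch, which is fine because Git treats both the same way. It lists the commits you are missing and the names of the files that differ.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

echo "Hello" > README.md
git add README.md
git commit -q -m "Add README"

# Somebody else adds tests on another branch.
git checkout -q -b their-work
echo 'echo "all tests passed"' > tests.sh
git add tests.sh
git commit -q -m "Add tests"
git checkout -q -      # back to the branch we started on

# Commits that the other branch has and we do not.
git log --oneline HEAD..their-work

# Only the names of files that differ between us and them.
changed=$(git diff --name-only HEAD their-work)
echo "$changed"
```

Writing git diff HEAD their-work (ours first, theirs second) shows the changes in the direction a merge would apply them, so the new file appears as an addition.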
The tests look pretty good – we want them in our project too.
Let’s merge the remote branch, then:
git merge remotes/upstream/lab/10/csv-calc-tests
Since there should be no conflicts (i.e. the two branches, yours and remotes/upstream/lab/10/csv-calc-tests, changed different files), the merge should be completed automatically.
Check your project directory: is the tests.sh file there?
Advanced hint: if you do not like the commit message
(generally, when you commit and you immediately realize that
your commit message has a typo), you can change it. Just type
git commit --amend to edit your last commit. If you have not
pushed your changes, it will work flawlessly. Otherwise, it is
a bit more complicated and it should not be tried by beginners.
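A minimal sketch of amending, again in a throwaway repository. The -m option is used here only so that no editor opens; without it, Git lets you edit the message interactively.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

echo "Hello" > README.md
git add README.md
git commit -q -m "Add READNE"            # oops, a typo in the message

# Replace the last commit with one that has a corrected message.
git commit -q --amend -m "Add README"

last_message=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)
echo "$last_message ($commit_count commit in history)"
```

The history still contains a single commit: the amended commit replaces the old one instead of being added on top of it, which is exactly why amending already pushed commits is tricky.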
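Now look at the second branch, remotes/upstream/lab/10/csv-calc-hotfix, and compare it with your work (do not git merge yet) with git diff.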
As you probably noticed, the second branch extended the tests but it also contains a typo fix for the README file.
But you already fixed the typo last week (if not, fix it before merging!).
So? The merge will lead to a so-called conflict that we need to resolve manually.
That is quite common and there is no need to be afraid of it. Git is able to help you a lot – when there are changes to different parts of a file, Git is able to merge the changes without any problems. But when both branches change the same lines, it is up to you to resolve it.
Note that Git errs on the safe side: unless the two fixes are exactly identical, line for line, it reports a conflict even when both branches were meant to fix the same thing.
Run the merge command now:
git merge remotes/upstream/lab/10/csv-calc-hotfix
This merge will end with an error and Git will inform you about the conflict.
What does the merge command tell you exactly?
Run git status and investigate its output.
We need to resolve the conflict. Edit the file and run
git add README.md
to mark the conflicts in the file as resolved.
Let’s finish the merge now by running git commit as with any other commit.
Do not forget to push the changes to your repository.
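The whole conflict cycle can be rehearsed safely in a throwaway repository. In the sketch below, the branch name hotfix and the file contents are invented; the editing step is replaced by simply overwriting the file, whereas you would normally open it in an editor and remove the conflict markers by hand.

```shell
sandbox=$(mktemp -d)
git init -q "$sandbox/demo"
cd "$sandbox/demo"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.org

printf 'Helo world\n' > README.md
git add README.md
git commit -q -m "Add README with a typo"

# Their fix of the typo ...
git checkout -q -b hotfix
printf 'Hello world\n' > README.md
git commit -q -a -m "Fix typo (their wording)"

# ... and our slightly different fix of the same line.
git checkout -q -
printf 'Hello, world!\n' > README.md
git commit -q -a -m "Fix typo (our wording)"

# The merge stops and reports the conflict; "|| true" keeps the script going.
git merge hotfix || true
git status --short        # README.md is flagged as modified by both sides

# Resolve: decide on the final content, then mark the file as resolved.
printf 'Hello, world!\n' > README.md
git add README.md
git commit -q --no-edit   # concludes the merge with the prepared message

git log --oneline --graph
```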
What would the graphical representation of the commits in GitLab look like now?
Try to sketch it on paper before opening the Graphs page in GitLab.
On your own, merge with the lab/10/ci branch of the upstream repository.
What new files appeared in your repository?
The last merge brought a new file, called .gitlab-ci.yml, to your repository (to the root of it).
If you have not yet pushed to your fork, push the last merge there now as well.
Unless something went terribly wrong, after a while you should see a green tick next to your last commit in GitLab UI.
If you see a blue stopwatch-like icon, wait for a while.
If you see a red X mark, something is broken.
If you do not see any icon (even after a while), it is time to verify that you did all the steps (is the .gitlab-ci.yml file in the root of your project also visible on GitLab?) and if you think so, contact us.
What is the green or red icon?
By adding the .gitlab-ci.yml file to our repository, we have enabled continuous integration for our project. GitLab picks up this file and runs the script in it for each commit.
The script typically executes tests, tries to package the software and sometimes can even deploy the application to a production environment!
That is called continuous integration (CI) and continuous deployment (CD): our code is tested and shipped with every commit we make (actually, with every last commit we push to GitLab).
For big pieces of software, such an automated pipeline can run for several hours. For our purposes, we will see results within minutes and we are not aiming for automated deployment yet :-)
We will start with simple things, such as Pylinting our code regularly or running our tests.
But enough of theory, let’s look at what actually happened.
Click on the green icon. You should see two jobs: one called csv_calc-linter and a second one that runs the tests (depending on where you are, you may need to first click on the Status icon and only then see the two stages).
Open both of them.
You should see something that looks like a dump from a terminal. And somewhere near the bottom you should see the execution of shellcheck and the execution of ./tests.sh, together with the output from these tools.
What is happening there?
GitLab created a virtual machine for each of the jobs, installed GNU/Linux into it and then executed commands inside it. After the commands finished, the virtual machine was automatically destroyed.
That means that each of the jobs was running in a completely clean environment.
That is extremely important as it practically checks that you have specified all dependencies (in requirements.txt, for example). And it also ensures that you have actually committed all files, set correct permissions on them etc.
Quite a lot of things, actually. If your script works in CI, you can be pretty sure that your setup is fine and you have not forgotten anything.
Let’s have a look at how we have configured the virtual machines and how we have told GitLab to actually run our commands. Open .gitlab-ci.yml in your editor and try to understand what is there without reading further.
Ok, have you at least tried looking in the file? Open it NOW.
Basically, in fewer than 15 lines we told GitLab to set up a virtual machine for us and run code in it. Not bad, don’t you think?
The file is in YAML format that you already know from the SSG task.
There are several top-level settings and then configuration for each of the jobs.
The top-level configuration specifies that we are using a virtual machine with Fedora (the image setting). So, we are actually not installing the system per se; we rather use a prepared image of an installed system. You can imagine it as if you had installed Fedora on your machine and made a bit-for-bit copy of your hard drive at the moment you finished the installation.
The jobs have two parts: before_script and script. Both of these specify a list of commands that are executed. script contains the actual commands related to your project, while before_script prepares the virtual machine.
In before_script we install dependencies.
You already know DNF, so there should be no surprises. Notice that there is no sudo or similar action to switch to root. Our whole script is running with root privileges. Not typical, but it is a usual approach with most CI/CD solutions and since the script is contained in a virtual machine, it is harmless.
And in the script, we run the actual commands.
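To make the structure described above concrete, the following sketch writes a small example configuration into a temporary file and prints it. It is not a copy of the file from the upstream repository: the image tag, the name of the second job, the script name csv_calc.sh and the installed packages are all assumptions chosen for illustration. Compare it with the real .gitlab-ci.yml in your repository.

```shell
# Write the sketch to a temporary file so that your real .gitlab-ci.yml
# is left alone.
ci_file="$(mktemp -d)/gitlab-ci-sketch.yml"
cat > "$ci_file" <<'EOF'
# Top-level setting: the prepared system image used by every job.
# Assumption: the real file may pin a specific Fedora version.
image: fedora:latest

# First job: static checks of the shell script.
csv_calc-linter:
  before_script:
    # Prepare the machine (we are root, hence no sudo).
    - dnf install -y ShellCheck
  script:
    # The actual check; csv_calc.sh is a hypothetical file name.
    - shellcheck csv_calc.sh

# Second job (hypothetical name): run the automated tests.
csv_calc-tests:
  before_script:
    # Assumption: the tests need diff to compare outputs.
    - dnf install -y diffutils
  script:
    - ./tests.sh
EOF

cat "$ci_file"
```

Each top-level key other than image is a job, and the two jobs do not share anything: every one of them starts from the same clean image.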
To test how CI works when something breaks, insert an intentional error somewhere.
The easiest way to do this is to break the test itself. For example, change the expected output CSV in tests.sh. Commit this change (remember to use a descriptive commit message) and push it to GitLab.
If everything works as expected, you should receive an e-mail informing about the failure. If not, check your notification settings in GitLab.
Open an issue for this (artificial) problem. Fix the issue in a separate branch and close it via a merge request. Notice that CI is executed for all branches and for merge requests too.
That is great as it can prevent you from merging bad code at all. In big teams, the policy can be that pushing to master directly is prohibited and any change must go through a merge request. The merge requests are then set up in such a way as to prevent merges where CI tests failed.
Notice how everything is nicely connected in the web UI. If you have closed the issue via a commit message, you should see links to the respective commits in the issue description and also in the merge request.
On your own, add more jobs to CI.
That includes editing the
.gitlab-ci.yml file, committing the changes
and pushing them to GitLab.
Add the following jobs:
- Run tests in SSG (name the job e.g. ssg-tests). This requires nosetests to be installed: install it with DNF first.
- Run Pylint on the SSG code. Again, this would require you to first install the right tool.
Note that we have emphasized to always develop your Python projects in a virtual environment.
In GitLab CI, it is a bit different: since your code runs with root privileges and the machine will be destroyed anyway, it is simpler to install things directly.
Hence, instead of the trio of creating a virtual environment, activating it and running pip install ..., you typically use only pip install ... and install things system-wide.
Because you start with a clean state of the machine (i.e. only basic packages are installed) and you destroy it without reusing it, it is perfectly okay to do it like that.
And it nicely simulates what would happen if somebody installs your project system-wide.
Add reasonable checks to your task repository too, to keep the quality of your code high. shellcheck */*.sh or similar might be a good start.