Labs: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.

The purpose of this lab is to demonstrate a key feature of Git: branches. They provide you with a powerful tool to manage day-to-day coding of team-developed software. But they can be useful in single-developer scenarios too.

Today’s software is rarely produced by a single person: it is a team-driven activity and the team members need tools to collaborate on the software. This lab will show you what Git offers you in terms of cooperation for teams of virtually any size. We will also see how GitLab integrates with Git and what tools it offers to simplify the management part of the job.

Typically, a team effort requires that multiple people can work on a shared codebase in such way that:

  • the work of each team member is isolated,
  • but it is possible to combine work of several team members effortlessly (at least in cases where their work is truly independent)

Git provides these features via branches. In fact, branches are a much more general concept, which is useful even in projects with a single developer. In this lab, we will have a look at how to use them.

Preflight checklist

  • You know basic Git commands by heart (clone, pull, push, log, status, diff, add, commit).
  • You have uploaded your public key to one of the 05/key.[0-9].pub files.
  • You are well rested and fresh: this is probably the most useful lab of this course: you basically cannot be a developer without knowing all this :-).

Before we actually start …

Some things mentioned here should be already familiar from earlier labs. That is fine and intentional: we would like to present a coherent picture of how Git works now, which will be likely useful in your future jobs.

If you are already familiar with Git branching, you will probably notice that we are simplifying technical details considerably. Formally, branches are really just pointers to the nodes of the acyclic graph of commits etc., but we believe that is not crucial for this text (and it is covered in NSWI154).

Please make sure that you do not skip any step in the examples below as otherwise some of things would stop making sense and you would not see the effects we want you to see.

Running example

Create a fork of the teaching/nswi177/2024/common/group-sum repository.

Fork is a complete copy of the repository of the original project. When forking, you will become the owner of the project and you can create any modifications without changing the original project.

Do not fork your student-LOGIN repository.

We suggest that you change project visibility of your fork to be private.

Throughout the lab we will work on the example in this repository. Do not forget to clone your fork (not the original repository). You will be making quite a lot of changes there.

This example will try to emulate work in a team – whenever we talk about a different feature (or a bug), imagine that you are working in a big team and the features/bugs are not single-line fixes, but multi-day efforts of individual team members.

Look at the implementation inside group_sum.py. We will later see that group_sum.py should be in src and repository should be properly configured for a Python project but we will keep things simple for now.

The implementation is a slightly extended version of our previous script that was able to sum integers based on line content.

Most notably, it uses argparse module for better parsing of the command-line. This allowed us to extend the functionality with use of custom separators and other small bits.

Because the whole utility is rather small, the actual computational core is dwarfed by the configuration of the command-line parser. Do not be afraid of that code, though. Argument parser in Python is quite useful module that makes Python programs much more readable (execution wise) with very little effort.

GitLab issues

Note that the script terminates with an exception if the input file does not exist (instead of printing a nicer error message).

That is certainly not very user friendly. Let us fix it.

However (let us face that too), people seldom have time to fix a problem at the moment when they discover it. It is therefore quite useful to keep track of all unresolved issues in a project, so that they will not be forgotten.

To keep track of the unresolved issues in our projects, we should record them permanently. In the sidebar, you should see a link to Issues for every GitLab project: following the link would tell you that The Issue Tracker is the place to add things that need to be improved or solved in a project. That is exactly what we need ;-).

Create a new issue describing the problem.

Each issue has a title – think about it as an e-mail subject –, it has to summarize what is wrong or what needs to be done. You also need to provide a description – in some cases you might be thinking that title Exception on missing file has it all. It does not! Always provide an example, and steps how to reproduce the issue.

This applies when you are reporting issues to us as well :-).

Back to our example. Bugs seldom go alone. There are more issues with the code. What happens on the following input:

alpha   45
charlie 32
alpha   HELLO

Oops, another exception.

Therefore, create another issue for this in your project. Again: use a descriptive title, provide a meaningful description. Really. Use this as a practice for the graded task.

View the list of your issues. Each issue should have a number next to it that we can reference later on.

Going further

Many projects even provide templates for gathering all the information that is needed. We will not use them here but they can come very useful for bigger projects. GitLab supports them too.

Note the Markdown link there that brings you help for Markdown formatting – knowing about ` and ``` is a must for every programmer :-).

For further reading, look up the query on how to write a good bug report in your favourite search engine and read at least one article about it. It is worth it. Seriously.

Git branches (let us fix the bugs)

Git has a concept of branches that represent a series of commits. So far, your commits were linear – every commit (except the very first and last, obviously) had one previous commit and one next commit, the ordering represented how the commits happened one after another in time.

Branching in Git allows you to break this linearity – a commit can have multiple successors: your work diverges and each branch follows its own course – for example, each one can be adding a different feature. A commit can also have two parents: you merge changes from the branches together.

A typical example of this is work in a team. Alice and Bob both work on the same project, sharing the same repository. Alice works on feature A, Bob works on feature B. They both started with the same commit (e.g., just after the previous version was released), but their work diverges: each adds new functions for the feature they work on. Once they are satisfied with their work, they merge their changes – i.e., they combine their code to get a new version which contains both features.

The following pictures show examples of simple branches (in MSIM) as well as a more complicated situation in a mid-size open-source project (HelenOS).

Actually, you have already worked with Git branches unaware of that. When you cloned a repository, you created an exact copy of the state that was on the server. When you added new commits locally, they appeared on a branch which is technically different from the branch on the server. A push then joined these two branches again. A pull works in the opposite direction: it brings new commits from the server and merges them to your local branch. But so far, the branches never diverged in your work – one branch was always a prefix of the other one, so the merges were implicit and all the branching was completely transparent.

If you forgot to git pull (or git push) the chances are that Git/GitLab already created some extra branches and merges for you. Now we will do it intentionally :-).

Commit messages (again)

Read How to Write a Git Commit Message by Chris Beams if you have not read it yet. If you did, feel free to read it again.

We are now hopefully past the state when Git was an enemy, so we should work on improving our habits.

From now on, start using reasonable commit messages. Commit messages are a part of the development and are almost as important as the code they refer to. Especially in big teams.

Test (exam) 02 will be about Git: the quality of your commit messages would be also evaluated. Do not postpone practicing the skill of naming things correctly and compactly.
Actually, if you have troubles naming your commit, you probably did too much in one commit. It is the same as with functions (yes, again!): if you cannot describe them in a single sentence, something is wrong.

Feature branches

To keep your code healthy, many projects follow a simple rule: commit as often as possible, but code in the main branch (while the name is configurable, most of the time you will encounter names master or main) must be always correct (in the sense that all tests are passing).

If you are working on a new feature, start a new branch. Work in that branch and merge your code into master only when the feature is completed. For some projects, pre-merge action often involves code review or even load testing.

It has the clear advantage that whenever someone starts on a new branch, they can be reasonably sure that they are starting from a healthy code.

Depending on the project, there may be also extra branches like development where merging does not require code review or a branch named production that denotes the code that will be shipped (possibly automatically) to the customer.

We recommend reading A Simple Git Branch Workflow (dev.to) if you are interested in more details how this model works.

We will follow the above for example in this lab. For each new feature (or a bugfix) we will start a new branch and merge it to the main one only after we have tested it.

Creating the branch in Git

We will start by fixing the issue with raised exception when file is not found. Let’s create a branch for that and switch to that branch.

Bigger teams often have conventions for branch naming. Let us keep things simple and use issue/N-name for branches that are supposed to fix an issue with number N.

To create the branch, we will use git branch command.

git branch issue/1-hide-traceback-file-not-found

We use the id of the issue to create a unique branch name (hence to prevent any clashes with other developers) but we also append a short summary so that we know what we are working on.

This command does not do anything visible. It only marks the current (last) commit as the starting point for a new branch.

To actually switch to a new branch, we need to execute

git checkout issue/1-hide-traceback-file-not-found

Right now, the switch has no visible effect – both master and issue/1-hide-traceback-file-not-found branches refer to the same state of files.

You can also remember a single command instead that both creates the branch and switches to it.

git checkout -b issue/1-hide-traceback-file-not-found
There is also a new command called git switch that would probably eventually replace git checkout. It is still marked as experimental so we intentionally do not use it here.

Sometimes an implementation details from Git leak. In case of branches, you generally cannot create branches issue and issue/1 at the same time.

Generally, naming conventions for branches etc. should be documented in some kind of development documentation.

Now, write a fix for this issue. Solution.

Commit the change.

Connecting commits with issues and git commit --amend

Git is quite flexible when working with commits. If you realize that you want to change the last commit, you can git add files and then call git commit --amend. It will open your text editor with the commit message already filled-in so you can change it.

Never --amend a commit that you have already pushed to the server. That commit could be already cloned by someone else and things would start to break (as a matter of fact, it would be possible to fix things because the commits would basically behave as branches but it is probably not something you want to do).

Also, if you have changed an already-pushed commit, you would need to do a forced push to overwrite the commit on the server. For many projects that is not possible on the master branch at all. So, preferably avoid amending pushed commits.

Technically, git commit --amend creates a new commit in place of the original one. That has several subtle implications, the most important is that the histories before and after the amending are different ones from Git’s perspective. This means that if you have already pushed the original commit, you should not amend it, because it will be difficult to push the new commit (because it doesn’t extend the history on the server).

Use this feature and add to your last commit message words fixes #1. This will have two effects once you push this commit to GitLab. First of all, the issue will contain a link to the commit and the #1 in the commit message will become clickable to open the mentioned issue.

Because our commit fixed the issue, we have added the special keyword fixes to the commit message to automatically close the issue (there are plenty of issue closing patterns out there).

The issue will be closed once the commit is merged to the master branch. That makes a lot of sense: the issue might be fixed but until the code is in the master branch, the program still contains the bug (recall that usually master branch is the code that is shipped to the customer).

Note that it serves too purposes – it saves time (we do not have to switch to the browser at all) and it provides a valuable reference to which commit was actually responsible for fixing the bug. Note that the issue on GitLab is not yet marked as fixed, as we didn’t push any commits to the GitLab yet.

You should not have any uncommitted changes in your project. Let’s switch back to the master branch. Check that the script (after the switch) does not contain your fix. Hint.

Have you checked it? Good, we can go on.

Note that if you have your script opened in a text editor, it should warn you about file being changed on disk. If not, reload the file manually.

Sidenote about git add --patch (or git add -p)

Sometimes you may encounter a situation when you change two things in the same file and yet they are unrelated. For example, when you fix a functional bug but also notice a typo in a comment. In such situation, you should split it into two commits.

When the changes are in two different files, it is easy: you simply call git add on the first file, git commit and then you git add the second file.

When the changes are in the same file, it is possible to use git add -p (for --patch) where Git interactively shows every hunk – basically one change – and asks whether it should be git add-ed or not.

With git add -p you can easily split your commits and also review what you have done without any crazy un-do/re-do mumbo-jumbo in your editor.

But note that there are limitations to what git can do when it comes to considering what a change is. Sometimes when the changes are close to each other (i.e., on the same line), git won’t consider them as two different hunks. You will be forced to manually edit the hunk (e) by following the instructions provided by Git.

Pushing a new branch

Switch back to the issue/1-hide-traceback-file-not-found branch and push it to GitLab. If you run git push (as you were used to), Git will complain that the current branch has no upstream branch. It means (more or less) that you are pushing this branch for the first time and Git wants to make sure how to name the branch at the server.

Nice thing is that Git offers you the command to run to ensure the branch is pushed.

git push --set-upstream origin issue/1-hide-traceback-file-not-found

For now, ignore the link that GitLab sent you back.

Open your project in the browser again. Check that your issue now contains a link to the commit that mentioned it and on the homepage of the project, you can select which branch to display.

Exercises

Try it on your own now.

Fix handling of non-integer values

Let’s now fix the second issue (the one when data contains non-integer values). Create a new branch, resolve the issue and commit the fix.

Do not push the branch yet.

Some questions and thoughts:

  • Why do you need to switch to master first? How would the branching look like if you branch from issue/1-hide-traceback-file-not-found? Why is that bad?
  • To actually fix the issue, simply skip the line (but warn the user about it!)
  • Do not forget to include closes #2 (or similar) in the commit message.

Solution.

Hot fix the typo in README.md

Let’s assume that you just now noticed the typo in README.md (look up the word valeus).

We want to fix that right away and we will do it (just this one time) directly in the master branch.

This is often called hot-fix: something you need to fix ASAP and where breaking the usual habit of feature branch, code review, testing etc. is a hinderance instead of help (though this really depends on the team and product you work on).

So, switch to the master branch (you already committed the fix to issue #2, right?), fix the typos and commit it.

Push your changes from the master branch.

Commit graph

Open the Repository -> Graph page in your browser (from your project). It should show you your branches graphically.

You should see a new branch, issue/1-hide-traceback-file-not-found next to master that stem from the same commit.

The graphical view is a good help if you get lost in a complicated branching model and you are not sure whether some changes should be visible or not in a specific branch.

The purpose is not to create a complicated graphs though sometimes it can be quite wild.

You can also use --graph parameter for git log to have a graphical representation in the terminal.

Merge requests

Merge requests are an advanced feature of GitLab which targets big teams. In big teams, code review is required before any code can be pushed to the master branch.

Code review usually means that a senior developer views your code, comments on it and can ask for further modifications. Think about renaming functions, using different data structures or fixing documentation. Anything from functional bugs to code style.

Merge requests can be used for exactly that. Before you actually merge your code (i.e., push your changes from the feature branch to the main development branch), you can open a so-called merge request.

Merge requests are akin to issues: as a matter of fact, the forms for creating an issue and a merge request look quite similar. This is because Issues describe known problems (or feature requests), while merge requests describe how the problem was fixed. As GitLab states it, merge requests are a place to propose changes you’ve made to a project and discuss those changes with others.

It is a good practice to mention which issue the merge request closes (or issues that are related).

Again, Markdown formatting can be used.

In big teams, other developers would comment on your code as a part of the merge request and also automated checks would be run (you will set up automated checks in some of the next labs).

Even for personal repositories, merge requests can still make sense: they allow the developer to quickly check that everything is okay (i.e., that all files were actually committed etc.).

Advancing the running example

Switch to the branch for the second issue and push it to GitLab, too. You will need to use the --set-upstream switch again.

Ensure you have the closes #2 in the commit message for this branch. If not, git commit --amend can help you.

Notice that after the push, you ought to see a text informing you about opening a merge request with a link.

Open that link in your browser now.

You will notice that the merge request is not submitted yet. The title and description are pre-filled and they look similar to the form we have seen with issues.

Create the merge request now.

Double-check the destination of the merge request. It has to be a branch in your repository, not the repository you forked from.

Let’s merge the request now (there is a big button for that).

Keep the defaults and do a merge (i.e., not a rebase nor a squash).

Ensure that you have pushed the typo hot-fix first (i.e. before merging).

The merge request being closed, we should see a new commit in the master branch.

You may also look at the repository graph again to see how the commits look after the merge.

Check issues of your project now and note that the second issue should have been closed now. You can also check the details of the issue and notice how the commit is nicely connected to the issue.

Back in your local clone of the repository: do not forget to pull the latest changes from master (GitLab created the commit on the server only). Hint.

Merging on the command line

We will now merge the first issue directly on command line without opening a merge request. Because the merge request is always bound to some kind of a branch, you can always merge on the command line, too.

Note again the dual approach which is omnipresent in Linux: you can use nice graphical UI, but also a fully automatable command-line interface.

First, we need to ensure that we are on the branch we want to merge into. Usually, that would be the main branch.

Also make sure your main branch is up-to-date (in our example, make sure you have pulled the changes merged in GitLab UI).

The actual merge is quite simple, indeed.

# ensure we are on main branch (git checkout master)
git merge issue/1-hide-traceback-file-not-found

And it is done. Push the master branch again and check the repository graph now. Note that the merge is actually just a commit that has two different commits as parents (previous commits). Indeed, the most of the options are similar in both subcommands (i.e., commit and merge).

Viewing list of branches and branch deletion

To view list of branches, we can simply call the following command.

git branch

Sometimes it is useful to view all branches including those on the remotes (see later on) by adding -a.

Once the branch is merged, we can remove it to keep the list clean.

git branch -d issue/1-hide-traceback-file-not-found
Deleting a branch does not delete its commits that were already merged.

Instead, removing the branch simply removes the label that stated that particular commits belonged to a particular branch. That is why Git will not ask for confirmation with -d because you are not discarding any actual code or any commits. However, if the branch is not yet merged, Git will refuse to delete the it (with an error message stating that the branch is not fully merged and a hint to use capital -D if you really wish to delete it).

Merging upstream changes (keeping your branch up-to-date) and working with remotes

A feature branch allows you to work on a new feature without disrupting the main branch. But the work in the main branch still continues and you want to keep your branch up-to-date.

That is a very common task. You do not want to miss important updates which are happening in master. As a matter of fact, failing to keep your branch up-to-date with master can complicate merging later on. Depending on the size and activity of the project, it might make sense to merge with the main branch every week or even every day.

Keeping your branch synchronized is often referred to as merging from upstream as that refers to the parent project (branch).

You will see that this is not different at all from any other merging. It is only about choosing the right direction (i.e., your changes to the main branch or changes from the main branch to your feature branch).

Git will always help you with the merging and in most cases, it will be a completely automated process.

We will simulate that work in the upstream repository (i.e., the one you forked from) continues and you want to keep your repository (your fork) up-to-date.

With Git, all this is possible and (maybe surprisingly) there is very little difference whether you merge your own (local) branch or changes of someone else working in a completely different fork.

Depending on the development workflow, you would be either working in a branch of the main project or you would have your fork and work in the forked repository. The first approach is usual for closed teams (e.g., in a company) while the second approach is more useful for open projects where anyone is welcomed to contribute.

To merge changes from a different repository than the default one (e.g., a different project on GitLab), we need to set-up so called remotes.

A remote is a Git name for saying that your local clone also knows about other forks and it can tell you whether there are differences. Again, this is an overly simplified way of looking at things, but is sufficient for the how-do-you-do of Git remotes. Usually you expect that the remotes share a common ancestor, i.e. the initial commits are the same across remotes.

To see your remotes, run (inside your local clone of your fork of the examples repository)

git remote

It would probably print only origin. That is the default remote: when you do git pull or git push, it uses origin. Thus, you were using remotes even without knowing about it ;-).

As with virtually everything in Git, even the default remote can be configured. Look for tracking branch in the manual for details.

Running it with -v (for verbose) will print what are the specific URLs where the remote is located. As a matter of fact, you will probably see two remotes now: one for push, one for fetch (pull). You can even configure Git to pull from a different repository than you are pushing too. Not very useful for us at the moment, though.

To see even more details, try git remote show origin.

Adding another remote

Before continuing, ensure your public key works for our repository at gitolite3@linux.ms.mff.cuni.cz:lab05-LOGIN (see lab 05 for details).

Let us add a new remote to our repository. This will refer to a different project, so that we can compare our changes with theirs (again, a simplified view of things).

git remote add upstream gitolite3@linux.ms.mff.cuni.cz:lab07-group-sum-ng.git

But wait. This is not GitLab repository! That is fine. We will add a remote living somewhere else. They share the same Git history and things will work.

The above command added a remote named upstream that points to the given address. Note that Git is silent in this case.

Run git remote again. How it changed?

Note that our repository contains the -ng suffix as a new generation, i.e. we are (kind of) simulating that the original project you have forked from is stale but someone else took over and continues with the development.

Working with remotes

By adding the remote, no data were exchanged yet. You have to tell Git to do everything, nothing happens automagically. Note that if you ever encounter a different versioning system, Git will feel very low-level and perhaps even tedious to use. It is the price for its effectiveness and flexibility.

Let’s fetch the changes from our new remote now.

git fetch upstream

You should see the typical summary as when cloning/pulling changes in Git. This time it referred to data from the upstream repository.

However, in your working tree (i.e., the directory with your project), nothing changed. That is fine, we only asked to fetch the changes, not to apply them.

However, run git branch and git branch --all to see which branches you have access to now.

Note that adding a remote does not start any communication with the remote server, Git only writes down the configuration. git fetch than actually retrieves the changes from the remote server. Without git fetch, we would have no information about the actual code available on that particular remote.

Comparing branches (and merging them too)

Now, we will investigate how the newly added remote differs.

Ensure you are on the master branch at the moment.

Let’s start with showing commits on the remote:

git log remotes/upstream/topic/tests

As you can see, git log can show commits on a certain branch only (yes, the remotes/... is actually a branch name: after all, you have seen it in git branch --all). And it also works on files (e.g., git log -- README.md). It is quite powerful command indeed.

Actually, git log README.md would work too as long as Git is able to differentiate between file names and branch names. Because file names usually differ from branch names, it will work in most cases without explicit -- too.

But we wanted to see how the code differs. That is actually even more important: you want to see which changes to the code were made and whether it would be possible to merge them at all.

git diff remotes/upstream/topic/tests

You ought to see a patch that displays that the newly added remote differs in one file only: automated tests were added.

They look pretty good – we want them in our project, too.

Let’s merge the remote branch (topic/tests), then:

git merge remotes/upstream/topic/tests

Since there shall be no conflicts (i.e., both branches – master and remotes/upstream/topic/tests – changed different files), the merge should be automatically completed.

Check your project directory: is the tests.bats file there?

Note that you can change the merge commit message using --amend.

Merging conflicts (and their resolution)

Using the same approach, prepare for merge (i.e., do not run git merge yet) with upstream/hotfix/readme-typo.

As you probably noticed, the second branch contains a typo fix. But you have already fixed it (if not, fix it before merging!).

The merge will lead to so-called conflict: two developers touched the same file and made their individual modifications. We would need to resolve that manually.

That is quite common and there is no need to be afraid of it. Git is able to help you a lot – when there are changes to different parts of a file, Git is able to merge the changes without any problems. But when both branches change the same lines, it is up to you to resolve it. That is quite natural and you would be surprised how many times Git is able to merge things automatically.

Enough of theory, run the merge command now:

git merge remotes/upstream/hotfix/readme-typo

This merge will end with an error and Git will inform you about the conflict.

Review the output from the merge command. Note how Git tries to help you what can be done…

Run also git status and investigate its output.

Now comes the tricky part of the whole workflow: you need to resolve the conflict.

In our case, it is rather simple. For complex software, resolving a conflict can be a very tricky operation as you need to check several places and mentally combine the changes first. Having automated tests can help, but analytical thinking is certainly a plus.

Once you solve the conflict (i.e. you modify the sources), you need to call git add (like with a normal commit – a merge commit is still a commit, after all) to mark the conflict as resolved.

git add README.md

To finish the merge, run git commit as with any normal commit.

Do not forget to push the changes to your repository.

How would the graphical representation of the commits in GitLab look like now?

Try to sketch it on a paper before opening the Graphs page in GitLab.

While Git is often able to perform the merge on its own, it is always the user that is responsible that the result makes sense.

It is easy to imagine a situation where two developers change different parts of the program (without breaking anything) yet the combined change cannot work.

Having good tests certainly helps in these situations.

Branching summary

The above topics represent about 90% of what you will ever need to know about branches for everyday development.

We have not covered advanced topics such as history rewriting, rebasing etc. You can always find them in the Git Book.

The rest of the lab is devoted to some extra topics that you can skip :-).

End of the lab contains another example to try branches completely on your own.

Check you understand it all

Select all true statements. You need to have enabled JavaScript for the quiz to work.

Completely unrelated topics :-)

Take the following as an excursion that scripts are not only about dull text files but the possibilities are endless.

VLC

VLC is a popular multimedia player available for many platforms. We will not bore you with GUI but instead show several interesting command-line options. They might be useful in specific setups such as kiosk-mode (e.g., on an exhibition) or similar.

Following command-line switches will play the given video but split it into four different windows. Useful if you have multiple monitors with very thin edges, perhaps.

vlc --video-splitter wall --wall-cols 2 --wall-rows 2 --wall-element-aspect 4:3 video.mpg

VLC can be also used to stream video over the network. Assuming you have files 1.mp4, 2.mp4 and 3.mp4 in your current directory, prepare the following file (name it vod.vlc):

new channel1 vod
setup channel1 input 1.mp4
setup channel1 enabled

new channel2 vod
setup channel2 input 2.mp4
setup channel2 enabled

new channel3 vod
setup channel3 input 3.mp4
setup channel3 enabled

Each block setups one video-on-demand configuration (i.e., the client asks for a particular video and the VLC server will provide it).

Next, we will start command-line VLC in server mode to listen for RTSP connections on port 5554.

cvlc --vlm-conf vod.vlc  --rtsp-tcp --rtsp-port 5554

We can now play the selected video with the following command.

vlc rtsp://127.0.0.1:5554/channel3

The above assumes you play the video on the same machine. The first command can be extended with --rtsp-host to listen on another interface to stream the video over network (and run the second command on a different machine).

youtube-dl

youtube-dl is a video downloader that is able to find and download videos from various websites. As a command-line utility, it can be used in scripts or for downloading videos for later viewing (i.e., download with fast connectivity and view later).

One of the supported sites is OpenClassroom of the Standford University. Here, some of the videos are in FLV format but youtube-dl can convert the video by simply adding --recode-video mp4:

youtube-dl --recode-video mp4 "http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=IntroToAlgorithms&video=CS161L1P1&speed=100"

youtube-dl supports many other sites: downloading from many of them is explicitly prohibited by their terms of use (for many using similar tools is bordering on the concept of fair use policy).

ffmpeg

Another interesting tool is ffmpeg that is a general converter of video formats. Apart from trivial conversion between various formats it can do a plethorea of extra effects.

As usual: the advantage of command-line interface is substantial for mass conversions where no user interactivity is needed.

For example, the following command converts the audio to AAC while keeping video without any modification. We used to convert with it the lab videos during remote teaching so they work in Firefox too.

ffmpeg -i "input_file.mp4" -c:v copy -c:a aac "output_file.mp4"

But it can do much more than that. Let’s download the following clips first and save them as 1.mp4, 2.mp4 and 3.mp4.

https://www.videvo.net/video/airport-departure-board/6001/
https://www.videvo.net/video/swiss-aircraft-taking-off/4061/
https://www.videvo.net/video/airplane-window-view/5176/

The following command then puts these clips next to each other into a single video. It uses multiple -i to actually load multiple input streams and then it uses a complex filter to stack the videos next to each other. The -t is used to limit the conversion to first 10 seconds only.

The variables are used only to convey the whole pipeline. In f_rescale we rescale all the videos. The [0] denotes the first input file (first in the array of input files), the [v0] is a user identifier to name the result of the filter. In f_stack, we use the xstack filter with the given layout specification – see this page if you are interested in details about positioning with 0_0|w0_0|w0_h1.

f_rescale='[0]scale=-1:360[v0];[1]scale=-1:180[v1];[2]scale=-1:180[v2]'
f_stack='[v0][v1][v2]xstack=inputs=3:layout=0_0|w0_0|w0_h1[v]'

ffmpeg -i 1.mp4 -i 2.mp4 -i 3.mp4 -filter_complex "$f_rescale;$f_stack" -map "[v]" -t 10 -vsync 2 output.mp4

If you need to perform the above for a single video, using an interactive editor would be certainly easier. But if you need to process multiple videos with same or similar configuration, ffmpeg might be a better choice.

Tasks to check your understanding

We expect you will solve the following tasks before attending the labs so that we can discuss your solutions during the lab.

Copy the following code to 07/avg.py in your submission project into the master branch.

#!/usr/bin/env python3

import sys

def main():
    values = list(map(int, sys.argv[1:]))
    avg = sum(values) / len(values)
    print(avg)

if __name__ == "__main__":
    main()

The script computes the average of numbers provided on the command line, e.g. running ./avg.py 1 2 6 prints 3.0.

Create an issue in your submission repository describing the problem when the program is run without any extra arguments.

Then create a Git branch named lab-07/avg where you fix the code to also work when no arguments are provided (the program will print 0 in that case).

Close the issue from a commit.

Your repository must contain at least two commits for this file: one importing the current solution and one (in the lab-07/avg branch) fixing the issue (that is what the automated tests will check).

We are experiencing some unexpected issues with this task: if your task fails on GitLab, try restarting the job after a while.

See also this issue on the Forum.

This example can be checked via GitLab automated tests.

You perhaps noticed that your submission repository is actually a fork of another one. (That was for technical reasons as it simplified the creation of the repository for us.)

But it means that you can merge from it a change from us. It now contains file 07/UPSTREAM.md with an artificial content.

Merge this file into your repository. Do not rebase or squash, do a normal merge, please.

Do not copy the file, but use Git to perform the merge.

Tests may start to fail after some time (GitLab clones only recent history).

That is fine as long as the test was passing at some point.

We are experiencing some unexpected issues with this task: if your task fails on GitLab, try restarting the job after a while.

See also this issue on the Forum.

This example can be checked via GitLab automated tests.

There are only the two extra examples above for this lab to check the basic skills you just acquired.

The best way to practice using Git branches is to start using them regularly.

When solving any exercise (either from this or from other courses): set yourself a goal to always create a fresh branch for it and merge it to the main branch only when it is ready.

Many of the merges will be fast-forwards (i.e. Git will just move the commits as they would be linear with respect to master) but that is fine.

And of course, use issues and name your branches/commits reasonably.

And that is all about advanced Git :-).

Learning outcomes

Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).

Conceptual knowledge

Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …

  • explain main differences between a project fork and repository working copy (clone)

  • explain what is a Git branch

  • explain what is a feature branch

  • explain what is branch merging in Git

  • explain what is a merge (pull) request and when it can be used

  • explain what is a Git remote

  • explain when Git merge conflict can arise and how it can be resolved

  • explain what is usually meant by upstream project (repository)

Practical skills

Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …

  • create Git branch locally using git branch

  • push new Git branch to remote server

  • create merge request from a feature branch in GitLab

  • switch between branches (git checkout)

  • merge Git branches locally using git merge

  • fix merge conflicts

  • setup Git remotes

  • optional: use the youtube-dl utility

  • optional: use VLC media player from command-line

This page changelog

  • 2024-04-04: Note about spurious faults in GitLab.

  • 2024-04-18: Clarify conflict resolution.