The goal of this lab is to introduce you to the Git command-line client and how to write reusable scripts. We will demonstrate how Linux is suited for interpreted languages. And we will make our work with GitLab much more efficient and see how to transfer files from it and back to it via a command-line client.
Linux scripting
A script in the Linux environment is any program that is interpreted when run (i.e., the program is distributed as source code). In this sense, there are shell scripts (the language is the shell as you saw it last time), Python, Ruby or PHP scripts.
The advantage of so-called scripting languages is that they require only a text editor for development and that they are easily portable. The disadvantage is that you need to install the interpreter first. Fortunately, Linux typically comes with many interpreters preinstalled, so starting with a scripting language is very easy.
Simple shell scripts
To write a shell script, we simply write the commands into a file (instead of typing them in a terminal).
Therefore, a script that prints some information about your system could be as simple as the following.
cat /proc/cpuinfo
cat /proc/meminfo
If you store this into a file first.sh, then you can execute it with the following command.
bash first.sh
Notice that we have executed bash, as that is the shell program (interpreter) that we are using, and passed it the name of the input file. It will cat those two files (note that we could have executed a single cat with two arguments as well).
Recall that your Python script (factor.py in this example) can be executed with the following command (again, we run the right interpreter).
python3 factor.py
Shebang and executable bit
Running scripts by specifying the interpreter to use (i.e., the command to run the script file with) is not very elegant. There is an easier way: we mark the file as executable and Linux handles the rest.
Actually, when we execute the cat command or mc, there is a file (usually in the /bin or /usr/bin directory) that is named cat or mc and marked executable. (For now, imagine the special executable mark as a special file attribute.) Notice that there is no file extension.
However, marking the file as executable is only the first half of the solution.
Imagine that we create the following content and store it into a file hello.py marked as executable.
print("Hello")
And then we want to run it.
But wait! How will the system know which interpreter to use? For binary executables (e.g., originally from C sources), it is easy as the binary is (almost) directly in the machine code. But here we need an interpreter first.
In Linux, the interpreter is specified via the so-called shebang or hashbang. As a matter of fact, you have already encountered it several times: when the first line of the script starts with #! (hence the name hash and bang), Linux expects a path to the interpreter after it and will run this interpreter and ask it to execute the script.
For shell scripts, we will be using #!/bin/bash, for Python we need to use #!/usr/bin/env python3. We will explain the env later on; for now, please just remember to use this version.
Now back to the original question: how is the script executed? The system takes the command from the shebang, appends the actual filename of the script as a parameter, and runs that. When the user specifies more arguments (such as --version), they are appended as well.
For example, if hexdump were actually a shell script, it would start with the following:
#!/bin/bash
...
code-to-loop-over-bytes-and-print-them-goes-here
...
Executing hexdump -C file.gif would then actually execute the following command:
/bin/bash hexdump -C file.gif
Notice that the only magic thing behind shebang and executable files is that the system assembles a longer command line.
The user does not need to care about the implementation language.
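The assembled command line can be observed directly. The following is only a minimal sketch, assuming a Linux system with /bin/bash and a temporary directory that allows executing files; the script name show-args is made up for illustration.

```shell
# Create a throwaway directory for the experiment.
workdir=$(mktemp -d)

# A tiny script (hypothetical name) that reports how it was invoked.
cat > "$workdir/show-args" <<'EOF'
#!/bin/bash
echo "script: $0"
echo "arguments: $*"
EOF
chmod +x "$workdir/show-args"

# Run it directly: the system reads the shebang and starts /bin/bash
# with the script path and our arguments appended.
direct=$("$workdir/show-args" -C file.gif)

# Run the longer command line ourselves.
explicit=$(/bin/bash "$workdir/show-args" -C file.gif)

echo "$direct"
echo "$explicit"
rm -rf "$workdir"
```

Both invocations should print the same two lines, which is the whole point: the shebang only saves us from typing the interpreter.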
Let us try it practically.
We know about the shebang, so we will update our example and also mark the file as an executable one.
Store the following into first.sh.
#!/bin/bash
cat /proc/cpuinfo
cat /proc/meminfo
To mark it as executable, we run the following command. For now, please remember it as a magic spell that must be cast; more details on why it looks like this will come later.
chmod +x first.sh
chmod will not work on file systems that are not Unix/Linux-friendly. That unfortunately includes even NTFS.
Now we can easily execute the script with the following command:
./first.sh
The obvious question is: why the redundant ./? It refers to the current directory after all, right (recall the previous lab)? So it refers to the same file!
When you type a command (e.g., cat) without any path (i.e., only the bare filename containing the program), the shell looks into the so-called $PATH to actually find the file with the program (usually, $PATH would contain the directory /usr/bin where most of the executable binaries are stored). Unlike in other operating systems, the shell does not look into the working directory when the program cannot be found in the $PATH.
To run a program in the current directory, we need to specify its path (when any extra path is provided, the shell ignores $PATH and simply looks for the file). Luckily, it does not have to be an absolute path; a relative one is sufficient. Hence the magic spell of ./.
If you move to another directory, you can execute it by providing a relative path too, such as ../first.sh.
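The lookup rules can be tried out safely in a scratch directory. This is a sketch under the assumption that no directory in your $PATH contains a file named first.sh and that $PATH does not include the current directory; the temporary directory exists only for this illustration.

```shell
workdir=$(mktemp -d)
printf '#!/bin/bash\necho "hello from first.sh"\n' > "$workdir/first.sh"
chmod +x "$workdir/first.sh"

# Where does the shell find cat? Prints a path such as /usr/bin/cat.
command -v cat

# A bare name is searched for in $PATH only, so the lookup fails
# even when we are standing in the directory containing the file.
bare=$(cd "$workdir" && if command -v first.sh >/dev/null 2>&1; then echo "found"; else echo "not found"; fi)

# With a path component (here ./), $PATH is not consulted at all.
relative=$(cd "$workdir" && ./first.sh)

echo "bare name: $bare"
echo "with ./:   $relative"
rm -rf "$workdir"
```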
Run ls in the directory now. You should see first.sh now printed in green. If not, you can try ls --color or check that you have run chmod correctly.
If you do not have a colorful terminal (unusual but still possible), you can use ls -F to distinguish file types: directories will have a slash appended, executable files will have an asterisk next to their filename.
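If you prefer to check the executable bit from a script rather than by colour, the shell's own test for executability can be used. A small sketch, assuming a default umask (so that new files are created without the executable bit) and a throwaway directory:

```shell
workdir=$(mktemp -d)
printf '#!/bin/bash\necho hi\n' > "$workdir/first.sh"

# [ -x FILE ] succeeds only when we may execute FILE.
if [ -x "$workdir/first.sh" ]; then before="executable"; else before="not executable"; fi

chmod +x "$workdir/first.sh"
if [ -x "$workdir/first.sh" ]; then after="executable"; else after="not executable"; fi

echo "before chmod: $before"
echo "after chmod:  $after"

# ls -F marks the file with an asterisk once the bit is set.
listing=$(ls -F "$workdir")
echo "$listing"
rm -rf "$workdir"
```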
Exercise
Create a script that prints all image files in current directory (for now, you can safely assume there will always be some). Try to run it from different directories using relative and absolute path.
Create a script that prints information about currently visible disk partitions in the system. For now, it will only display contents of /proc/partitions.
Changing working directory
Let us modify our first script a little bit.
cd /proc
cat cpuinfo
cat meminfo
Run the script again.
Despite the fact that the script changed directory to /proc, when it terminates, we are still in the original directory. Try inserting pwd to ensure that the script really is inside /proc.
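The same observation can be captured in a few lines. This is a sketch with a made-up script name (go-proc.sh) placed in a temporary directory; it assumes /proc exists, as it does on Linux.

```shell
workdir=$(mktemp -d)
cat > "$workdir/go-proc.sh" <<'EOF'
#!/bin/bash
cd /proc
pwd
EOF
chmod +x "$workdir/go-proc.sh"

before=$(pwd)
# The script runs as a separate process with its own working directory.
"$workdir/go-proc.sh" > "$workdir/inside.txt"
after=$(pwd)
inside=$(cat "$workdir/inside.txt")

echo "caller before: $before"
echo "inside script: $inside"
echo "caller after:  $after"
rm -rf "$workdir"
```

The first and last lines should show the same directory, while the script itself reports /proc.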
Debugging the scripts
If you want to see what is happening, run the script as bash -x first.sh. Try it now.
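To get a feeling for what the trace looks like without touching your own script, here is a minimal sketch that traces two commands given inline via bash -c. It assumes the default trace prefix (a plus sign and a space); only the trace, which goes to standard error, is captured.

```shell
# -x prints each command before running it; the regular output of pwd
# is discarded here so that only the trace remains.
trace=$(bash -x -c 'cd /proc; pwd' 2>&1 >/dev/null)
echo "$trace"
```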
For longer scripts, it is better to print your own messages as -x tends to become too verbose.
To print a message to the terminal, you can use the echo command. With few exceptions (more about these later), all arguments are simply echoed to the terminal.
Create a script echos.sh with the following content and explain the differences:
#!/bin/bash
echo alpha bravo charlie
echo alpha   bravo   charlie
echo "alpha bravo" charlie
Command-line arguments
Command-line arguments (such as -l for ls or -C for hexdump) are the usual way to control the behaviour of CLI tools in Linux. For us, as developers, it is important to learn how to work with them inside our programs.
We will talk about using these arguments in shell scripts later on; today we will handle them in Python.
Accessing these arguments in Python is very easy. We need to add import sys to our program and then we can access these arguments in the sys.argv list. Therefore, the following program only prints its arguments.
#!/usr/bin/env python3

import sys

def main():
    for arg in sys.argv:
        print("'{}'".format(arg))

if __name__ == '__main__':
    main()
When we execute it (of course, first we chmod +x it), we will see the following (lines prefixed with $ denote the command, the rest is command output).
$ ./args.py
'./args.py'
$ ./args.py one two
'./args.py'
'one'
'two'
$ ./args.py "one two"
'./args.py'
'one two'
Note that the zeroth index is occupied by the command itself (we will not use it now, but it can be used for some clever tricks) and notice how the second and third commands differ from inside Python.
It should not be surprising though, recall the previous lab and handling of filenames with spaces in them.
Other interpreters
We will now try what other interpreters we can put in the shebang.
Construct an absolute (!) path (hint: man 1 realpath) to the args.py that we have used above. Use it as a shebang on an otherwise empty file (e.g. use-args) and make this file executable.
And now run it like this:
./use-args
./use-args first second
You will see that the argument zero now contains a path to your script. Argument on index one contains the outer script, use-args, and only after these items are the actual command-line arguments (first and second).
While it may seem like an exercise in futility, it demonstrates an important principle: GNU/Linux is extremely friendly towards the creation of mini-languages. If you need to create an interpreter for your own mini-language, you only need to make sure it accepts the input filename as the first argument. And voilà, users can create their own executables on top of it.
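To make the principle concrete, here is a sketch of a toy interpreter written in shell. Everything in it is invented for illustration: the interpreter is called shout, its "language" simply prints every non-comment line in upper case, and the program written in that language is called greeting. It assumes a Linux system (where the interpreter named in a shebang may itself be a script) and a temporary directory that allows execution.

```shell
workdir=$(mktemp -d)

# The interpreter: it receives the program file as its first argument.
cat > "$workdir/shout" <<'EOF'
#!/bin/bash
# Skip lines starting with # (including the shebang), shout the rest.
grep -v '^#' "$1" | tr '[:lower:]' '[:upper:]'
EOF
chmod +x "$workdir/shout"

# A program in the mini-language; the shebang holds the absolute path
# to our interpreter.
printf '#!%s/shout\nhello from a mini-language\n' "$workdir" > "$workdir/greeting"
chmod +x "$workdir/greeting"

# The user just runs the program and never sees the interpreter.
result=$("$workdir/greeting")
echo "$result"
rm -rf "$workdir"
```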
As another example, prepare the following file, store it as experiment (with no file extension), and make the file executable:
#!/bin/bash
echo Hello
Note that we decided to drop the extension again altogether. The user does not really need to know which language was used. That is captured by the shebang, after all.
Now change the shebang to #!/bin/cat. Run the program again. What happens? Now run it with an argument (e.g., ./experiment experiment). What happened?
Change the shebang to #!/bin/echo. What happened?
Git principles
So far, our interaction with GitLab was over its GUI. We will switch to the command line for higher efficiency now.
Recall that GitLab is built on top of Git which is the actual versioning system used.
Git offers a command-line client that can download the whole project to your machine, track changes in it, and then upload it back to the server (GitLab in our case, but there are other products, too).
Before diving into Git itself, we need to prepare our environment a bit.
Setting your editor
Git will often need to run your editor. It is essential to ensure it uses the editor of your choice.
We will explain the following steps in more detail later on; for now, ensure that you add the following line to the end of your ~/.bashrc file (replace mcedit with the editor of your choice):
export EDITOR=mcedit
Now open a new terminal and run (including the dollar sign):
$EDITOR ~/.bashrc
If you set the above correctly, you should again see .bashrc opened in your favorite text editor. If not, ensure you have really modified your .bashrc file (in your home directory) to contain the same as above (no spaces around = etc.).
Do not set a graphical editor as $EDITOR unless you really know what you are doing. Git expects a certain behaviour from the editor that is rarely satisfied by GUI editors but is always provided by a TUI-based one.
The git command
Virtually everything around Git is performed by its git command. Its first argument is always the actual action, often called a subcommand, that we want to perform. For example, there is git config to configure Git and git commit to perform a commit (create a version).
There is always a built-in help available via the following command:
git SUBCOMMAND --help
Manual pages are also available as man git-SUBCOMMAND.
Git has over 100 subcommands available. Don’t panic, though. We will start with less than 10 of them and even quite advanced usage requires knowledge of no more than 20 of them.
Configure Git
One of the key concepts in Git is that each commit (change) is authored – i.e., it is known who made it. (Git also supports cryptographic signatures of commits, so that authorship cannot be forged, but let us keep things simple for now.)
Thus, we need to tell Git who we are. The following two commands are the absolute minimum you need to execute on any machine (or account) where you want to use Git.
git config --global user.name "My real name"
git config --global user.email "my-email"
The --global flag specifies that this setting is valid for all Git projects. You can change this locally by running the same command without this flag inside a specific project. That can be useful to distinguish your freelance and corporate identity, for example.
Note that Git does not check the validity of your e-mail address or your name (indeed, there is no way to do it). Therefore, anything can be there. However, if you use your real e-mail address, GitLab will be able to pair the commit with your account etc., which can be quite useful. The decision is up to you.
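A per-project identity can be tried out without touching your global settings. The following sketch creates a scratch repository in a temporary directory; the name and address are placeholders, not values you should use.

```shell
repo=$(mktemp -d)
git init -q "$repo"

# Without --global the setting is stored in this repository only.
git -C "$repo" config user.name "Freelance Me"
git -C "$repo" config user.email "me@example.com"

# Reading the value back shows what Git would use for commits made here.
name=$(git -C "$repo" config --get user.name)
echo "identity used in this repository: $name"
rm -rf "$repo"
```

The -C option merely tells git which directory to operate in, so the example does not need to change the working directory.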
Working copy (a.k.a. using Git locally)
The very first operation you need to perform is the so-called clone. During cloning, you copy your project source code from the server (GitLab) to your local machine. The server may require authentication for cloning to happen.
Cloning also copies the whole history of the project. Once you clone the project, you can view all the commits you have made so far, without any need for an internet connection.
The clone is often called a working copy. As a matter of fact, the clone is a 1:1 copy, so if someone deleted the project, you would be able to recreate the source code without any problem. (That is not true about the Issues or the Wiki as it applies only to the Git-versioned part of the project.)
As you will see, the whole project as you see it on GitLab becomes a directory on your hard drive. As usual, there are also GUI alternatives to the commands we will be showing here, but we will focus our attention on the CLI variants only.
Cloning for the first time (git clone)
For the following example, we will be using your submission repository under teaching/nswi177/2023/.
Move to your project (in the web browser) and click on the blue Clone button. You should see Clone with SSH and Clone with HTTPS addresses.
Copy the HTTPS address and use it as the correct address for the clone command:
git clone https://gitlab.mff.cuni.cz/teaching/nswi177/2023/student-LOGIN.git
The command will ask you for your username and password. As usual with our GitLab, please use the SIS credentials.
Note that some environments may offer to use some kind of a keyring or another form of a credential helper (to store your password). Feel free to use them. Later on, we will see how to use SSH and asymmetric cryptography for seamless work with Git projects without any need for username/password handling.
It seems that some environments are rather forceful in their propagation of their password helpers (and if you enter your password incorrectly the first time, they do not provide a simple way to clear it).
Try running the following first if you encounter HTTP Basic: Access denied. and no password prompt is shown (see also this issue).
export GIT_ASKPASS=""
export SSH_ASKPASS=""
git clone ...
Note that you should have the student-LOGIN directory on your machine now. Move to it and see what files are there. What about hidden files?
Unless stated otherwise, all commands will be executed from the student-LOGIN directory.
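If you want to experiment with cloning without any network access or passwords, a repository on the local disk can play the role of the server. This is only a sketch: server.git is an empty stand-in created for the occasion, not your real submission repository.

```shell
workdir=$(mktemp -d)

# Stand-in for the server: an empty bare repository on the local disk.
git init -q --bare "$workdir/server.git"

# Clone it; only the address differs from the HTTPS case.
# (Git warns on standard error that the repository is empty.)
git clone -q "$workdir/server.git" "$workdir/student-LOGIN" 2>/dev/null

# -A lists hidden entries too (all except . and ..).
entries=$(ls -A "$workdir/student-LOGIN")
echo "$entries"
rm -rf "$workdir"
```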
After the project is cloned, you can start editing files. This is completely orthogonal to Git and until you explicitly tell Git to do something, it does not touch your files at all.
Once you are finished with your changes (e.g., you fixed a certain bug), it is time to tell Git about the new revision.
Making changes (git status and git diff)
Before changing any file locally, open a new terminal and run git status. You should see something like this.
$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
We will now do a trivial change. Open the README.md file in your project (locally, i.e., not in GitLab browser UI) and add a link to the Forum there. Notice how links are created in Markdown and add your link as the last paragraph.
Run git status after the change. Read carefully the whole output of this command to understand what it reports.
Create a new file, 03/editor.txt and put into it the name of the editor that you have decided to use (feel free to create directory 03 in some graphical tool or use mkdir 03). Again, check how git status reports this change in your project directory. What have you learned?
Run git diff to see how Git tracks the changes you made. You will see a list of modified files (i.e., their content differs from the last commit) and you can also see a so-called diff (sometimes also called a patch) that describes the change.
The diff will typically look like this:
diff --git a/README.md b/README.md
index 39abc23..61ad679 100644
--- a/README.md
+++ b/README.md
@@ -3,3 +3,5 @@
 Submit your solutions for all graded tasks and quizzes here.
 See details at course homepage: <https://d3s.mff.cuni.cz/teaching/nswi177/>
+
+Forum is at ...
How to read it? It is a piece of plain text that contains the following information:
- the file where the change happened
- the context of the change
  - line numbers (-3,3 +3,5)
  - lines without modifications (starting with space)
- the actual change
  - lines added (starting with +)
  - lines removed (starting with -)
Why is this output suitable for source code changes?
Note that git diff is also extremely useful to check that the change you made is correct as it focuses on the context of the change rather than the whole file.
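The difference between what git status and git diff report can be reproduced in a scratch repository. The sketch below uses a placeholder identity and invented file contents; it relies on the compact --short form of git status, where M marks a modified tracked file and ?? marks untracked content.

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"      # placeholder identity
git -C "$repo" config user.email "user@example.com"

# First revision of a small README.
printf 'Submit your solutions here.\n' > "$repo/README.md"
git -C "$repo" add README.md
git -C "$repo" commit -q -m "Add README"

# Change a tracked file and create a new, untracked one.
printf '\nForum is at ...\n' >> "$repo/README.md"
mkdir "$repo/03"
echo "mcedit" > "$repo/03/editor.txt"

# One line per path: modified README.md and untracked directory 03/.
status=$(git -C "$repo" status --short)
echo "$status"

# git diff describes changes to tracked files only.
git -C "$repo" diff
changed=$(git -C "$repo" diff --name-only)
rm -rf "$repo"
```

Notice that the new file shows up in the status but not in the diff, because Git has not been told to track it yet.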
Making the change permanent (git add and git commit)
Once you are happy with these changes, you can stage the changes. This is Git-speak for saying these files (their current content) will be in the next revision. Often, you will stage all changed files. But sometimes you may want to split the commit as you actually worked on two different things and first you commit one part and then the other.
For example, you were fixing a bug, but also encountered a typo somewhere along the way. It is possible to add them both to the same commit, but it is much better to keep the commits well organized. The first commit would be a Bugfix in XY, the second one will be Typo fix.
That clearly states what the commit changed. It is actually similar to how you create functions in a programming language. A single function should do one thing (and do it well). A single commit should capture one change.
Now prepare your first commit (recall that commit is basically a version or a named state of the project): run git add 03/editor.txt. We will take care of the extension in README.md later.
How does git status differ from the previous state?
After staging all the relevant changes (i.e. git add-ing all the needed files), you create a commit. The commit clears the staging status and you can work on fixing another bug :-).
Make your first commit via git commit. Do not forget to use a descriptive commit message!
Note that without any other options, git commit will open your text editor. Write the commit message there and quit the editor (save the file first). Your commit is done.
For short commit messages, you may use git commit -m "Typo fix" where the whole commit message is given as argument to the -m option (notice the quotes because of the space).
What will git status look like now? Think about it first before actually running the command!
You basically repeat this as long as you need to make changes. Recall that each commit should capture a reasonable state of the project that is worth returning to later.
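Splitting unrelated changes into separate commits could look like the following sketch. It runs in a scratch repository with a placeholder identity, and the file names (program.txt, notes.txt) are invented; the commit messages are the ones used as examples in the text.

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"      # placeholder identity
git -C "$repo" config user.email "user@example.com"

# Two unrelated changes are present in the working tree at once.
echo "fixed code" > "$repo/program.txt"
echo "fixed spelling" > "$repo/notes.txt"

# Stage and commit them one at a time.
git -C "$repo" add program.txt
git -C "$repo" commit -q -m "Bugfix in XY"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "Typo fix"

# Print only the commit messages, newest first.
subjects=$(git -C "$repo" log --format=%s)
echo "$subjects"
rm -rf "$repo"
```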
Sending the changes to the server
To upload the changes (commits) back to the server, you need to initiate a so-called push. It uploads all new commits (i.e., those between your clone operation and now) back to the server. The command is rather simple.
git push
It will again ask for your password and after that, you should see your changes on GitLab.
Which changes are on GitLab?
Exercise
Add the link to Forum as a second commit from the command line.
As a third commit, create the 03/architecture.sh script that contains the right shebang, is executable and prints the current architecture (if you skipped this task in the previous lab, simply run only uname there or look up the right switch in the man page now).
Now push the changes to GitLab. Note that all commits were pushed at the same time.
Browsing through the commits (git log)
Investigate what is in the Repository -> Commits menu in GitLab. Compare it with the output of git log and git log --oneline.
Yes, commands can be even that simple.
Getting the changes from the server
Change the title in the README.md to also contain for YOUR NAME. But this time make the change on GitLab.
To update your local clone of the project, execute git pull.
What is the easiest way to ensure that you have also the change in README.md on your machine after git pull?
Note that git pull is quite powerful as it can incorporate changes that happened virtually at the same time in both GitLab web UI as well as in your local clone. However, understanding this process requires also knowledge about branches, which is out-of-scope for this lab.
For now, the safest habit is ending each work session with git push and starting with git pull.
Working on multiple machines
Things get a little bit more complex when you work on multiple machines (e.g., mornings at a school desktop, evenings at your personal notebook).
Git is really powerful and can do extraordinary merges of your work.
But for now it is best to ensure the following workflow to minimize introducing incompatible changes.
Note that if things go horribly wrong, you can always do a fresh clone to a different directory, copy the files manually and remove the broken clone.
As long as you ensure that you work in the following manner, nothing will ever break:
1. Clone your work on machine A.
2. Work on machine A (and commit the result).
3. Push on A (to server).
4. Move to machine B and clone there.
5. Work on B (commits).
6. Push on B (to server).
7. Move to A and pull (from server).
8. Work on A (commits).
9. Push on A.
10. Pull on B.
11. Work on B.
12. Etc (i.e., go to 5).
If you forget one of the synchronizing pulls or pushes when switching between machines, problems can arise. They are easy to solve, but we will talk about that in later labs.
For now, you can always do a fresh clone and simply copy files with the new changes and commit again (not the right Git way, but it definitely works).
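The push/pull discipline above can be rehearsed on a single computer by letting two directories play the two machines. The sketch below makes several assumptions: a local bare repository stands in for GitLab, machine-a and machine-b are just directory names, and the identity is a placeholder set by a small helper function defined only for this example.

```shell
workdir=$(mktemp -d)
git init -q --bare "$workdir/server.git"   # stand-in for GitLab

# Helper for this example only: set a placeholder identity in a clone.
set_identity() {
    git -C "$1" config user.name "Example User"
    git -C "$1" config user.email "user@example.com"
}

# Machine A: clone, work, commit, push.
git clone -q "$workdir/server.git" "$workdir/machine-a" 2>/dev/null
set_identity "$workdir/machine-a"
echo "morning work" > "$workdir/machine-a/notes.txt"
git -C "$workdir/machine-a" add notes.txt
git -C "$workdir/machine-a" commit -q -m "Morning work on A"
# The server was empty, so we name the target explicitly this once.
git -C "$workdir/machine-a" push -q -u origin HEAD

# Machine B: clone, work, commit, push.
git clone -q "$workdir/server.git" "$workdir/machine-b"
set_identity "$workdir/machine-b"
echo "evening work" >> "$workdir/machine-b/notes.txt"
git -C "$workdir/machine-b" commit -q -a -m "Evening work on B"
git -C "$workdir/machine-b" push -q

# Back on machine A: pull before doing anything else.
git -C "$workdir/machine-a" pull -q

# Both clones now point at the same newest commit.
head_a=$(git -C "$workdir/machine-a" rev-parse HEAD)
head_b=$(git -C "$workdir/machine-b" rev-parse HEAD)
synced=$(cat "$workdir/machine-a/notes.txt")
echo "$synced"
rm -rf "$workdir"
```

As long as every switch between the two directories is bracketed by a push and a pull, the pull is a simple fast-forward and no merging is needed.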
Going further
Running tests locally
Because you now know about shebangs, executable bits and scripts in general, you have enough knowledge to actually run our tests locally without needing GitLab.
It should make your development faster and more natural as you do not need to wait for GitLab.
Simply execute ./bin/run_tests.sh in the root directory of your project and check the results.
You can even run only a specific subset of tests.
./bin/run_tests.sh 03-before
./bin/run_tests.sh 03-post
./bin/run_tests.sh 03-before/architecture
Note: If you are using your own installation of Linux, you might need to install the bats (or bash-bats or bats-core) package first.
Before-class tasks (deadline: start of your lab, week February 27 - March 3)
The following tasks must be solved and submitted before attending your lab. If you have lab on Wednesday at 10:40, the files must be pushed to your repository (project) at GitLab on Wednesday at 10:39 at the latest.
For virtual lab the deadline is Tuesday 9:00 AM every week (regardless of vacation days).
All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).
03/git.txt (40 points, group git)
You will need the following repository.
https://d3s.mff.cuni.cz/f/teaching/nswi177/202223/labs/task-03.git/
There are multiple files in this repository.
Copy the one mentioned in the commit messages to 03/git.txt.
In other words, clone the above repository, view existing commits and in the commit messages, you will see a filename that you should copy to your own project (as 03/git.txt).
Automated tests only check presence of the file, not that you have copied the right one.
03/architecture.sh (30 points, group shell)
Extend your solution from the previous lab and write a script that prints what hardware architecture your computer has.
Ensure your script has the right shebang and executable bit set.
03/editor.txt (30 points, group git)
Store into this file the name of your text editor that you use from the command line. Save simply the command that you execute such as joe.
If you have gone through the text above, you are already done :-).
Post-class tasks (deadline: March 19)
We expect you will solve the following tasks after attending the labs and hearing feedback to your before-class solutions.
All tasks (unless explicitly noted otherwise) must be submitted to your submission repository. For most of the tasks there are automated tests that can help you check completeness of your solution (see here how to interpret their results).
Using Git on the command line (70 points, group git)
This task is not tested through automated tests in GitLab.
We need to distribute passwords for this repository to you (we do not want to bind it with your SIS account). We will do that during week 02.
For this task you will be using your SIS/GitLab login but a different password.
The password was uploaded to the Wiki that is part of your NSWI177 project.
The information is on a page called Secrets.
See the screenshot below for details how to find the page.
You will need the following repository (obviously, replace LOGIN with your SIS/GitLab login). Use the password from the Wiki page Secrets (recall that pasting can be done simply by selecting the text here and pasting it into the terminal with middle mouse click).
This URL has no browser-friendly version, so do not be surprised by a 404 if you open it in a web browser.
https://lab.d3s.mff.cuni.cz/nswi177/git-03/LOGIN.git
After you clone it, create a file 03.txt inside it. Make two commits with this file. In the first commit, insert 2022 as its only content (i.e., to 03.txt). As a second commit, modify it to 2023. Push your changes back to the repository.
03/local.txt (30 points, group shell)
In this task, you will only store a specific string into this file. The correct answer is printed by the automated tests when you execute them locally (i.e., with 03-post/local).
Learning outcomes
Learning outcomes provide a condensed view of fundamental concepts and skills that you should be able to explain and/or use after each lesson. They also represent the bare minimum required for understanding subsequent labs (and other courses as well).
Conceptual knowledge
Conceptual knowledge is about understanding the meaning and context of given terms and putting them into context. Therefore, you should be able to …
- explain what is a script in a Linux environment
- explain what is a shebang (hashbang) and how it influences script execution
- understand the difference when script has or does not have executable bit set
- explain what is a working directory
- explain why working directory is private to a running program
- explain how are parameters (arguments) passed in a script with a shebang
- explain what is a Git working copy (clone)
- optional: explain why cd cannot be a normal executable file like /usr/bin/ls
- optional: understand major differences between /bin/sh and /bin/bash shebangs
Practical skills
Practical skills are usually about usage of given programs to solve various tasks. Therefore, you should be able to …
- create a Linux script with correct shebang
- set the executable bit of a script using the chmod utility
- access command-line arguments in a Python program
- configure author information in Git
- set up default editor in a shell (set EDITOR in ~/.bashrc)
- clone a Git repository over HTTPS in shell
- review changes in a Git working copy (git status command)
- create a Git commit from command-line (git add and git commit commands)
- upload new commits to Git server or download new ones to a working copy (assuming single user project, git push and git pull commands)
- view summary information about previous commits using git log
- optional: customize Git with aliases
This page changelog
- 2023-02-27: Warning about password helpers and executable bit in GitLab UI.
- 2023-02-24: Add connection details about the Git repository from post-class task.