The goal of this lab is to introduce you to the Git command-line client and how to write reusable scripts.
Scripts
A script in the Linux environment is any program that is interpreted when run (i.e., the program is distributed as source code). In this sense, there are shell scripts (the language is the shell as you saw it last time), Python scripts, Ruby scripts, etc.
The advantage of so-called scripting languages is that they require only a text editor for development and that they are easily portable. The disadvantage is that you need to install the interpreter first. Fortunately, Linux typically comes with many interpreters preinstalled, so starting with a scripting language is very easy.
Simple shell scripts
To write a shell script, we simply write the commands into a file (instead of typing them in a terminal).
As a simple example, we want to show a hexadecimal dump of a file that we download from the Internet. We will download a GIF sample from the page referenced in the command below.
The commands would be
wget "http://www.matthewflickinger.com/lab/whatsinagif/images/sample_1.gif" -O /tmp/mf_sample.gif
hexdump -C /tmp/mf_sample.gif
Store the above in a file named first.sh.
Now cd into the directory with this file and run
sh first.sh
What happened?
Notice two things in the script: we have used quotes around the URL and we stored the file in /tmp.
Last time, we used quotes when there was a space in the filename. Generally, you would use quotes when there is a possibility that the argument could be “tricky”. We will talk about this later on when talking about variables and their expansion; for now, remember that quotes around arguments are generally a safe way to prevent surprises.
Change the script to first execute cd /tmp and use relative file paths. Then run the script again.
What happened? After the script terminated, is your shell in /tmp?
Answer.
This is an essential takeaway – scripts (or any programs, for that matter) can change their working (current) directory, but the working directory is always local to the process (running program). Thus, when the program terminates, the caller (i.e., the shell) is still in the same directory.
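The same behaviour can be observed from Python. The following is only a minimal sketch: the function name cwd_after_child_cd is made up for this illustration, and it assumes that sh is available on the machine. It starts a child shell that changes its directory and then checks that the parent process stayed where it was.

```python
import os
import subprocess


def cwd_after_child_cd(target="/tmp"):
    """Run `cd target` in a child shell; return (child_cwd, parent_cwd)."""
    # The child shell changes its own working directory and prints it.
    result = subprocess.run(
        ["sh", "-c", 'cd "$1" && pwd', "sh", target],
        capture_output=True,
        text=True,
        check=True,
    )
    child_cwd = result.stdout.strip()
    # The parent (this Python process) is not affected by the child's cd.
    return child_cwd, os.getcwd()


if __name__ == "__main__":
    before = os.getcwd()
    child, after = cwd_after_child_cd("/tmp")
    print("child ended in:", child)
    print("parent unchanged:", before == after)
```

The child process plays the role of the script here and the Python program plays the role of your interactive shell.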
If you want to see what is happening, run the script as sh -x first.sh. Try it now.
For longer scripts, it is better to print your own messages, as -x tends to become too verbose and is rather a debugging aid.
To print a message to the terminal, you can use the echo command. With a few exceptions (more about these later), all arguments are simply echoed to the terminal.
Create a script echos.sh with the following content and explain the differences.
echo alpha bravo charlie
echo alpha   bravo    charlie
echo "alpha   bravo"   charlie
Answer.
If you have some Python script on the disk, you can execute it as
python script.py
Executable bit
Running scripts by specifying the interpreter to use (i.e., the command to run the script file with) is not very elegant. Linux offers another way: we mark the file as executable and Linux handles the rest.
Actually, when we execute the cat command or mc, there is a file (usually in the /usr/bin directory) that is named cat or mc and that is marked executable. Notice that there is no extension.
To get an idea of the number of programs installed, look into /usr/bin.
To mark our first script as executable, simply run chmod +x first.sh.
We will talk about other features of chmod (and access rights in general) later on; for now, remember only chmod +x.
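To see what the executable bit actually is, here is a minimal Python sketch. The helper names add_execute_bits and is_executable_by_owner are invented for this illustration. It approximates what chmod +x does under the common umask of 022 (the real chmod takes the umask into account, which this sketch ignores).

```python
import os
import stat


def add_execute_bits(path):
    """Roughly what `chmod +x path` does under a typical umask (022)."""
    mode = os.stat(path).st_mode
    # Add the execute permission for the owner, the group and others.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def is_executable_by_owner(path):
    """Check whether the owner's execute bit is set in the file mode."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

Calling add_execute_bits("first.sh") on a file with permissions rw-r--r-- would leave it with rwxr-xr-x.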
Run ls in the folder again. You should see first.sh now printed in green. If not, you can try ls --color or check that you have run chmod correctly.
When you type a command (e.g., cat), the shell looks into the directories listed in the so-called $PATH variable to actually find the file with the program. Unlike in other operating systems, the shell does not look into the working directory when the program cannot be found in the $PATH.
To run a program in the current directory, we need to specify its path. Luckily, it does not have to be an absolute path, but a relative one is sufficient. Thus, we need to execute
./first.sh
Try it yourself.
If you are in a different directory, running ../first.sh (or similar) would work too.
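The lookup rule can be tried out from Python with shutil.which from the standard library, which performs a similar search. This is only a sketch: the wrapper find_command is a name made up here, and the search path is passed explicitly so that the result does not depend on your own $PATH.

```python
import shutil


def find_command(name, search_path):
    """Look up a bare command name in a colon-separated list of directories."""
    # Returns the full path of the first match, or None when nothing is found.
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # A standard tool is found in one of the listed directories.
    print(find_command("cat", "/usr/bin:/bin"))
    # A script that exists only in the working directory is not found,
    # because the working directory is not among the searched directories.
    print(find_command("first.sh", "/usr/bin:/bin"))
```

This is why first.sh has to be started as ./first.sh: a name containing a slash is treated as a path and no search takes place.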
Shebang (hashbang)
Create the simplest Python program now. It should contain only print("Hello"), without main or any other content.
Store it in a file named hello.py, make the script executable, and run it. Hint.
The result is not very satisfying, but the reason is very simple. Linux executed this script as a shell one!
To fix that, we need to specify which interpreter to use. This is done via so-called shebang or hashbang. As a matter of fact, you have already encountered it several times.
If the first line of the script starts with #! (hence the name: hash and bang), Linux expects a path to the interpreter after it and will use this interpreter instead of the default sh.
It is a good practice to specify the interpreter always and never rely on the default fallback to the shell script.
For shell scripts, we will be using #!/bin/bash; for Python, we need to use #!/usr/bin/env python3.
In small print, note that most interpreters use # to denote a comment, which means that no extra handling is needed to skip the first line (as it is really not needed by the interpreter).
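If you are curious which interpreter a given script asks for, the first line can be inspected with a few lines of Python. This is a minimal sketch with a made-up function name (read_shebang). It only reads the line and does not try to imitate everything the kernel does with it.

```python
def read_shebang(path):
    """Return the text after '#!' on the first line of a file, or None."""
    with open(path, "rb") as script:
        first_line = script.readline()
    # Only a file that starts with the two characters '#!' has a shebang.
    if not first_line.startswith(b"#!"):
        return None
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

For a script starting with #!/bin/bash the function returns /bin/bash, and for a file without a shebang it returns None.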
You probably noticed that when we executed the script, we used sh and not bash. You are completely right, and at this moment, sh or bash would make no difference (sh refers to the original shell born in the seventies, bash is its improved version). Later on, we will be using some advanced features that are not present in plain sh, and it is just easier to remember now to use /bin/bash there.
For Python and other more complex languages, you will often see the variant with env and python3 (or ruby). We will talk about this later on; for now, please just remember to use this version.
Fix hello.py from the beginning of this section and run it again.
Answer.
Command-line arguments
Command-line arguments (such as -l for ls or -C for hexdump) are the usual way to control the behaviour of CLI tools in Linux. For us, as developers, it is important to learn how to work with them inside our programs.
We will talk about using these arguments in shell scripts later on; today we will handle them in Python.
Accessing these arguments in Python is very easy. We need to add import sys to our program and then we can access these arguments in the sys.argv list.
Write a program that prints its arguments. Answer.
Let us execute it.
./args.py
./args.py one two
./args.py "one two"
Note that index zero is occupied by the command itself (we will not use it now, but it can be used for some clever tricks), and notice how the second and third commands differ when seen from inside Python.
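The splitting into arguments is done by the shell before your program even starts. The standard shlex module follows similar quoting rules, so it can serve as a rough model of that step. It is only an approximation, as it does not perform variable expansion or other shell features, and split_like_shell is a name invented for this sketch.

```python
import shlex


def split_like_shell(command_line):
    """Split a command line into arguments using shell-like quoting rules."""
    return shlex.split(command_line)


if __name__ == "__main__":
    print(split_like_shell("./args.py one two"))    # ['./args.py', 'one', 'two']
    print(split_like_shell('./args.py "one two"'))  # ['./args.py', 'one two']
```

The quotes themselves never reach the program; they only tell the shell where one argument ends and the next one begins.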
Other interpreters
Look at the following script and explain what it does (we will need it later).
#!/usr/bin/env python3

import sys

def run_with_file(input_file):
    total = 0
    for line in input_file:
        line = line.strip()
        if (not line) or line.startswith('#'):
            continue
        parts = line.split()
        if parts[0] == 'echo':
            print(total)
        elif parts[0] == 'add':
            total += int(parts[1])
        else:
            print("Unknown command ('{}')!".format(parts[0]))

def main():
    if len(sys.argv) != 2:
        print("Run with exactly one argument - filename with commands.")
        return
    with open(sys.argv[1]) as inp:
        run_with_file(inp)

if __name__ == '__main__':
    main()
Answer.
We will now explore which interpreters we can put into the shebang.
Construct an absolute (!) path (hint: man 1 realpath) to the args.py we used above. Use it as a shebang on an otherwise empty file (e.g., use-args) and make this file executable.
Hint.
And now run it like this:
./use-args
./use-args first second
You will see that argument zero now contains the path to your script (args.py). The argument at index one contains the outer script – use-args – and only after these items come the actual command-line arguments (first and second).
This is essential – when you add a shebang, the interpreter receives the input filename as the first argument. In other words – every Linux-friendly interpreter shall start evaluating a program passed to it as a filename in the first argument.
While it may seem like an exercise in futility, it demonstrates an important principle: GNU/Linux is extremely friendly towards the creation of mini-languages. If you need to create an interpreter for your own mini-language (such as the summation one at the beginning of this section), you only need to make sure it accepts the input filename as the first argument. And voila, users can create their own executables on top of it.
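The rule can be written down as a small model. The sketch below is a simplification rather than a description of the kernel code: the function argv_after_shebang and the path /home/user/args.py are hypothetical, and the model assumes the Linux behaviour in which everything after the interpreter path is passed as a single optional argument.

```python
def argv_after_shebang(shebang_line, script_path, script_args):
    """Model the argument list that the interpreter is started with."""
    rest = shebang_line[2:].strip()      # drop the leading '#!'
    parts = rest.split(None, 1)          # interpreter and the optional remainder
    argv = [parts[0]]
    if len(parts) > 1:
        argv.append(parts[1].strip())    # the remainder is ONE argument
    argv.append(script_path)             # then the path of the script itself
    argv.extend(script_args)             # and finally the user's arguments
    return argv


if __name__ == "__main__":
    # Hypothetical absolute path used as the shebang of use-args.
    print(argv_after_shebang("#!/home/user/args.py", "./use-args", ["first", "second"]))
    # ['/home/user/args.py', './use-args', 'first', 'second']
```

Since args.py itself starts with #!/usr/bin/env python3, the same rule is applied once more, and Python finally sees the path to args.py followed by ./use-args, first and second.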
As another example, prepare the following file, store it as experiment (i.e., no file extension), and make the file executable.
#!/bin/bash
echo Hello
Note that we decided to drop the extension again altogether. The user does not really need to know which language was used. That is captured by the shebang, after all.
Now change the shebang to #!/usr/bin/cat.
Run the program again. What happens?
Now run it with an argument (e.g., ./experiment experiment). What happened?
Answer.
Change the shebang to #!/usr/bin/echo. What happened?
Answer.
Git on command-line
This section will describe how to use Git on the command-line as opposed to using the GUI superstructure offered by GitLab. We already described the motivation for both Git and GitLab in Lab #1. Here we will show how to access the files from the command-line to improve your experience when using Git.
While it is possible to edit many files on-line in GitLab, it is much easier to have them locally and use a better editor (or IDE). Furthermore, not all tools have their on-line counterparts and you have to run them locally.
Therefore, Git offers a command-line client that can download the whole project to your machine, track changes in it, and then upload it back to the server (GitLab in our case but there are other products too).
As you will see, the whole project as you see it on GitLab becomes a directory on your hard-drive and the whole process of submitting changes is much easier. As usual, there are also GUI alternatives to the commands we will be showing here, but we will devote our attention to the CLI variants only.
Setting your editor
Git will often need to run your editor. It is essential to ensure it uses the editor of your choice.
We will explain the following steps in more detail later on; for now, ensure that you add the following line to the end of your ~/.bashrc file (replace mcedit with the editor of your choice).
export EDITOR=mcedit
Now open a new terminal and run (including the dollar sign)
$EDITOR ~/.bashrc
If you set the above correctly, you should again see .bashrc opened in your favorite text editor.
You need to close all terminals for this change to take effect (i.e., before you start using any of the Git commands mentioned below).
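Programs such as Git learn about your choice by reading environment variables. As a rough illustration, the following Python sketch mimics the documented order in which Git picks an editor: GIT_EDITOR, then the core.editor configuration, then VISUAL, then EDITOR, and finally a built-in default (usually vi). The function pick_editor is made up for this example and simplifies the real logic.

```python
import os


def pick_editor(env=None, core_editor=None):
    """Pick an editor in roughly the order of preference that Git documents."""
    env = os.environ if env is None else env
    candidates = (
        env.get("GIT_EDITOR"),   # Git-specific environment variable
        core_editor,             # value of the core.editor setting, if any
        env.get("VISUAL"),
        env.get("EDITOR"),       # the variable we set in ~/.bashrc
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return "vi"                  # assumed built-in default


if __name__ == "__main__":
    print(pick_editor({"EDITOR": "mcedit"}))  # mcedit
```

With only EDITOR set, as in this lab, the editor from ~/.bashrc wins.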
Manpages for Git
The Git CLI client is generally used as
git subcommand --options-for-subcommand
That is, you always run git and its first argument is the name of the Git command to execute.
Manual pages for Git are split into separate pages named git-subcommand (that is, the command config is documented in man 1 git-config). You can also run git subcommand --help or even git help subcommand.
Configure Git
One of the key concepts in Git is that each commit (change) is authored – i.e., it is known who made it. We will skip commit signing and will not consider identity forgery or theft here.
Thus, we need to tell Git who we are. The following two commands are the absolute minimum you need to execute on any machine (or account) where you want to use Git.
git config --global user.name "My real name"
git config --global user.email "my-email"
The --global flag specifies that this setting is valid for all Git projects. You can change this locally by running the same command without this flag inside a specific project. That can be useful to distinguish your freelance and corporate identity, for example.
Note that Git does not check the validity of your e-mail address or your name (indeed, there is no way to do it). Therefore, anything can be there. However, if you use your real e-mail address, GitLab will be able to pair the commit with your account, which can be quite useful. The decision is up to you.
Cloning for the first time (git clone)
For the following example, we will be using the repository teaching/nswi177/2021-summer/common/csv-templater.
Fork this repository to your own namespace (in GitLab via web browser) first. Hint.
Forking a project means creating a copy for yourself on GitLab. Create the fork – you do not have write access to our repository and we do not want you to fight over the same files anyway.
Move to your (forked) project and click on the blue Clone button. You should see Clone with SSH and Clone with HTTPS addresses.
Copy the HTTPS address and use it as the address for the clone command.
git clone https://gitlab.mff.cuni.cz/YOUR_LOGIN/csv-templater.git
The command will ask you for your username and password. As usual with our GitLab, please use the SIS credentials.
Note that some environments may offer you a keyring or another form of credential helper. Feel free to use them; later on, we will see how to use SSH and asymmetric cryptography for seamless work with Git projects without any need for username/password handling.
You should now have the csv-templater directory on your machine. Move to it and see what files are there. What about hidden files?
Answer.
Unless stated otherwise, all commands will be executed from the csv-templater directory.
Making a change (git status and git diff)
Fix typos on line 11 in the Python script and in README.md, and run git status before and after the change. Read the whole output of this command carefully to understand what it reports.
UPDATE In the process of cleaning the script we also removed the typos ;-). Thus for the emulated fix, change (with Python formatting) to (with Python-style formatting). Sorry.
Create a new file, demo/people.csv, with at least three columns and four rows. Again, check how git status reports this change in your project directory.
What have you learned? Answer.
Run git diff to see how Git tracks the changes you made. Why is this output suitable for source code changes?
Note that git diff is also extremely useful for checking that the change you made is correct, as it focuses on the context of the change rather than the whole file.
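The output of git diff uses the so-called unified diff format. Python's standard difflib module can produce output in a similar format, so you can experiment with it without touching a repository. The following is a minimal sketch: the helper name unified and the sample text are chosen only for this illustration, and the result is not byte-for-byte what Git prints.

```python
import difflib


def unified(old_text, new_text, filename="README.md"):
    """Return the lines of a unified diff between two versions of a text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="a/" + filename,
            tofile="b/" + filename,
            lineterm="",
        )
    )


if __name__ == "__main__":
    old = "Templates\n(with Python formatting)\nUsage\n"
    new = "Templates\n(with Python-style formatting)\nUsage\n"
    print("\n".join(unified(old, new)))
```

Removed lines are prefixed with a minus sign, added lines with a plus sign, and the unchanged lines around them provide the context.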
Making the change permanent (git add and git commit)
Now prepare for your first commit (recall that a commit is basically a version or a named state of the project) – run git add csv_templater.py. We will take care of the typo in README.md later.
How does git status differ from the previous state?
Answer.
Make your first commit via git commit. Do not forget to use a descriptive commit message!
Note that without any other options, git commit will open your text editor. Write the commit message there, save the file, and quit the editor. Your commit is done.
For short commit messages, you may use git commit -m "Typo fix", where the whole commit message is given as an argument to the -m option (notice the quotes because of the space).
What will git status look like now? Think about it first before actually running the command.
Sending the changes to the server
We will now propagate your changes back to GitLab by using git push. It will again ask for your password and after that, you should see your changes on GitLab.
Which changes are on GitLab? Answer.
Exercise
Commit the fix of the second typo (the one in README.md) as a second commit from the command-line.
By the way, have you tried running the CSV templater? Add the following example to the README as the third commit.
./csv_templater.py -t demo/breed.txt demo/patrol.csv
As another commit, add the CSV file with extra data you created some time ago. Hint.
Now push the changes to GitLab. Note that all commits were pushed at the same time.
Browsing through the commits (git log)
Investigate what is in the Repository -> Commits menu in GitLab. Compare it with the output of git log and git log --oneline.
Getting the changes from the server
Add another example to the README but this time make the change on GitLab.
./csv_templater.py -t demo/call.txt -o "call-{name}.txt" demo/patrol.csv
To update your local clone of the project, execute git pull.
Note that git pull is quite powerful, as it can incorporate changes that happened virtually at the same time both in the GitLab web UI and in your local clone. However, understanding this process also requires knowledge of branches, which is out of scope for this lab.
Thus, for now, do not mix local changes with changes made in the GitLab UI (or on a different machine): always start your work with git pull and end it with git push.
Other bits
Again, several assorted notes that do not fit into the sections above but are worth knowing.
If you do not have a colorful terminal (unusual but still possible), you can use ls -F to distinguish file types: directories will have a slash appended, executable files will have a star next to their filename.
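The two markers mentioned above can be reproduced with a short Python sketch that looks at the file mode. The function classify is a name invented here, and it covers only these two cases, whereas the real ls -F knows more file types (such as symbolic links).

```python
import os
import stat


def classify(path):
    """Return the file name with a type suffix, similar in spirit to `ls -F`."""
    mode = os.lstat(path).st_mode
    name = os.path.basename(path)
    if stat.S_ISDIR(mode):
        return name + "/"        # directories get a slash
    if stat.S_ISREG(mode) and mode & 0o111:
        return name + "*"        # regular files with any execute bit get a star
    return name
```

For example, after chmod +x first.sh, calling classify("first.sh") would return first.sh*.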
Running ./tools/run_tests.sh from inside your clone of your grading repository (i.e., student-YOUR_LOGIN) will run the tests that are normally executed in GitLab.
Refer to the graded tasks section for another take on the tests that do not require downloading all tests again and again.
Graded tasks
From now on, prefer to use the command-line client for submitting the tasks.
Using Git CLI (15 points)
Use git config to temporarily change your e-mail to YOUR_SIS_LOGIN@gitlab.mff.cuni.cz (replace YOUR_SIS_LOGIN with your actual login, of course) and make one commit to your graded task repository with this e-mail. You can create a new file 03/git_cli.txt if you do not know what to change ;-).
Add the word graded-task to the commit message, please.
This task is not automatically checked by the nswi177-tests pipeline (CI) on GitLab.
03/scoring.py (40 points)
Write a Python program that recognizes the following mini-language for computing tournament scoring.
add team-zulu task1 5
add team-alpha task1 10
add team-alpha task2 5
add team-bravo task1 10
add team-yankee task1 5
summary After first week
csv week1.csv
add team-zulu task2 15
summary End of tournament
podium
We expect that your program can be used in a shebang (recall how sys.argv shall be used) for such data/program. That is, adding #!/absolute/path/to/03/scoring.py as the first line and making the file executable with chmod +x would allow running the above as a script.
We expect that the above results in (note the ordering)
After first week
team-alpha: 15
team-bravo: 10
team-yankee: 5
team-zulu: 5
End of tournament
team-alpha: 15
team-bravo: 10
team-yankee: 5
team-zulu: 20
Medal podium
team-zulu
team-alpha
team-bravo
and the file week1.csv would contain (again, ordered by team name)
team,score
team-alpha,15
team-bravo,10
team-yankee,5
team-zulu,5
Your program does not need to handle wrong input or CSV file creation issues (such as path to non-existent directory etc.).
You can choose to implement only part of the assignment: we consider the commands podium and csv as extras; most points will be awarded for add and summary (they really do not make sense without each other).
The podium command should print the teams sorted by points (if two teams have the same number of points, their order doesn’t matter).
Feel free to reuse our code as a starting point (but not the code of your mates in the course).
Update: any text after summary is supposed to be copied as-is, i.e., it represents a user-defined title. For podium, the title is always Medal podium; the text after csv represents a file path.
03/git-identity.sh (15 points)
Write a shell script (including the executable bit and the right shebang) that prints your Git identity. That is, your Git username and your Git e-mail.
Do not use the --global flag inside the script to allow for testing.
Update: use git commands, do not try reading from ~/.gitconfig or similar.
Update 2: print username and e-mail each on its own line.
03/tests.txt (30 points)
Explain what the following program does in your own words. If you do not know some of the commands, look up their meaning in their manpages.
Among other things, answer from which directory you would run this script and what (in broad terms only) the command bats __tests/[01][0-9].bats does.
Note that the point of using man is not to learn everything about the command. Instead, use it to get an idea of what the program does: does it download a file from the Internet, does it convert between different types of images, etc.
#!/bin/bash
rm -rf __tests
mkdir __tests
cd __tests
wget https://d3s.mff.cuni.cz/f/teaching/nswi177/tests.tar.gz
tar xzf tests.tar.gz
cd ..
bats __tests/[01][0-9].bats
Deadline: April 5, AoE
Solutions submitted after the deadline will not be accepted.
Note that at the time of the deadline we will download the contents of your project and start the evaluation. Anything uploaded/modified later on will not be taken into account!
Note that we will be looking only at your master branch (unless explicitly specified otherwise); do not forget to merge from other branches if you are using them.