Table of contents
- Standard input and outputs
- Program composition (a.k.a. pipes)
- Program return (exit) code
- More examples
- Shell customization
- Other bits
- Graded tasks
The goal of this lab is to define and thoroughly understand the concepts
of standard input, standard output, and standard error output.
This will allow us to understand program I/O redirection and the composition
of different programs via pipes.
We will also customize our shell
environment a little by investigating command aliases and the .bashrc
file.
This lab defines a lot of concepts that might be new to you. Please, do not skip the more theoretical parts as we will build on top of them in the following labs.
Standard input and outputs
Standard output
Standard output (often shortened to stdout) is the default output that you can
use by calling print("Hello")
if you are in Python, for example.
Stdout is used by the basic output routines in almost every programming language.
Generally, this output has the same API as if you were writing to a file.
Be it print in Python, System.out.print in Java, or printf in C
(where the limitations of the language necessitate the existence of the pair
printf and fprintf).
This output is usually prepared by the language runtime together with the shell and the operating system. Practically, the standard output is printed to the terminal or its equivalent (and when the application is launched graphically, stdout is typically lost).
Note that in Python you can access it explicitly via sys.stdout.
Standard input
Similarly to stdout, almost all languages have access to stdin that represents the default input. By default, this input comes from the keyboard, although usually through the terminal (i.e., stdin is not used in graphical applications for reading keyboard input).
Note that the function input()
that you may have used in your Python
programs is an upgrade on top of stdin because it offers basic editing
functions.
Plain standard input does not support any form of editing.
If you want to access the standard input in Python, you need to use sys.stdin
explicitly.
As one could expect, it uses the file API; hence it is possible to read a line
by calling .readline() on it, or to iterate through all lines.
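For instance, a quick way to see the file-like API in action (the printf data here is just an illustrative sample):

```shell
# Feed two lines into a tiny Python program that reads just the
# first one via the file-like .readline() call on sys.stdin.
printf 'alpha\nbeta\n' | python3 -c 'import sys; print(sys.stdin.readline().strip().upper())'
# prints: ALPHA
```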
In fact, iteration of the following form is quite a common pattern for many Linux utilities (they are usually written in C, but the pattern remains the same).
import sys

for line in sys.stdin:
    ...
Standard I/O redirection
As a technical detail, we mentioned earlier that standard input and outputs are prepared (partially) by the operating system. This also means that it can be changed (i.e., initialized differently) without changing the program. And the program may not even “know” about it.
This is called redirection and it allows the user to specify that the standard output would not go to the screen (terminal) but rather to a file. As stated above, the standard output uses file API and thus it is technically possible from the language runtime standpoint as well.
This redirection has to be done before the program is started and it has to be done by the caller. For us, it means we have to do it in the shell.
It is very simple: at the end of the command we can specify > output.txt
and everything that would normally be printed on the screen goes to
output.txt instead.
One big warning before you start experimenting: the output redirection is a low-level
operation and has no form of undo.
Therefore, if the file you redirect to already exists, it will be overwritten
without any prompting.
And without any easy option to restore the original file content
(and for small files, the restoration is technically impossible for most
file systems used in Linux).
As a precaution, get into the habit of hitting Tab after you specify the filename.
If the file does not exist, the cursor will not move.
If the file already exists, the tab completion routine will insert a space.
As the simplest example, the following two commands will create files
one.txt and two.txt with the words ONE and TWO inside
(including a newline at the end).
echo ONE > one.txt
echo TWO >two.txt
Note that the shell is quite flexible in the use of spaces and both options are valid
(i.e., one.txt
does not have a space as the first character in the filename).
If you know Python’s popen
or a similar call, they also offer the option
to specify which file to use for stdout if you want to do a redirection
in your program
(but only for a new program launched, not inside a running program).
If you recall Lab 02, we mentioned that
the program cat
is used to concatenate files.
With the knowledge of output redirection, it suddenly starts to make more sense as
the (merged) output can be easily stored in a file.
cat one.txt two.txt >merged.txt
The shell also offers an option to append the output to an existing file
using the >>
operator.
Thus, the following command would add UNO as another line into one.txt.
echo UNO >>one.txt
If the file does not exist, it will be created.
For the following example we will need the program tac, which reverses the
order of individual lines but otherwise works like cat. Try this first.
tac one.txt two.txt
If you have executed the commands above, you should see the following:
UNO
ONE
TWO
Try the following and explain what happens (and why) if you execute
tac one.txt two.txt >two.txt
Answer.
Similarly to the redirection of the output, it is possible to redirect the input too.
The syntax is < input.file.
Try it with the longest.py
from the first set of graded tasks (and if you have decided
not to implement it, simply uncomment the I just read line).
Filters
Many utilities in Linux work as so-called filters. They accept the input from stdin and print their output to stdout.
One such example is cut
that can be used to print only certain columns
from the input.
For example, running it as cut -d: -f 1 on /etc/passwd will display a list
of accounts (usernames) on the current machine.
Explain the difference between the following two calls:
cut -d: -f 1 </etc/passwd
cut -d: -f 1 /etc/passwd
Answer.
The above behaviour is quite common for most filters: you can specify the input file explicitly, but when it is missing, the program reads from stdin.
In your own filters, you should also follow this approach: the amount of source
code you need to write is negligible, but it gives the user flexibility in use.
The following snippet demonstrates how you can easily add the same behaviour to
a program called rev.py
that simply reverses the ordering of characters on each
line.
#!/usr/bin/env python3
import sys
def reverse(inp):
for line in inp:
print(line.rstrip('\n')[::-1])
def main():
if len(sys.argv) == 1:
reverse(sys.stdin)
elif len(sys.argv) == 2:
with open(sys.argv[1], "r") as inp:
reverse(inp)
else:
# Handle error
pass
if __name__ == '__main__':
main()
Note that rev
is also a standard Linux program.
Standard error output
Let us return to our files one.txt
and two.txt
and execute the following
command (assuming nonexistent.txt
really does not exist).
cat one.txt nonexistent.txt two.txt
You will notice that cat
printed a warning about the non-existent file,
probably somewhere in the middle of the output
(the word probably is intentional here; we will explain why later).
Now execute the same as above but with output redirection.
cat one.txt nonexistent.txt two.txt >merged.txt
The error was still printed in the terminal, while merged.txt contains the
contents of one.txt and two.txt.
This is caused by the use of so-called standard error output (often just stderr) and it represents a very sensible behaviour. The error message is displayed to the user even when an output redirection is used.
This is possible because each program has – apart from stdout and stdin –
also stderr: a separate stream that by default goes to the terminal (i.e., the same
medium as for stdout) but is not affected by the output redirection.
In Python, it is available as sys.stderr and it is (like sys.stdout)
an opened file.
Extending our rev.py
program, we would use stderr in the following way:
def main():
if len(sys.argv) == 1:
reverse(sys.stdin)
elif len(sys.argv) == 2:
try:
with open(sys.argv[1], "r") as inp:
reverse(inp)
except FileNotFoundError as e:
print("{}: No such file".format(sys.argv[1]), file=sys.stderr)
else:
# Handle error
pass
Standard error output is also often used for printing debugging information or for logging in general. The reasons should be obvious: the output always goes to the screen, thus it is a good way to inform the user about the progress (for example) and it does not change the real output of the program (as stdout is logically a different file).
Under the hood
Technically, opened files have so-called file descriptors that are used when an application communicates with the operating system (recall that file operations have to be done by the operating system). The file descriptor is an integer that serves as an index in a table of opened files that is kept for each process (i.e., running instance of a program).
This number – the file descriptor — is then passed as (typically) the first
parameter in a syscall, such as write
(write
takes at least two arguments –
opened file descriptor and the byte buffer to write – in our examples we will
pass the string directly for simplicity).
Therefore, when your application calls print("Message", file=some_file),
eventually your program calls the operating system as write(3, "Message\n"),
where 3 denotes the file descriptor of the opened file represented by
the some_file handle.
While the above may look like a technical detail, it will help you understand
why the standard error redirection looks the way it does, or why file operations
in most programming languages require opening the file first before writing to
it (i.e., why write_to_file(filename, contents)
is never a primitive operation).
In any Unix-style environment, the file descriptors 0, 1, and 2 are always used for standard
input, standard output, and standard error output, respectively.
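You can verify these numbers yourself; here is a small sketch asking Python for the descriptors of its standard streams:

```shell
# fileno() returns the underlying file descriptor of each stream.
python3 -c 'import sys; print(sys.stdin.fileno(), sys.stdout.fileno(), sys.stderr.fileno())'
# prints: 0 1 2
```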
That is, the call print("Message") in Python eventually ends up calling
write(1, "Message\n"), and a call to print("Error", file=sys.stderr) calls
write(2, "Error\n").
This also explains why the output redirection is even possible – the caller (e.g., the shell) can simply open the file descriptor 1 to point to a different file as a normal application does not touch it at all.
The fact that stdout and stderr are logically different streams (files) also explains the word probably in one of the examples above. Even though they both end in the same physical device (the terminal), they may use a different configuration: typically, the standard output is buffered, i.e., output of your application goes to the screen only when there is enough of it while the standard error is not buffered – it is printed immediately. The reason is probably obvious – error messages should be visible as soon as possible, while normal output might be delayed to improve performance.
Note that the buffering policy can be more sophisticated, but the essential take away is that any output to the stderr is displayed immediately while stdout might be delayed.
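You can observe the buffering difference with a small experiment (a sketch; the exact interleaving may vary by system):

```shell
# stdout goes through the pipe (block-buffered), while stderr goes
# straight to the terminal (unbuffered), so the error line often
# appears before the regular one.
python3 -c 'import sys; print("to stdout"); print("to stderr", file=sys.stderr)' | cat
```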
Redirecting standard error output
To redirect the standard error output, you can use >
again, but this time preceded
by the number 2
(that denotes the stderr file descriptor).
Hence, our cat
example can be transformed to the following form where err.txt
would contain the error message and nothing would be printed on the screen.
cat one.txt nonexistent.txt two.txt >merged.txt 2>err.txt
Notable special files
We already mentioned several important files under /dev/
.
With output redirection, we can actually use some of them right away.
Run cat one.txt and redirect the output to /dev/full and then to /dev/null.
What happened?
Especially /dev/null
is a very useful file as it can be used in any
situation when we are not interested in the output of a program.
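For example, when only the success of a command matters, its output can simply be thrown away (a small sketch):

```shell
# We only care whether grep found the text, not about the matching line,
# so the output is discarded and only the exit code is used.
grep root /etc/passwd >/dev/null && echo "found root in /etc/passwd"
```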
Generic redirection
Shell allows us to redirect outputs quite freely using file descriptor numbers before and after the greater-than sign.
For example, >&2
specifies that the standard output is redirected to a standard
error output.
That may sound weird but consider the following mini-script
(of course, if we knew about curl --silent
we would call it directly,
but for the sake of this example, let us pretend we know only about wget
).
echo "Downloading robots.txt"
wget https://www.mff.cuni.cz/robots.txt -O /tmp/mff_robots.txt
cat /tmp/mff_robots.txt
By the way, what is missing? Answer.
What happens if the user redirects the output of this script (i.e., robots.sh >robots.txt)?
The redirection is inherited by the nested programs
(you can think about it as nested function calls) and thus every output
is redirected.
Here, many messages of wget
are still visible (which does not hurt),
but also the output file contains the progress message.
We probably do not want that.
To fix that, we will redirect the echo
to stderr (so it is displayed
on the screen) like this:
echo "Downloading robots.txt" >&2
Recall that 2
is the file descriptor of the error output and >
redirects
the standard (i.e., normal) output. Hence, we redirect stdout to stderr.
And we probably do not need wget to be so verbose.
We can use -q, or forcefully discard all its stderr output by
appending 2>/dev/null.
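Putting both fixes together, the script might look like this (the URL is the one from the example above):

```shell
#!/bin/bash

# The progress message goes to stderr, so it is not captured
# when the user redirects the script's output to a file.
echo "Downloading robots.txt" >&2
# -q silences wget's own progress messages.
wget -q https://www.mff.cuni.cz/robots.txt -O /tmp/mff_robots.txt
cat /tmp/mff_robots.txt
```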
Sometimes, we want to redirect stdout and stderr to one single file.
In these situations we can use >output.txt 2>output.txt or, even better,
>output.txt 2>&1. However, what about 2>&1 >output.txt? Can we use it as well?
Try it yourself!
Hint.
Program composition (a.k.a. pipes)
We finally move to the area where Linux excels: program composition. In essence, the whole idea behind Unix-family of operating systems is to allow easy composition of various small programs together.
Mostly, the programs that are composed together are filters and they operate on text inputs. These programs do not make any assumptions on the text format and are very generic. Special tools (that are nevertheless part of Linux software repositories) are needed if the input is more structured, such as XML or JSON.
The advantage is that composing the programs is very easy and it is very easy to compose them incrementally too (i.e., add another filter only when the output from the previous ones looks reasonable). This kind of incremental composition is more difficult in normal languages where printing data requires extra commands (here it is printed to the stdout without any extra work).
The disadvantage is that complex compositions can become difficult to read. It is up to the developer to decide when it is time to switch to a better language and process the data there. A typical division of labour is that shell scripts are used to preprocess the data: they are best when you need to combine data from multiple files (such as hundreds of various reports, etc.) or when the data needs to be converted to a reasonable format (e.g. non-structured logs from your web server into a CSV loadable into your favorite spreadsheet software or R). Computing statistics and similar tasks are best left to specialized tools.
Needless to add, Linux offers plenty of tools for statistical computations and plot-drawing utilities that can be controlled from the CLI. Mastering these tools is, unfortunately, out of scope for this course.
Motivation example
As a somewhat artificial example, we will consider the following CSV that can be downloaded from here (it is the example from the lecture with the speeds of USB disks).
disk,duration
/dev/sdb,1008
/dev/sdb,1676
/dev/sdc,1505
/dev/sdc,4115
...
We want to know what was the longest duration of the copying: in other words, the maximum of column two.
Surely, we do not consider using any spreadsheet software as we want to stay in the terminal.
Recall that you have already seen a cut
command that is able to cut only specific
columns from a file.
There is also the command sort
that sorts lines.
Thus our little script could look like this:
#!/bin/bash
cut -d, -f 2 <disk-speeds-data.csv >/tmp/disk_numbers.txt
sort </tmp/disk_numbers.txt
Prepare this script and run it.
It is far from perfect – sort has sorted the lines alphabetically, not by
numeric value.
However, nothing that would not be solvable with man sort
: add -n
and re-execute.
The last line shows the maximum duration of 5769 seconds.
One could object that printing all numbers is redundant and only fills the screen with garbage. That is true, but consider how long such a task would take to implement in your favorite programming language. Perhaps loading this into your office suite would be faster, but we are after a solution that is repeatable and automatable. And we will print only the last line soon.
The big disadvantage is that this solution creates a temporary file.
There are two issues with that.
First of all, for large data sets, the mere existence of the temporary file
can cause issues.
A bit more subtle but much more dangerous problem is that the script
has a hardcoded path for the temporary file.
This script can be executed in two terminals concurrently, resulting in
various unexpected results.
Do not be fooled by the fact that this script is so short that the chances
of really concurrent execution are negligible.
It is a trap that is waiting to spring.
We will later talk about the use of mktemp(1); for this example, we can
do without the temporary file completely.
The first program prints to the standard output and the second one reads from the standard input. In essence, the family of unix systems is built on top of the ability to chain such programs together using pipes.
cut -d, -f 2 <disk-speeds-data.csv | sort
The pipe symbol |
really means that the standard output of the first (left-hand)
process is bound to the standard input of the second process.
The binding is done by the kernel and the data are not written to a disk
but instead passed internally
(using memory buffers but that is a technical detail).
The result is the same, but we escaped the pitfalls of using temporary files and the result is actually even more readable. The readability may not be obvious, but once you get used to it, reading such programs is very straightforward. Each program in the pipeline denotes a type of transformation. These transformations are composed together to produce the final result.
We promised we would print only the biggest number.
The tail utility prints the last ten lines by default, but with -n 1
it will print only the last (one) line.
The piping is not limited to two programs and thus the whole script
would look like
cut '-d,' -f 2 | sort -n | tail -n 1
Note that we have removed the hard-coded path from the script. Instead, the user is supposed to run the script like in the command below. Note that this actually makes the script more flexible: it is easy to test such a script with different inputs and the script can be again used in a bigger pipeline.
get-slowest.sh <disk-speeds-data.csv
Note that for many programs you can specify the use of stdin explicitly
by using -
(dash) as the input filename.
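For example, cat accepts the dash and lets us splice standard input between regular files (a sketch reusing one.txt and two.txt from earlier):

```shell
# The dash is replaced by whatever arrives on stdin, so the output
# is one.txt, then the line MIDDLE, then two.txt.
echo MIDDLE | cat one.txt - two.txt
```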
Program return (exit) code
Execute the following commands:
ls / || echo "ls failed"
ls /nonexistent-filename || echo "ls failed"
The section will explain how it is possible that echo
knew whether ls
was successful or not.
Understanding the following is essential because, together with pipes and standard I/O redirection, exit codes form the basic building blocks of shell scripts.
First of all, we will introduce a syntax for conditional chaining of program calls and then describe how to prepare such programs in Python.
If we want to execute one command only if the previous one succeeded, we
separate them with && (i.e., a logical and).
On the other hand, if we want to execute the second command only if the
first one fails (in other words, execute the first or the second), we
separate them with ||
.
The example with ls
is quite artificial as ls
is quite noisy when
an error occurs.
However, there is also a program called test
that is silent and can be used
to compare numbers or check file properties.
For example, test -d ~/Desktop
checks that ~/Desktop
is a directory.
If you run it, nothing will be printed.
However, in company with &&
or ||
, we can check its result.
test -d .git && echo "We are in a root of a Git project"
test -f README.md || echo "README.md missing"
This could be used as a very primitive branching in our scripts.
In the next lab, we will introduce proper conditional statements, such as if
and while
.
Note that test
is actually a very powerful command – it does not print
anything but can be used to control other programs.
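A few more things test can check (see man test for the full list):

```shell
# Numeric comparison: -gt means "greater than".
test 5 -gt 3 && echo "5 is greater than 3"
# -f checks for an existing regular file.
test -f /etc/passwd && echo "/etc/passwd exists"
```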
It is possible to chain commands: && and || are left-associative and
have the same priority.
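The associativity can be seen with the true and false commands (which simply succeed and fail, respectively):

```shell
# Parsed as (false || echo A) && echo B: prints A, then B.
false || echo A && echo B
# Parsed as (false && echo A) || echo B: prints only B.
false && echo A || echo B
```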
Compare the following commands and how they behave when in a directory
where the file README.md
is or is not present.
test -f README.md || echo "README.md missing" && echo "We have README.md"
test -f README.md && echo "We have README.md" || echo "README.md missing"
Failing fast
There is a caveat regarding pipes and success of commands: the success of a
pipeline is determined by its last command.
Thus, sort /nonexistent | head
is
a successful command. To make a failure of any command fail the (whole) pipeline, you
need to run set -o pipefail
in your script (or shell) before the pipeline.
Compare the behavior of the following two snippets.
sort /nonexistent | head && echo "All is well"
set -o pipefail
sort /nonexistent | head && echo "All is well"
In most cases, you want the second behavior.
In fact, you typically want the whole script to terminate if one of the commands fail. Note that this is different from allowing any of the commands to fail. For example, the following composite command is successful even though one of its components failed.
cat /nonexistent || echo "Oh well"
In other words, you want the script to terminate if the failure occurs where you do not expect it. Like an uncaught exception, for example.
To enable this behavior, you need to call set -e
.
Usually you also want to fail the script when an uninitialized variable is
used: that is enabled by set -u.
Therefore, typically, you want to start your script with the following
trio (we will talk about set -u
later):
set -o pipefail
set -e
set -u
Many commands allow short options
(such as -l or -h that you know from ls)
to be merged like this
(note that -o pipefail has to come last):
set -ueo pipefail
Get into the habit of starting each of your scripts with this command.
UPDATE: pitfalls of pipes (a.k.a. SIGPIPE)
As one of you has pointed out during labs (thanks!),
set -ueo pipefail can cause unwanted and quite unexpected behaviour.
The following script terminates with a hard-to-explain error
(note that the final hexdump is there only to ensure we do not print garbage
from /dev/urandom directly to the terminal).
set -ueo pipefail
cat /dev/urandom | head -n 1 | hexdump || echo "Pipe failed?"
Despite the fact that everything looks fine.
The reason comes from the head command: head has a very smart implementation
that terminates as soon as the first -n lines have been printed.
Reasonable, right?
But that means that the first cat
is suddenly writing to a pipe that no one
reads.
It is like writing to a file that was already closed.
That generates an exception (well, kind of) and cat
terminates with an error.
Because of set -o pipefail
, the whole pipe fails.
The truth is that distinguishing whether the closed pipe is a valid situation
that shall be handled gracefully, or whether it indicates an issue, is impossible.
Therefore cat terminates with an error (after all, someone just closed its
output without letting it know first) and thus the shell has to mark the whole
pipeline as failed.
Solving this is not always easy and several options are available. Each has its pros and cons.
For the 04/pass_gen.sh
we suggest adding || true
or || echo -n ""
to
mark the pipeline as fine.
Or do not enable pipefail
for this script (which is otherwise a very reasonable
behaviour).
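Applied to the example above, the first option would look like this:

```shell
#!/bin/bash
set -ueo pipefail

# cat may be terminated by SIGPIPE once head exits early; appending
# `|| true` marks the whole pipeline as successful anyway.
cat /dev/urandom | head -n 1 | hexdump || true
echo "the script continues"
```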
Announcing success/failure from our programs
Whether a program terminates as successful or as failing depends on its so-called return (or exit) code.
This code is an integer and, unlike in other programming languages, zero denotes success and a non-zero value denotes an error. Why do you think the authors decided that zero (traditionally reserved for false) means success and nonzero (traditionally converted to true) means failure? Hint.
Unless specified otherwise, when your program terminates normally
(i.e., main
reaches the end and no exception is raised), the exit code is
by default zero.
If you want to change this behaviour, you need to specify this exit code
as a parameter to the exit
function.
In Python, it is sys.exit
.
With this knowledge, we admit that the program 01/prime.py is not a good
example of a Linux-style program.
A program for checking primes should denote the result by its exit code
and should take its input from the command line (or stdin, at least), not
from a hard-coded file.
A proper program (this time for checking for even numbers) would look like this:
#!/usr/bin/env python3
import sys
def main():
if len(sys.argv) != 2:
print("Usage: {} NUMBER".format(sys.argv[0]), file=sys.stderr)
sys.exit(2)
number = int(sys.argv[1])
if number % 2 == 0:
sys.exit(0)
else:
sys.exit(1)
if __name__ == '__main__':
main()
Notice the following things.
Exit code 2 denotes bad invocation and is thus distinguishable from odd numbers.
While ||
and &&
are simply checking for zero/non-zero return code, we will need
different exit codes in more complex scripts.
Additionally note that the program prints usage help to stderr (because it is printed
when the program was called in an illegal manner) and it uses argv[0]
to print
its name.
The last thing to notice is that the program is completely silent – no need to print
anything as the result is signalled in the exit code (as with the test
program).
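Assuming the program above is saved as even.py and marked executable (the filename is our choice for this sketch), it composes naturally with the operators from this section:

```shell
# The exit code alone drives the branching; no output parsing needed.
./even.py 10 && echo "10 is even"
./even.py 7 || echo "7 is odd"
```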
More examples
The following examples can be solved either by executing multiple commands or by piping basic shell commands together. To help you find the right program, you can use our Mini-manual.
Create a directory a
and inside of it create a text file --help
containing Lorem Ipsum
.
Print the content of this file and then delete it.
Solution.
Create a directory called b
and inside of it create files called
alpha.txt
and *
.
Then delete the file called * and watch what happens to the file alpha.txt.
Solution.
Print the content of the file /etc/passwd
sorted by the rows.
Solution.
The command getent passwd USERNAME
prints the information about user
account USERNAME
(e.g., intro
) on your machine.
Write a command that prints information about user intro
or a message
This is not NSWI177 disk
if the user does not exist.
Solution.
Print the first and fifth column of the file /etc/passwd
.
Solution.
Count the lines of the file /etc/services
.
Solution.
Print last two lines of the files /etc/passwd
and /etc/group
using
a single command.
Solution.
Recall the file disk-speeds-data.csv
with the disk copying durations.
Compute the sum of all durations.
Solution.
Print information about the last commit; when the script is executed in
a directory that is not part of any Git project, the script shall print
only Not inside a Git repository.
Hint. Solution.
Print the contents of /etc/passwd
and /etc/group
separated by
text Ha ha ha
(i.e., contents of /etc/passwd
,
line with Ha ha ha
and contents of /etc/group
).
Solution.
Shell customization
We already mentioned that you should customize your terminal emulator to be comfortable to use. After all, you will spend at least this semester with it and it should be fun to use.
In this lab, we will show some other options how to make your shell more comfortable to use.
Command aliases
You probably noticed that you execute some commands with the same options
a lot.
One such example could be ls -l -h
that prints a detailed file listing, using
human-readable sizes.
Or perhaps ls -F
to append a slash to the directories.
And probably ls --color
too.
Shell offers to create so-called aliases where you can easily add new commands without creating full-fledged scripts somewhere.
Try executing the following commands to see how a new command l
could be
defined.
alias l='ls -l -h'
l
We can even override the original command; the shell ensures that a recursive rewrite does not happen.
alias ls='ls -F --color=auto'
Note that these two aliases together also ensure that l
will display
filenames in colors.
Some typical aliases that you will probably want to try are the following
ones.
Use a manual page if you are unsure what the alias does.
Note that curl
is used to retrieve contents from a URL and wttr.in
is really
a URL.
By the way, try that command even if you do not plan to use this alias :-).
alias ls='ls -F --color=auto'
alias ll='ls -l'
alias l='ls -l -h'
alias cp='cp -i'
alias mv='mv -i'
alias rm='rm -i'
alias man='man -a'
alias weather='curl wttr.in'
~/.bashrc
Aliases above are nice, but you probably do not want to specify them each time
you launch the shell.
However, most shells in Linux have some kind of file that they execute before
they enter into interactive mode.
Typically, the file resides directly in your home directory and is named after
the shell, ending with rc
(you can remember it as runtime configuration).
For Bash that we are using now (if you are using a different shell, you
probably already know where to find its configuration files), that file is
called ~/.bashrc
.
You have already used it when setting EDITOR for Git, but you can also add
aliases there.
Depending on your distribution, you may already see some aliases or some
other commands there.
Add aliases you like there, save the file and launch a new terminal. Check that the aliases work.
The .bashrc file behaves as a shell script, and you are not limited to
aliases there: virtually any command that you want executed in every
terminal you launch can go there.
Git aliases
Aliases are also provided by Git.
Perhaps you would rather type git ci instead of git commit?
Or git st instead of git status?
The following command does exactly that.
git config --global alias.ci commit
git config --global alias.st status
These commands need to be executed only once, as Git stores the aliases
inside ~/.gitconfig, which is read with each invocation of the git command
(i.e., there is no need to put them into ~/.bashrc).
Other interesting aliases are:
alias.graph 'log --oneline --decorate --all --graph --date=short --color --pretty=format:\"%C(auto)%h%d %s %C(auto,green)(%ad)\"'
alias.ll "log --format='tformat:%C(yellow)%h%Creset %an (%cr) %C(yellow)%s%Creset' --max-count=20 --first-parent"
Other bits
Again, several assorted notes that do not fit into the sections above but are worth knowing.
It is quite common that you need to jump there and back between two directories.
For instance, imagine that you are working with /etc/ssh/
and /var/log/
.
Rather than always typing cd /etc/ssh
and a moment later cd /var/log
,
you can use cd -
to jump to the last visited directory.
Graded tasks
IMPORTANT: all the following tasks must be solved using only
pipes and &&
or ||
command composition.
Use standard shell utilities and do not use shell if
s or while
s
even if you know them (the purpose of these tasks is to exercise your
knowledge of Linux filters).
UPDATE: please note this explanation regarding possible issues with 04/pass_gen.sh.
04/seq.py
(20 points)
Write your own variant of the seq
program in Python.
Note that the tests expect certain hard-coded error messages
as well as proper exit codes.
The program distinguishes 3 types of bad invocation: wrong argument count, invalid number specification and zero step.
Note that for certain inputs, the output shall be empty.
Feel free to run the Linux version of seq
to understand the expected
output (we only require that you print everything in one line).
You can safely assume that the user would never provide big numbers, i.e., the whole list will fit into memory. However, the user may provide wrong arguments or no arguments at all.
Update: your solution shall use a richer set of exit codes (i.e., while the standard seq terminates on all errors with 1, our tests require different ones).
| Exit code | Error message | Note |
|---|---|---|
| 1 | Wrong argument (integer expected). | When an argument is not an integer. |
| 2 | Wrong argument count. | When there are zero arguments, or four or more. |
| 3 | Step cannot be zero. | When STEP is set to zero. |
Clarifications: You do not need to process any of the arguments (like -s, -f, …) and your implementation should only work with integers (otherwise the exit code of 1 would not make much sense).
04/uid_sum.sh (15 points)
The third column in the file /etc/passwd contains a so-called user ID. Write a script that prints the sum of the five highest user IDs. For the purpose of testing, expect that /etc/passwd will be read from stdin in your script (i.e., do not hardcode the path /etc/passwd in your script).
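This is not a full solution, but as a reminder of the relevant building blocks: cut can extract a single colon-separated column and sort -n orders lines numerically. For example, on two hand-made passwd-like lines:

```shell
# Print the numeric third column, sorted numerically
printf 'root:x:0:rest\nnobody:x:65534:rest\n' | cut -d: -f3 | sort -n
# prints 0 and 65534, each on its own line
```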
04/pass_gen.sh (10 points)
UPDATE: please note this explanation regarding possible issues with 04/pass_gen.sh.
File /dev/urandom provides an infinite stream of (pseudo-)random bytes. Use it to create a script generating random passwords. It should print one line in the form Random password: XXXXXXXXXXXXXXXXXXXX, where the XXX... part is a random string of 20 characters. These characters may be digits and lowercase and uppercase letters of the English alphabet. Do not forget the end-of-line character at the end of this line.
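To see that /dev/urandom really is just a stream of (binary) bytes, take a few bytes from it and hex-dump them; the output differs on every run, so none is shown here:

```shell
# Read 8 random bytes and show them as hexadecimal values
head -c 8 /dev/urandom | od -An -tx1
```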
04/run_in_dir.sh (10 points)
Write a script that switches to the directory 01 and counts words in the file input.txt. If the directory does not exist or the file input.txt is not in that directory, the script prints 0 to stdout and no error is displayed. The script always terminates with a zero exit code.
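Counting words is the job of wc -w:

```shell
echo "one two three" | wc -w    # prints 3
```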
04/longest_line.sh (10 points)
What is the length of the longest line among the first 15 lines of README.md in your submission repository when all digits are removed? In other words: remove all digits from README.md, look at the first 15 lines, and find the longest one. The script shall print the length of this line. Note that we will run your script as 04/longest_line.sh, not as ./longest_line.sh.
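Removing all digits is a natural fit for tr with an arbitrary example string:

```shell
echo "l4b 04 t3xt" | tr -d '0-9'    # prints "lb  txt"
```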
04/matrix_slice.sh (15 points)
Assume that you have a text file matrix.txt, which contains a matrix written in a “fancy” notation. You can rely on the format being fixed (with regard to spacing, 3 digits maximum, position of the pipe symbol, etc.).
| 106 179 58 169 32 107 88 116 185 111 |
| 188 50 14 158 115 47 82 152 154 62 |
| 5 125 24 90 187 214 64 36 44 148 |
| 41 190 161 27 16 186 182 205 126 12 |
| 68 105 145 178 191 213 40 48 49 70 |
| 181 9 180 193 95 151 65 206 200 22 |
| 66 67 211 177 1 160 11 97 53 217 |
| 54 142 138 78 143 101 104 201 157 144 |
| 99 26 57 79 15 59 159 76 52 38 |
| 4 119 108 202 109 129 139 56 183 85 |
| 140 218 124 170 30 197 127 7 35 194 |
| 77 3 89 6 196 172 113 46 137 55 |
| 96 80 83 102 189 2 71 28 162 171 |
| 174 60 173 175 135 198 165 37 51 163 |
| 31 130 0 81 133 93 20 128 215 120 |
| 209 73 150 42 63 147 164 141 43 19 |
| 91 45 117 176 123 33 146 208 72 61 |
| 166 94 192 92 168 204 199 134 156 195 |
| 153 212 86 219 132 75 23 121 118 207 |
| 18 98 69 84 131 203 29 34 167 87 |
| 112 21 13 25 122 100 10 103 17 216 |
| 8 149 39 74 114 110 184 210 155 136 |
Write a script that prints a matrix slice containing rows 10, 11, …, 19 and columns 3, 4, …, 7. The output shall contain only the slice, without the pipes. For rows 2, 3 and columns 1, 2, the following would be printed.
188 50
5 125
The script will read input from stdin.
The values of rows 10, 11, …, 19 and columns 3, 4, …, 7 are to be hard-coded in the script.
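One building block you may find handy: sed -n with a line range prints only the selected rows of its input, shown here on an arbitrary four-line example.

```shell
printf 'a\nb\nc\nd\n' | sed -n '2,3p'    # prints b and c, each on its own line
```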
04/row_sum.sh (20 points)
Assume the same matrix format as in the example above. Write a script that prints the sum of each row. We expect that for the following matrix we would get this output.
| 106 179 |
| 188 50 |
| 5 125 |
285
238
130
The script will read input from stdin; there is no limit on the number of columns or rows, but you can rely on the fixed format as explained above.
Deadline: April 12, AoE
Solutions submitted after the deadline will not be accepted.
Note that at the time of the deadline we will download the contents of your project and start the evaluation. Anything uploaded/modified later on will not be taken into account!
Note that we will be looking only at your master branch (unless explicitly specified otherwise), do not forget to merge from other branches if you are using them.