Table of contents
- Standard input and outputs
- Program composition (a.k.a. pipes)
- Program return (exit) code
- More examples
- Shell customization
- Other bits
- Graded tasks
The goal of this lab is to define and thoroughly understand the concepts
of standard input, standard output, and standard error output.
This will allow us to understand program I/O redirection and the composition
of different programs via pipes.
We will also customize our shell
environment a little by investigating command aliases and the .bashrc
file.
This lab defines a lot of concepts that might be new to you. Please, do not skip the more theoretical parts as we will build on top of them in the following labs.
Standard input and outputs
Standard output
Standard output (often shortened to stdout) is the default output that you can
use by calling print("Hello")
if you are in Python, for example.
Stdout is used by the basic output routines in almost every programming language.
Generally, this output has the same API as if you were writing to a file.
Be it print in Python, System.out.print in Java, or printf in C
(where the limitations of the language necessitate the existence of the pair
printf and fprintf).
This output is usually prepared by the language runtime together with the shell and the operating system. Practically, the standard output is printed to the terminal or its equivalent (and when the application is launched graphically, stdout is typically lost).
Note that in Python you can access it explicitly via sys.stdout.
Standard input
Similarly to stdout, almost all languages have access to stdin that represents the default input. By default, this input comes from the keyboard, although usually through the terminal (i.e., stdin is not used in graphical applications for reading keyboard input).
Note that the function input()
that you may have used in your Python
programs is an upgrade on top of stdin because it offers basic editing
functions.
Plain standard input does not support any form of editing.
If you want to access the standard input in Python, you need to use sys.stdin
explicitly.
As one could expect, it uses the file API; hence it is possible to read a line
by calling .readline() on it, or to iterate through all lines.
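For instance, a quick way to see the file-like API in action (the printf data here is just an illustrative sample):

```shell
# Feed two lines into a tiny Python program that reads just the
# first one via the file-like .readline() call on sys.stdin.
printf 'alpha\nbeta\n' | python3 -c 'import sys; print(sys.stdin.readline().strip().upper())'
# prints: ALPHA
```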
In fact, iteration of the following form is quite a common pattern for many Linux utilities (they are usually written in C, but the pattern remains the same).
import sys

for line in sys.stdin:
    ...
Standard I/O redirection
As a technical detail, we mentioned earlier that standard input and outputs are prepared (partially) by the operating system. This also means that it can be changed (i.e., initialized differently) without changing the program. And the program may not even “know” about it.
This is called redirection and it allows the user to specify that the standard output would not go to the screen (terminal) but rather to a file. As stated above, the standard output uses file API and thus it is technically possible from the language runtime standpoint as well.
This redirection has to be done before the program is started and it has to be done by the caller. For us, it means we have to do it in the shell.
It is very simple: at the end of the command we can specify > output.txt
and everything that would normally be printed on the screen goes to
output.txt instead.
One big warning before you start experimenting: the output redirection is a low-level
operation and has no form of undo.
Therefore, if the file you redirect to already exists, it will be overwritten
without any prompting.
And without any easy option to restore the original file content
(and for small files, the restoration is technically impossible for most
file systems used in Linux).
As a precaution, get into the habit of hitting Tab after you specify the filename.
If the file does not exist, the cursor will not move.
If the file already exists, the tab completion routine will insert a space.
As the simplest example, the following two commands will create files
one.txt and two.txt with the words ONE and TWO inside
(including a newline at the end).
echo ONE > one.txt
echo TWO >two.txt
Note that the shell is quite flexible in the use of spaces and both options are valid
(i.e., one.txt
does not have a space as the first character in the filename).
If you know Python’s popen
or a similar call, they also offer the option
to specify which file to use for stdout if you want to do a redirection
in your program
(but only for a new program launched, not inside a running program).
If you recall Lab 02, we mentioned that
the program cat
is used to concatenate files.
With the knowledge of output redirection, it suddenly starts to make more sense as
the (merged) output can be easily stored in a file.
cat one.txt two.txt >merged.txt
The shell also offers an option to append the output to an existing file
using the >>
operator.
Thus, the following command would add UNO as another line into one.txt.
echo UNO >>one.txt
If the file does not exist, it will be created.
For the following example we will need the program tac, which reverses the
order of individual lines but otherwise works like cat. Try this first.
tac one.txt two.txt
If you have executed the commands above, you should see the following:
UNO
ONE
TWO
Try the following and explain what happens (and why) if you execute
tac one.txt two.txt >two.txt
Answer.
Similarly to the redirection of the output, it is possible to redirect the input too.
The syntax is < input.file.
Try it with the longest.py
from the first set of graded tasks (and if you have decided
not to implement it, simply uncomment the I just read line).
Filters
Many utilities in Linux work as so-called filters. They accept the input from stdin and print their output to stdout.
One such example is cut
that can be used to print only certain columns
from the input.
For example, running it as cut -d: -f 1 on /etc/passwd will display a list
of accounts (usernames) on the current machine.
Explain the difference between the following two calls:
cut -d: -f 1 </etc/passwd
cut -d: -f 1 /etc/passwd
Answer.
The above behaviour is quite common for most filters: you can specify the input file explicitly, but when it is missing, the program reads from stdin.
In your own filters, you should also follow this approach: the amount of source
code you need to write is negligible, but it gives the user flexibility in use.
The following snippet demonstrates how you can easily add the same behaviour to
a program called rev.py
that simply reverses the ordering of characters on each
line.
#!/usr/bin/env python3
import sys
def reverse(inp):
for line in inp:
print(line.rstrip('\n')[::-1])
def main():
if len(sys.argv) == 1:
reverse(sys.stdin)
elif len(sys.argv) == 2:
with open(sys.argv[1], "r") as inp:
reverse(inp)
else:
# Handle error
pass
if __name__ == '__main__':
main()
Note that rev
is also a standard Linux program.
Standard error output
Let us return to our files one.txt
and two.txt
and execute the following
command (assuming nonexistent.txt
really does not exist).
cat one.txt nonexistent.txt two.txt
You will notice that cat
printed a warning about the non-existent file,
probably somewhere in the middle of the output
(the word probably is intentional here; we will explain why later).
Now execute the same as above but with output redirection.
cat one.txt nonexistent.txt two.txt >merged.txt
The error was still printed in the terminal, while merged.txt contains the
contents of one.txt and two.txt.
This is caused by the use of so-called standard error output (often just stderr) and it represents a very sensible behaviour. The error message is displayed to the user even when an output redirection is used.
This is possible because each program has – apart from stdout and stdin –
also stderr: a separate stream that by default goes to the terminal (i.e., the same
medium as for stdout) but is not affected by the output redirection.
In Python, it is available as sys.stderr and it is (like sys.stdout)
an opened file.
Extending our rev.py
program, we would use stderr in the following way:
def main():
if len(sys.argv) == 1:
reverse(sys.stdin)
elif len(sys.argv) == 2:
try:
with open(sys.argv[1], "r") as inp:
reverse(inp)
except FileNotFoundError as e:
print("{}: No such file".format(sys.argv[1]), file=sys.stderr)
else:
# Handle error
pass
Standard error output is also often used for printing debugging information or for logging in general. The reasons should be obvious: the output always goes to the screen, thus it is a good way to inform the user about the progress (for example) and it does not change the real output of the program (as stdout is logically a different file).
Under the hood
Technically, opened files have so-called file descriptors that are used when an application communicates with the operating system (recall that file operations have to be done by the operating system). The file descriptor is an integer that serves as an index in a table of opened files that is kept for each process (i.e., running instance of a program).
This number – the file descriptor — is then passed as (typically) the first
parameter in a syscall, such as write
(write
takes at least two arguments –
opened file descriptor and the byte buffer to write – in our examples we will
pass the string directly for simplicity).
Therefore, when your application calls print("Message", file=some_file),
eventually your program calls the operating system as write(3, "Message\n"),
where 3 denotes the file descriptor of the opened file represented by
the some_file handle.
While the above may look like a technical detail, it will help you understand
why the standard error redirection looks the way it does, or why file operations
in most programming languages require opening the file first before writing to
it (i.e., why write_to_file(filename, contents)
is never a primitive operation).
In any Unix-style environment, the file descriptors 0, 1, and 2 are always used for standard
input, standard output, and standard error output, respectively.
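You can verify these numbers yourself; here is a small sketch asking Python for the descriptors of its standard streams:

```shell
# fileno() returns the underlying file descriptor of each stream.
python3 -c 'import sys; print(sys.stdin.fileno(), sys.stdout.fileno(), sys.stderr.fileno())'
# prints: 0 1 2
```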
That is, the call print("Message") in Python eventually ends up calling
write(1, "Message\n"), and a call to print("Error", file=sys.stderr) calls
write(2, "Error\n").
This also explains why the output redirection is even possible – the caller (e.g., the shell) can simply open the file descriptor 1 to point to a different file as a normal application does not touch it at all.
The fact that stdout and stderr are logically different streams (files) also explains the word probably in one of the examples above. Even though they both end in the same physical device (the terminal), they may use a different configuration: typically, the standard output is buffered, i.e., output of your application goes to the screen only when there is enough of it while the standard error is not buffered – it is printed immediately. The reason is probably obvious – error messages should be visible as soon as possible, while normal output might be delayed to improve performance.
Note that the buffering policy can be more sophisticated, but the essential take away is that any output to the stderr is displayed immediately while stdout might be delayed.
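You can observe the buffering difference with a small experiment (a sketch; the exact interleaving may vary by system):

```shell
# stdout goes through the pipe (block-buffered), while stderr goes
# straight to the terminal (unbuffered), so the error line often
# appears before the regular one.
python3 -c 'import sys; print("to stdout"); print("to stderr", file=sys.stderr)' | cat
```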
Redirecting standard error output
To redirect the standard error output, you can use >
again, but this time preceded
by the number 2
(that denotes the stderr file descriptor).
Hence, our cat
example can be transformed to the following form where err.txt
would contain the error message and nothing would be printed on the screen.
cat one.txt nonexistent.txt two.txt >merged.txt 2>err.txt
Notable special files
We already mentioned several important files under /dev/
.
With output redirection, we can actually use some of them right away.
Run cat one.txt and redirect the output to /dev/full and then to /dev/null.
What happened?
Especially /dev/null
is a very useful file as it can be used in any
situation when we are not interested in the output of a program.
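For example, when only the success of a command matters, its output can simply be thrown away (a small sketch):

```shell
# We only care whether grep found the text, not about the matching line,
# so the output is discarded and only the exit code is used.
grep root /etc/passwd >/dev/null && echo "found root in /etc/passwd"
```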
Generic redirection
Shell allows us to redirect outputs quite freely using file descriptor numbers before and after the greater-than sign.
For example, >&2
specifies that the standard output is redirected to a standard
error output.
That may sound weird but consider the following mini-script
(of course, if we knew about curl --silent
we would call it directly,
but for the sake of this example, let us pretend we know only about wget
).
echo "Downloading robots.txt"
wget https://www.mff.cuni.cz/robots.txt -O /tmp/mff_robots.txt
cat /tmp/mff_robots.txt
By the way, what is missing? Answer.
What happens if the user redirects the output of this script (i.e., robots.sh >robots.txt)?
The redirection is inherited by the nested programs
(you can think about it as nested function calls) and thus every output
is redirected.
Here, many messages of wget
are still visible (which does not hurt),
but also the output file contains the progress message.
We probably do not want that.
To fix that, we will redirect the echo
to stderr (so it is displayed
on the screen) like this:
echo "Downloading robots.txt" >&2
Recall that 2
is the file descriptor of the error output and >
redirects
the standard (i.e., normal) output. Hence, we redirect stdout to stderr.
And we probably do not need wget to be so verbose.
We can use -q, or forcefully discard all its stderr output by
appending 2>/dev/null.
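Putting both fixes together, the script might look like this (the URL is the one from the example above):

```shell
#!/bin/bash

# The progress message goes to stderr, so it is not captured
# when the user redirects the script's output to a file.
echo "Downloading robots.txt" >&2
# -q silences wget's own progress messages.
wget -q https://www.mff.cuni.cz/robots.txt -O /tmp/mff_robots.txt
cat /tmp/mff_robots.txt
```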
Sometimes, we want to redirect stdout and stderr to one single file.
In these situations we can use >output.txt 2>output.txt or, even better,
>output.txt 2>&1. However, what about 2>&1 >output.txt? Can we use it as well?
Try it yourself!
Hint.
Program composition (a.k.a. pipes)
We finally move to the area where Linux excels: program composition. In essence, the whole idea behind Unix-family of operating systems is to allow easy composition of various small programs together.
Mostly, the programs that are composed together are filters and they operate on text inputs. These programs do not make any assumptions on the text format and are very generic. Special tools (that are nevertheless part of Linux software repositories) are needed if the input is more structured, such as XML or JSON.
The advantage is that composing the programs is very easy and it is very easy to compose them incrementally too (i.e., add another filter only when the output from the previous ones looks reasonable). This kind of incremental composition is more difficult in normal languages where printing data requires extra commands (here it is printed to the stdout without any extra work).
The disadvantage is that complex compositions can become difficult to read. It is up to the developer to decide when it is time to switch to a better language and process the data there. A typical division of labour is that shell scripts are used to preprocess the data: they are best when you need to combine data from multiple files (such as hundreds of various reports, etc.) or when the data needs to be converted to a reasonable format (e.g. non-structured logs from your web server into a CSV loadable into your favorite spreadsheet software or R). Computing statistics and similar tasks are best left to specialized tools.
Needless to add, Linux offers plenty of tools for statistical computations and plot-drawing utilities that can be controlled from the CLI. Mastering these tools is, unfortunately, out of scope for this course.
Motivation example
As a somewhat artificial example, we will consider the following CSV that can be downloaded from here (it is the example from the lecture with the speeds of USB disks).
disk,duration
/dev/sdb,1008
/dev/sdb,1676
/dev/sdc,1505
/dev/sdc,4115
...
We want to know what was the longest duration of the copying: in other words, the maximum of column two.
Surely, we do not consider using any spreadsheet software as we want to stay in the terminal.
Recall that you have already seen a cut
command that is able to cut only specific
columns from a file.
There is also the command sort
that sorts lines.
Thus our little script could look like this:
#!/bin/bash
cut -d, -f 2 <disk-speeds-data.csv >/tmp/disk_numbers.txt
sort </tmp/disk_numbers.txt
Prepare this script and run it.
It is far from perfect – sort has sorted the lines alphabetically, not by
numeric value.
However, nothing that would not be solvable with man sort
: add -n
and re-execute.
The last line shows the maximum duration of 5769 seconds.
One could object that printing all numbers is redundant and only fills the screen with garbage. That is true, but consider how long such a task would take to implement in your favorite programming language. Perhaps loading this into your office suite would be faster, but we are after a solution that is repeatable and automatable. And we will print only the last line soon.
The big disadvantage is that this solution creates a temporary file.
There are two issues with that.
First of all, for large data sets, the mere existence of the temporary file
can cause issues.
A bit more subtle but much more dangerous problem is that the script
has a hardcoded path for the temporary file.
This script can be executed in two terminals concurrently, resulting in
various unexpected results.
Do not be fooled by the fact that this script is so short that the chances
of really concurrent execution are negligible.
It is a trap that is waiting to spring.
We will later talk about the use of mktemp(1); for this example, we can
do without the temporary file completely.
The first program prints to the standard output and the second one reads from the standard input. In essence, the family of unix systems is built on top of the ability to chain such programs together using pipes.
cut -d, -f 2 <disk-speeds-data.csv | sort
The pipe symbol |
really means that the standard output of the first (left-hand)
process is bound to the standard input of the second process.
The binding is done by the kernel and the data are not written to a disk
but instead passed internally
(using memory buffers but that is a technical detail).
The result is the same, but we escaped the pitfalls of using temporary files and the result is actually even more readable. The readability may not be obvious, but once you get used to it, reading such programs is very straightforward. Each program in the pipeline denotes a type of transformation. These transformations are composed together to produce the final result.
We promised we would print only the biggest number.
The tail utility prints the last ten lines by default, but with -n 1
it will print only the last (one) line.
The piping is not limited to two programs and thus the whole script
would look like
cut '-d,' -f 2 | sort -n | tail -n 1
Note that we have removed the hard-coded path from the script. Instead, the user is supposed to run the script like in the command below. Note that this actually makes the script more flexible: it is easy to test such a script with different inputs and the script can be again used in a bigger pipeline.
get-slowest.sh <disk-speeds-data.csv
Note that for many programs you can specify the use of stdin explicitly
by using -
(dash) as the input filename.
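For example, cat accepts the dash and lets us splice standard input between regular files (a sketch reusing one.txt and two.txt from earlier):

```shell
# The dash is replaced by whatever arrives on stdin, so the output
# is one.txt, then the line MIDDLE, then two.txt.
echo MIDDLE | cat one.txt - two.txt
```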
Program return (exit) code
Execute the following commands:
ls / || echo "ls failed"
ls /nonexistent-filename || echo "ls failed"
The section will explain how it is possible that echo
knew whether ls
was successful or not.
Understanding the following is essential because, together with pipes and standard I/O redirection, exit codes form the basic building blocks of shell scripts.
First of all, we will introduce a syntax for conditional chaining of program calls and then describe how to prepare such programs in Python.
If we want to execute one command only if the previous one succeeded, we
separate them with && (i.e., a logical and).
On the other hand, if we want to execute the second command only if the
first one fails (in other words, execute the first or the second), we
separate them with ||
.
The example with ls
is quite artificial as ls
is quite noisy when
an error occurs.
However, there is also a program called test
that is silent and can be used
to compare numbers or check file properties.
For example, test -d ~/Desktop
checks that ~/Desktop
is a directory.
If you run it, nothing will be printed.
However, in company with &&
or ||
, we can check its result.
test -d .git && echo "We are in a root of a Git project"
test -f README.md || echo "README.md missing"
This could be used as a very primitive branching in our scripts.
In the next lab, we will introduce proper conditional statements, such as if
and while
.
Note that test
is actually a very powerful command – it does not print
anything but can be used to control other programs.
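A few more things test can check (see man test for the full list):

```shell
# Numeric comparison: -gt means "greater than".
test 5 -gt 3 && echo "5 is greater than 3"
# -f checks for an existing regular file.
test -f /etc/passwd && echo "/etc/passwd exists"
```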
It is possible to chain commands: && and || are left-associative and
have the same priority.
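The associativity can be seen with the true and false commands (which simply succeed and fail, respectively):

```shell
# Parsed as (false || echo A) && echo B: prints A, then B.
false || echo A && echo B
# Parsed as (false && echo A) || echo B: prints only B.
false && echo A || echo B
```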
Compare the following commands and how they behave when in a directory
where the file README.md
is or is not present.
test -f README.md || echo "README.md missing" && echo "We have README.md"
test -f README.md && echo "We have README.md" || echo "README.md missing"
Failing fast
There is a caveat regarding pipes and success of commands: the success of a
pipeline is determined by its last command.
Thus, sort /nonexistent | head
is
a successful command. To make a failure of any command fail the (whole) pipeline, you
need to run set -o pipefail
in your script (or shell) before the pipeline.
Compare the behavior of the following two snippets.
sort /nonexistent | head && echo "All is well"
set -o pipefail
sort /nonexistent | head && echo "All is well"
In most cases, you want the second behavior.
In fact, you typically want the whole script to terminate if one of the commands fail. Note that this is different from allowing any of the commands to fail. For example, the following composite command is successful even though one of its components failed.
cat /nonexistent || echo "Oh well"
In other words, you want the script to terminate if the failure occurs where you do not expect it. Like an uncaught exception, for example.
To enable this behavior, you need to call set -e
.
Usually you also want to fail the script when an uninitialized variable is
used: that is enabled by set -u.
Therefore, typically, you want to start your script with the following
trio (we will talk about set -u
later):
set -o pipefail
set -e
set -u
Many commands allow short options
(such as -l or -h that you know from ls)
to be merged like this
(note that -o pipefail has to come last):
set -ueo pipefail
Get into the habit of starting each of your scripts with this command.
UPDATE: pitfalls of pipes (a.k.a. SIGPIPE)
As one of you has pointed out during labs (thanks!),
set -ueo pipefail can cause unwanted and quite unexpected behaviour.
The following script terminates with a hard-to-explain error
(note that the final hexdump is there only to ensure we do not print garbage
from /dev/urandom directly to the terminal).
set -ueo pipefail
cat /dev/urandom | head -n 1 | hexdump || echo "Pipe failed?"
Despite the fact that everything looks fine.
The reason comes from the head command: head has a very smart implementation
that terminates as soon as the first -n lines have been printed.
Reasonable, right?
But that means that the first cat
is suddenly writing to a pipe that no one
reads.
It is like writing to a file that was already closed.
That generates an exception (well, kind of) and cat
terminates with an error.
Because of set -o pipefail
, the whole pipe fails.
The truth is that distinguishing whether the closed pipe is a valid situation
that shall be handled gracefully, or whether it indicates an issue, is impossible.
Therefore cat terminates with an error (after all, someone just closed its
output without letting it know first) and thus the shell has to mark the whole
pipeline as failed.
Solving this is not always easy and several options are available. Each has its pros and cons.
For the 04/pass_gen.sh
we suggest adding || true
or || echo -n ""
to
mark the pipeline as fine.
Or do not enable pipefail
for this script (which is otherwise a very reasonable
behaviour).
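Applied to the example above, the first option would look like this:

```shell
#!/bin/bash
set -ueo pipefail

# cat may be terminated by SIGPIPE once head exits early; appending
# `|| true` marks the whole pipeline as successful anyway.
cat /dev/urandom | head -n 1 | hexdump || true
echo "the script continues"
```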
Announcing success/failure from our programs
Whether a program terminates as successful or as failing depends on its so-called return (or exit) code.
This code is an integer and, unlike in other programming languages, zero denotes success and a non-zero value denotes an error. Why do you think the authors decided that zero (traditionally reserved for false) means success and nonzero (traditionally converted to true) means failure? Hint.
Unless specified otherwise, when your program terminates normally
(i.e., main
reaches the end and no exception is raised), the exit code is
by default zero.
If you want to change this behaviour, you need to specify this exit code
as a parameter to the exit
function.
In Python, it is sys.exit
.
With this knowledge, we admit that the program 01/prime.py is not a good
example of a Linux-style program.
A program for checking primes should denote the result by its exit code
and should take its input from the command line (or stdin, at least), not
from a hard-coded file.
A proper program (this time for checking for even numbers) would look like this:
#!/usr/bin/env python3
import sys
def main():
if len(sys.argv) != 2:
print("Usage: {} NUMBER".format(sys.argv[0]), file=sys.stderr)
sys.exit(2)
number = int(sys.argv[1])
if number % 2 == 0:
sys.exit(0)
else:
sys.exit(1)
if __name__ == '__main__':
main()
Notice the following things.
Exit code 2 denotes bad invocation and is thus distinguishable from odd numbers.
While ||
and &&
are simply checking for zero/non-zero return code, we will need
different exit codes in more complex scripts.
Additionally note that the program prints usage help to stderr (because it is printed
when the program was called in an illegal manner) and it uses argv[0]
to print
its name.
The last thing to notice is that the program is completely silent – no need to print
anything as the result is signalled in the exit code (as with the test
program).
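Assuming the program above is saved as even.py and marked executable (the filename is our choice for this sketch), it composes naturally with the operators from this section:

```shell
# The exit code alone drives the branching; no output parsing needed.
./even.py 10 && echo "10 is even"
./even.py 7 || echo "7 is odd"
```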
More examples
The following examples can be solved either by executing multiple commands or by piping basic shell commands together. To help you find the right program, you can use our Mini-manual.
Create a directory a
and inside of it create a text file --help
containing Lorem Ipsum
.
Print the content of this file and then delete it.
Solution.
Create a directory called b
and inside of it create files called
alpha.txt
and *
.
Then delete the file called * and watch what happens to the file alpha.txt.
Solution.
Print the content of the file /etc/passwd
sorted by the rows.
Solution.
The command getent passwd USERNAME
prints the information about user
account USERNAME
(e.g., intro
) on your machine.
Write a command that prints information about user intro
or a message
This is not NSWI177 disk
if the user does not exist.
Solution.
Print the first and fifth column of the file /etc/passwd
.
Solution.
Count the lines of the file /etc/services
.
Solution.
Print last two lines of the files /etc/passwd
and /etc/group
using
a single command.
Solution.
Recall the file disk-speeds-data.csv
with the disk copying durations.
Compute the sum of all durations.
Solution.
Print information about the last commit; when the script is executed in
a directory that is not part of any Git project, the script shall print
only Not inside a Git repository.
Hint. Solution.
Print the contents of /etc/passwd
and /etc/group
separated by
text Ha ha ha
(i.e., contents of /etc/passwd
,
line with Ha ha ha
and contents of /etc/group
).
Solution.
Shell customization
We already mentioned that you should customize your terminal emulator to be comfortable to use. After all, you will spend at least this semester with it and it should be fun to use.
In this lab, we will show some other options how to make your shell more comfortable to use.
Command aliases
You probably noticed that you execute some commands with the same options
a lot.
One such example could be ls -l -h
that prints a detailed file listing, using
human-readable sizes.
Or perhaps ls -F
to append a slash to the directories.
And probably ls --color
too.
Shell offers to create so-called aliases where you can easily add new commands without creating full-fledged scripts somewhere.
Try executing the following commands to see how a new command l
could be
defined.
alias l='ls -l -h'
l
We can even override the original command; the shell ensures that a recursive rewrite does not happen.
alias ls='ls -F --color=auto'
Note that these two aliases together also ensure that l
will display
filenames in colors.
Some typical aliases that you will probably want to try are the following
ones.
Use a manual page if you are unsure what the alias does.
Note that curl
is used to retrieve contents from a URL and wttr.in
is really
a URL.
By the way, try that command even if you do not plan to use this alias :-).
alias ls='ls -F --color=auto'
alias ll='ls -l'
alias l='ls -l -h'
alias cp='cp -i'
alias mv='mv -i'
alias rm='rm -i'
alias man='man -a'
alias weather='curl wttr.in'
~/.bashrc
Aliases above are nice, but you probably do not want to specify them each time
you launch the shell.
However, most shells in Linux have some kind of file that they execute before
they enter into interactive mode.
Typically, the file resides directly in your home directory and is named after
the shell, ending with rc
(you can remember it as runtime configuration).
For Bash that we are using now (if you are using a different shell, you
probably already know where to find its configuration files), that file is
called ~/.bashrc
.
You have already used it when setting EDITOR for Git, but you can also add
aliases there.
Depending on your distribution, you may already see some aliases or some
other commands there.
Add aliases you like there, save the file and launch a new terminal. Check that the aliases work.
The .bashrc file behaves as a shell script, and you are not limited to
aliases there: virtually any command that you want executed in every
terminal you launch can go there.
Git aliases
Aliases are also provided by Git.
Perhaps you would rather type git ci instead of git commit?
Or git st instead of git status?
The following command does exactly that.
git config --global alias.ci commit
git config --global alias.st status
These commands need to be executed only once, as Git stores the aliases
inside ~/.gitconfig, which is read with each invocation of the git command
(i.e., there is no need to put them into ~/.bashrc).
Other interesting aliases are:
alias.graph 'log --oneline --decorate --all --graph --date=short --color --pretty=format:\"%C(auto)%h%d %s %C(auto,green)(%ad)\"'
alias.ll "log --format='tformat:%C(yellow)%h%Creset %an (%cr) %C(yellow)%s%Creset' --max-count=20 --first-parent"
Other bits
Again, several assorted notes that do not fit into the sections above but are worth knowing.
It is quite common that you need to jump there and back between two directories.
For instance, imagine that you are working with /etc/ssh/
and /var/log/
.
Rather than always typing cd /etc/ssh
and a moment later cd /var/log
,
you can use cd -
to jump to the last visited directory.
Graded tasks
IMPORTANT: all the following tasks must be solved using only
pipes and &&
or ||
command composition.
Use standard shell utilities and do not use shell if
s or while
s
even if you know them (the purpose of these tasks is to exercise your
knowledge of Linux filters).
UPDATE: please note this explanation regarding possible issues with 04/pass_gen.sh.
04/seq.py
(20 points)
Write your own variant of the seq
program in Python.
Note that the tests expect certain hard-coded error messages
as well as proper exit codes.
The program distinguishes 3 types of bad invocation: wrong argument count, invalid number specification and zero step.
Note that for certain inputs, the output shall be empty.
Feel free to run the Linux version of seq
to understand the expected
output (we only require that you print everything in one line).
You can safely assume that the user would never provide big numbers, i.e., the whole list will fit into memory. However, the user may provide wrong arguments or no arguments at all.
Update: your solution shall use a richer set of exit codes (i.e., while the standard seq terminates on all errors with 1, our tests require different ones).
| Exit code | Error message | Note |
|---|---|---|
| 1 | Wrong argument (integer expected). | When an argument is not an integer. |
| 2 | Wrong argument count. | When there are zero arguments, or four or more. |
| 3 | Step cannot be zero. | When STEP is set to zero. |
Clarifications: You do not need to process any of the arguments (like -s, -f, …) and your implementation should only work with integers (otherwise the exit code of 1 would not make much sense).
04/uid_sum.sh (15 points)
The third column in the file /etc/passwd contains a so-called user ID. Write a script that prints the sum of the five highest user IDs. For the purpose of testing, expect that /etc/passwd will be read from stdin in your script (i.e., do not hardcode the path /etc/passwd in your script).
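This is not a full solution, but as a reminder of the relevant building blocks: cut can extract a single colon-separated column and sort -n orders lines numerically. For example, on two hand-made passwd-like lines:

```shell
# Print the numeric third column, sorted numerically
printf 'root:x:0:rest\nnobody:x:65534:rest\n' | cut -d: -f3 | sort -n
# prints 0 and 65534, each on its own line
```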
04/pass_gen.sh (10 points)
UPDATE: please note this explanation regarding possible issues with 04/pass_gen.sh.
File /dev/urandom provides an infinite stream of (pseudo-)random bytes. Use it to create a script generating random passwords. It should print one line in the form Random password: XXXXXXXXXXXXXXXXXXXX, where the XXX... part is a random string of 20 characters. These characters may be digits and lowercase and uppercase letters of the English alphabet. Do not forget the end-of-line character at the end of this line.
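To see that /dev/urandom really is just a stream of (binary) bytes, take a few bytes from it and hex-dump them; the output differs on every run, so none is shown here:

```shell
# Read 8 random bytes and show them as hexadecimal values
head -c 8 /dev/urandom | od -An -tx1
```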
04/run_in_dir.sh (10 points)
Write a script that switches to the directory 01 and counts words in the file input.txt. If the directory does not exist or the file input.txt is not in that directory, the script prints 0 to stdout and no error is displayed. The script always terminates with a zero exit code.
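Counting words is the job of wc -w:

```shell
echo "one two three" | wc -w    # prints 3
```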
04/longest_line.sh (10 points)
What is the length of the longest line among the first 15 lines of README.md in your submission repository when all digits are removed? In other words: remove all digits from README.md, look at the first 15 lines, and find the longest one. The script shall print the length of this line. Note that we will run your script as 04/longest_line.sh, not as ./longest_line.sh.
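Removing all digits is a natural fit for tr with an arbitrary example string:

```shell
echo "l4b 04 t3xt" | tr -d '0-9'    # prints "lb  txt"
```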
04/matrix_slice.sh (15 points)
Assume that you have a text file matrix.txt, which contains a matrix written in a “fancy” notation. You can rely on the format being fixed (with regard to spacing, 3 digits maximum, position of the pipe symbol, etc.).
| 106 179 58 169 32 107 88 116 185 111 |
| 188 50 14 158 115 47 82 152 154 62 |
| 5 125 24 90 187 214 64 36 44 148 |
| 41 190 161 27 16 186 182 205 126 12 |
| 68 105 145 178 191 213 40 48 49 70 |
| 181 9 180 193 95 151 65 206 200 22 |
| 66 67 211 177 1 160 11 97 53 217 |
| 54 142 138 78 143 101 104 201 157 144 |
| 99 26 57 79 15 59 159 76 52 38 |
| 4 119 108 202 109 129 139 56 183 85 |
| 140 218 124 170 30 197 127 7 35 194 |
| 77 3 89 6 196 172 113 46 137 55 |
| 96 80 83 102 189 2 71 28 162 171 |
| 174 60 173 175 135 198 165 37 51 163 |
| 31 130 0 81 133 93 20 128 215 120 |
| 209 73 150 42 63 147 164 141 43 19 |
| 91 45 117 176 123 33 146 208 72 61 |
| 166 94 192 92 168 204 199 134 156 195 |
| 153 212 86 219 132 75 23 121 118 207 |
| 18 98 69 84 131 203 29 34 167 87 |
| 112 21 13 25 122 100 10 103 17 216 |
| 8 149 39 74 114 110 184 210 155 136 |
Write a script that prints a matrix slice containing rows 10, 11, …, 19 and columns 3, 4, …, 7. The output shall contain only the slice, without the pipes. For rows 2, 3 and columns 1, 2, the following would be printed.
188 50
5 125
The script will read input from stdin.
The values of rows 10, 11, …, 19 and columns 3, 4, …, 7 are to be hard-coded in the script.
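One building block you may find handy: sed -n with a line range prints only the selected rows of its input, shown here on an arbitrary four-line example.

```shell
printf 'a\nb\nc\nd\n' | sed -n '2,3p'    # prints b and c, each on its own line
```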
04/row_sum.sh (20 points)
Assume the same matrix format as in the example above. Write a script that prints the sum of each row. We expect that for the following matrix we would get this output.
| 106 179 |
| 188 50 |
| 5 125 |
285
238
130
The script will read input from stdin; there is no limit on the number of columns or rows, but you can rely on the fixed format as explained above.
Deadline: April 12, AoE
Solutions submitted after the deadline will not be accepted.
Note that at the time of the deadline we will download the contents of your project and start the evaluation. Anything uploaded/modified later on will not be taken into account!
Note that we will be looking only at your master branch (unless explicitly specified otherwise), do not forget to merge from other branches if you are using them.