Lab #11 (May 4 – May 8)

Before class

Topic

  • Build systems.

Exercises

1.
As usual, update your clone of the teaching/nswi177/2020-summer/upstream/examples repository. We will be using examples from the make subdirectory.
2.

Before diving into build systems, we will play a little bit with Pandoc. Pandoc is a universal document converter that can convert between various formats, including HTML, Markdown, DocBook, LaTeX, Word, LibreOffice or PDF.

Please install it on your machine first (the package name is pandoc).

Start with running

pandoc index.md

As you can see, the output is the Markdown file converted to HTML, though without the HTML header.

If you add --standalone, it generates a full HTML page. Let’s try it (both invocations will have the same end result).

pandoc --standalone index.md >index.html
pandoc --standalone -o index.html index.md

Try opening index.html in your web browser too.

As mentioned, Pandoc can create OpenDocument files too (the format used mostly by the OpenOffice/LibreOffice suite). And it is just as easy.

pandoc -o index.odt index.md

We omitted --standalone here because Pandoc enables it automatically for ODT output; the option matters mainly for text-based formats such as HTML.

Install OpenOffice/LibreOffice to check what the output looks like.

3.
Convert also rules.md to HTML and ODT.
4.

As a side note, did you know that LibreOffice can be used from the command line too? For example, we can ask LibreOffice to convert a document to PDF with the following command.

soffice --headless --convert-to pdf rules.odt

The --headless prevents opening any GUI and --convert-to is self-explanatory.

So, with three commands we are able to create an HTML page and a PDF from a single source. Quite a neat trick if you need to submit printed documentation and you only have HTML, for example.

5.

Let’s get back to Pandoc. Without any other options, it uses its own default template for the final HTML. But we can change this template too. Yes, this is similar to the SSG example you already know but we are attacking the problem from a different angle here. Note that both approaches can be combined.

Open template.html. It looks very much like the Jinja one you have seen in Task 04 but instead of {{ there are things in dollars. If you have not yet tackled Task 04, the template is normal HTML with placeholders enclosed in dollars. When the template is expanded (or rendered), the parts between dollars are replaced with the actual content.
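
To make the idea of expansion concrete, here is a minimal sketch that imitates it with sed. It is only an illustration and not how Pandoc works internally; the one-line template and the title value are made up for this example, and real Pandoc templates can do more than plain replacement.

```shell
# A one-line "template" with a placeholder enclosed in dollars (made-up content).
template='<h1>$title$</h1>'

# Replace the placeholder with an actual value, imitating what template expansion does.
expanded=$(printf '%s\n' "$template" | sed 's/\$title\$/Game rules/')

printf '%s\n' "$expanded"
```

The single quotes matter here: they stop the shell from treating $title as its own variable.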

Let’s try it with Pandoc.

pandoc --template template.html rules.md >out/rules.html

Check what the output looks like. Notice that we create the result in a separate directory to avoid cluttering the current one.

6.
Using the provided template, also generate out/index.html from index.md. Copy main.css to out/ too.
7.

One more side-step :-)

The web server that is running on port 8080 on unixadmin.ms.mff.cuni.cz supports personal web pages too. All you need to do is to create a public_html directory in your home directory and then you can access it via the /~LOGIN URL.

To try it, copy the generated files from out to $HOME/public_html on unixadmin.ms.mff.cuni.cz. Note that you can easily use Midnight Commander for that (use the Shell link menu; if you have set up the linux-intro alias in your .ssh/config, it will work here as well).

Also create the tunnel via ssh -L 8090:localhost:8080 -N LOGIN@unixadmin.ms.mff.cuni.cz (but you already know this command, right?).

Open http://localhost:8090/~LOGIN in your web browser.

It will show a Forbidden error page.

Why?

Because the web page is now served over HTTP (wrapped inside SSH but that is irrelevant now) by the web server. This web server is running under the user apache and this user cannot access your files. That is fine: you do not want your files to be readable by everybody.

But we can allow the web server access to the public_html directory.

So we can simply run

setfacl -m u:apache:x public_html

to fix this (recall what the x bit means on directories and why it is sufficient here).

But this still will not work because our $HOME is not accessible to Apache either (the operating system checks the permissions of each directory on the absolute path).

Run the setfacl on your $HOME too.
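
To see which directories are involved, the following sketch prints every directory that has to be traversed to reach a given absolute path. The function name list_path_prefixes and the path /home/LOGIN/public_html are hypothetical and serve only as an illustration; the root directory itself is left out.

```shell
# Print every directory that must be traversed to reach the given absolute path.
list_path_prefixes() {
    rest=${1#/}        # drop the leading slash
    prefix=""
    while [ -n "$rest" ]; do
        part=${rest%%/*}               # first remaining path component
        prefix="$prefix/$part"
        printf '%s\n' "$prefix"
        case $rest in
            */*) rest=${rest#*/} ;;    # more components follow
            *)   rest="" ;;            # that was the last one
        esac
    done
}

# Hypothetical location of a personal web directory.
list_path_prefixes /home/LOGIN/public_html
```

On the server, passing each printed directory to ls -ld (or getfacl) shows whether the apache user is allowed to pass through it.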

Refresh your browser – you should see your index.html displayed.

8.

You probably noticed that there is a link to teams.html but no teams.md.

That is because we generate that page from the CSV file teams.csv.

Look into bin/make_teams_page.sh and generate out/teams.html by yourself.

9.

To generate the whole website (let’s call our 3 small pages a website) we need to execute several commands.

As these commands need to be run after any change to the input files (either *.md or teams.csv), let’s put them into a script to simplify things for us.

By yourself, create a script that generates the whole website into the out directory.

Solution.
10.

The script is nice but it overwrites all files even if there was no change. In our small example, it is no big deal. You have a fast computer, after all.

But in a bigger project where we, for example, compile thousands of files (e.g. look at the source tree of the Linux kernel, Firefox or LibreOffice), it is a big deal. If an input file was not changed (e.g. we modified only rules.md), we do not need to regenerate its output (e.g. we do not need to re-create index.html).

Let’s extend our script a little bit.

Instead of

pandoc --template template.html index.md >out/index.html

we use (see man test if you have never seen -nt)

[ "index.md" -nt "out/index.html" ] \
    && pandoc --template template.html index.md >out/index.html

We can do that for every command to speed up the web generation.

But.

That is a lot of work. And the time saved would probably all be wasted on rewriting our script. Not to mention that the result looks horrible. And is expensive to maintain.

Luckily, there is a better way.

11.

Open Makefile now.

This file is a control file for a build system named make that does exactly what we tried to imitate in the previous example.

It contains so-called dependencies and actions to execute when a target is out-of-date (i.e. a dependency is newer than the target).

We will start with the following fragment:

out/rules.html: rules.md template.html
        pandoc --template template.html rules.md >out/rules.html

Important: the indentation in Makefiles has to be done with tabs, so make sure your editor does not expand tabs to spaces. It is also a common issue when copying fragments from a web browser. (Usually, your editor will recognize that Makefile is a special filename and switch to a tabs-only policy by itself.)

The fragment has three parts.

Before the colon is the name of the target. That is usually a filename and describes what we want to build. Here it is out/rules.html.

The second part is after the colon till the end of the line. It lists dependencies. make looks at the dependencies and if any of them is newer than the target (or the target does not exist yet), the target is out-of-date and needs to be rebuilt.

The third part consists of the following lines, which have to be indented by a tab and contain the commands that have to be executed to build the target. Here, it is the call to pandoc.

Together, we can read it as a rule that describes when it is needed to build a target and how.
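
In shell terms, the decision make takes for such a rule can be sketched roughly as follows. This is only an analogy under simplifying assumptions, not how make is implemented; the function name is_out_of_date and the empty stand-in files in a scratch directory are made up for the demonstration.

```shell
# Succeeds when TARGET is missing or older than any of the listed dependencies.
is_out_of_date() {
    target=$1
    shift
    [ -e "$target" ] || return 0
    for dep in "$@"; do
        [ "$dep" -nt "$target" ] && return 0
    done
    return 1
}

# Scratch directory with empty stand-in files and fixed timestamps.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/out"
touch -t 202005041000 "$demo_dir/rules.md" "$demo_dir/template.html"
touch -t 202005041100 "$demo_dir/out/rules.html"

# The target is newer than both dependencies.
if is_out_of_date "$demo_dir/out/rules.html" "$demo_dir/rules.md" "$demo_dir/template.html"; then
    first_check="rebuild"
else
    first_check="up to date"
fi

# Pretend that rules.md was edited after the last build.
touch -t 202005041200 "$demo_dir/rules.md"
if is_out_of_date "$demo_dir/out/rules.html" "$demo_dir/rules.md" "$demo_dir/template.html"; then
    second_check="rebuild"
else
    second_check="up to date"
fi

echo "$first_check, then $second_check"
rm -r "$demo_dir"
```

With make, the same check comes for free from the dependency list, for every rule in the file.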

12.

The rest of the Makefile is similar. There are rules for other files and also several special rules.

The special rules are all, clean and .PHONY.

all is the traditional name for the very first rule in the file. Note that it lists all generated files as its dependencies.

The first rule is also called the default rule and is executed by default. As you have probably guessed, by default we want to build everything (more precisely: update everything that needs to be updated).

clean is a special rule that has no dependencies and only has commands that remove everything in out. It is a useful service-style rule for removing generated files (e.g. to start from a fresh state, to save disk space etc.).

As make expects that a target name is a filename, we need to tell it that all and clean are actually not filenames (i.e. we are not creating a file named all as one could expect). That is done via the special target .PHONY.

This weird approach is basically a design flaw of make, which was originally created as a one-shot utility and somehow survived for more than 40 years. Note that despite its age, make is still used even in new projects and is also often used as a backend. That is, you have something smarter that generates a Makefile and lets make do the actual work.

So far so good. But we have not yet seen how to run a Makefile.

That is actually simple: install the make package first and run

make

Depending on your other changes, perhaps nothing was done or some commands were executed.

Let’s run

make clean

to clean all files in out/. As you can see, make prints the commands it executes (to stdout).
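
Now that make is installed, the purpose of .PHONY can be checked in a scratch directory. The following is a small sketch that assumes GNU make; the two tiny Makefiles and the echoed message are made up for this demonstration and are not part of the examples repository.

```shell
if command -v make >/dev/null 2>&1; then
    demo_dir=$(mktemp -d)

    # A clean rule without .PHONY; printf writes the mandatory tab.
    printf 'clean:\n\t@echo removing generated files\n' > "$demo_dir/Makefile"
    # An unrelated file that happens to be called "clean".
    touch "$demo_dir/clean"

    # make finds an existing file "clean" without dependencies and runs nothing.
    without_phony=$(cd "$demo_dir" && LC_ALL=C make clean)

    # The same rule declared as phony: the commands run regardless of the file.
    printf '.PHONY: clean\nclean:\n\t@echo removing generated files\n' > "$demo_dir/Makefile"
    with_phony=$(cd "$demo_dir" && LC_ALL=C make clean)

    printf 'without .PHONY: %s\n' "$without_phony"
    printf 'with .PHONY:    %s\n' "$with_phony"
    rm -r "$demo_dir"
else
    echo "make is not installed"
fi
```

Without .PHONY, make only reports that clean is up to date; with it, the command is executed.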

Run make again. Now make a change to index.md and run make again.

What is the difference?

Solution.
13.

On your own, add rules for creating out/teams.html.

Do not forget to add removal of teams.md to the clean target.

Solution.
14.

Add a team to teams.csv and rebuild the web.

What commands were executed?

Solution.
15.

Add a target upload that copies the generated web to your public_html directory.

That is something where Midnight Commander is not the right choice but we can use scp (or sftp).

scp is like cp (i.e. it copies files) but the s was taken from SSH ;-).

Thus a simple

scp out/index.html LOGIN@unixadmin.ms.mff.cuni.cz:public_html/

copies out/index.html to your $HOME/public_html. With a proper alias in .ssh/config, you can even use the following shorter form:

scp out/index.html linux-intro:public_html/

With -r, it copies a directory recursively.

Solution.
16.
Add a link to rules.pdf to the rules.md page and generate rules.pdf automatically in the Makefile too. Solution.
17.

The Makefile is starting to contain too much repeated code.

But make can help you with that too.

Let’s remove all the rules for generating out/*.html from *.md and replace them with:

out/%.html: %.md template.html
        pandoc --template template.html -o $@ $<

That is a pattern rule that captures the idea that HTML is generated from Markdown. Here, the percent sign represents the so-called stem, the variable part of the pattern.

In the command part, we use make variables (they start with dollar as in shell) $@ and $<. $@ is the actual target and $< is the first dependency.
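
The matching can be imitated in shell to see what ends up in the two variables. This is only an illustrative sketch for one hypothetical request (building out/rules.html); make performs the matching itself and the variable names below are made up.

```shell
# The file we ask make to build; in the rule this is what $@ expands to.
target="out/rules.html"

# Strip the fixed parts of the pattern out/%.html to obtain the stem.
stem=${target#out/}
stem=${stem%.html}

# Put the stem into the dependency pattern %.md; this is what $< expands to.
first_dependency="$stem.md"

# The command make would run for this target.
echo "pandoc --template template.html -o $target $first_dependency"
```

The second dependency, template.html, still triggers a rebuild when it changes, but it is not part of $<.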

Run make clean && make to verify that even with pattern rules, the web is still generated.

18.

Also generate contact.html. Do not forget to add it to the menu.

Run make after the changes. What was rebuilt?

Solution.
19.

Generating the teams web page is nice but we pollute the current directory with a temporary file.

Let’s do a small change and use tmp/ for that:

...

out/%.html: tmp/%.md template.html
        pandoc --template template.html -o $@ $<

tmp/teams.md: teams.csv bin/make_teams_page.sh
        bin/make_teams_page.sh <$< >$@

...

clean:
        rm -f out/* tmp/*
20.

There are special pages prepared for individual tasks in the game. These will be available for download as PDF only.

Add their generation to the Makefile and place the generated PDFs in out/task-*.pdf.

Note that soffice accepts the --outdir parameter.

Hint. Solution.
21.
Generate the tasks page containing only a list of the PDFs for download. Add it to the menu too.
22.

The last thing we will do with make is to improve the readability of the Makefile a little bit with variables (actually, they cannot be changed and constants would be a better name).

Let’s start with a simple change:

PAGES = \
        out/contact.html \
        out/index.html \
        out/rules.html \
        out/teams.html

all: $(PAGES) out/main.css ...

...

Not much but we at least keep the list of web pages together and the all: line is a bit shorter.

Note that \ at the end of line denotes that the line continues.

Why do we have each page on a separate line?

Solution.
23.
On your own, add variable for task PDFs too.
24.

We will finish the simplification with another bit that is often useful when dealing with more complex path names.

Note that there are several variants of make: so far, our Makefile is fully standard-compliant. The last addition will work in GNU make only (but that is the default on Linux so there should not be any problem).

We will change the Makefile as follows:

PAGES = \
        contact \
        index \
        rules \
        teams

PAGES_TMP := $(addsuffix .html, $(PAGES))
PAGES_HTML := $(addprefix out/, $(PAGES_TMP))

We keep only the basename of each page and compute the output path from it. Note that := is used in the computation and that $(addsuffix and $(addprefix are function calls. The arguments are separated by a comma, and the functions operate on each whitespace-separated word of $(PAGES) as if it were an array.
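
For comparison, a shell loop that computes the same list could look like the sketch below. It only illustrates the word-by-word behaviour of the two functions; the variable names are made up and the Makefile itself does not need any of this.

```shell
pages="contact index rules teams"

pages_html=""
for page in $pages; do
    # Add the suffix .html and the prefix out/ to one word at a time.
    pages_html="$pages_html${pages_html:+ }out/$page.html"
done

echo "$pages_html"
```

In the Makefile, the two function calls replace the whole loop.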

25.
Make the same change for tasks too. Solution.