Semester: summer 2023/24
Lectures: Tuesday, 15:40, S9 (English) (Lubomír Bulej)
Page in SIS: NSWI143
Grading: Exam


Feb 27, 2024 — Added lecture resources page.

Jan 16, 2024 — Starting with the summer semester 2023/2024, the lectures will be given in English only.


The goal of the course is to cultivate mechanical sympathy towards computers, so that they are not perceived as mysterious black boxes that somehow, magically, execute programs. Even though modern programming languages and runtime environments almost completely shield software developers from hardware details, the fundamental aspects of how (non-quantum) computations are carried out do not change. Consequently, understanding how computers and processors operate (what is easy and what is difficult for them) helps in developing efficient software systems, even when using modern high-level programming languages.

To this end, the course focuses on two critical components of a computer: the processor and the memory subsystem. Specifically, the course covers the functional blocks and components that make up the processor and the memory hierarchy, their behavior and interaction, and their impact on performance of a modern computer. To demonstrate how to put things together, we will build a model of a simple, yet functional, RISC-V processor out of simple logic gates.

Topics covered

  • Computer performance, fundamental metrics and their limitations, comparing performance of computer architectures.
  • Introduction to digital systems, logical expressions, boolean functions, gates, combinational and sequential circuits, basic building blocks, arithmetic operations.
  • Instruction set architecture (ISA) implementation, single-cycle and multi-cycle data path and control, hard-wired and microprogrammed controller implementation, exception handling.
  • Pipelined instruction execution, scalar pipelined data path, hazard detection and handling, branch prediction, exception handling.
  • Overview of superscalar architectures, static and dynamic instruction scheduling, out-of-order execution, speculative execution, contemporary architectures.
  • Memory subsystem organization, latency and throughput, static and dynamic memory technology, cache organization and mapping, cache coherency.
  • Brief overview of parallel processing and multiprocessor systems (if time permits): Flynn’s taxonomy, Amdahl’s law, SIMD processing in multimedia, multi-core CPUs, GPUs.


The course is historically based on successive editions of the classic textbook by D. A. Patterson and J. L. Hennessy: Computer Organization and Design (5th edition, Morgan Kaufmann, 2013, ISBN 978-0124077263).

Starting with summer 2024, we will be using the RISC-V variant of the book (2nd edition, Morgan Kaufmann, 2020, ISBN 978-0128203316), but almost any edition should be fine (including the MIPS variant, from the 3rd edition onward). The differences are negligible for the purpose of this course.

Regardless of the edition, students are assumed to be comfortable with the material in the introductory chapters (number representation, computer arithmetic, instructions of a computer). This course then focuses on the chapters dealing with processor design and the memory hierarchy.

Final exam

The final exam is closed-book, written-only, and consists of a set of questions/exercises covering the course topics. On average, there are 12-13 questions with a total of 20 points.

  • 10 points (50%) are required to get the grade 3, which is the lowest passing grade,
  • 13 points (65%) are needed to pass with grade 2, and
  • 17 points (85%) are required to pass with grade 1.

Please note that if you need these grades converted to the American grading system, the local grade 3 will end up as an American grade D, for which you might not get credit.

When grading the exam, I try to point out the deficiencies that influenced the points awarded. Sometimes a particular answer is awarded a range of points. The lower bound corresponds to the amount that would be awarded if I were grading the exam in a very strict manner, while the upper bound corresponds to very benevolent grading. A large difference between the lower and upper bounds in the total indicates that many answers were too ambiguous. To determine the exam outcome, I usually take the total from the middle of the range (never below the middle). You will be provided with a scanned copy of your graded exam answer sheet.
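
As an illustration only, the thresholds and the middle-of-the-range rule described above can be sketched in a few lines of Python. The function names are hypothetical. Returning None for totals below 10 points is an assumption, since the text only defines the passing grades. The sketch also always uses the exact middle of the range, whereas the actual grading may end up above it.

```python
from typing import Optional

# (minimum points, grade) pairs copied from the thresholds listed above.
GRADE_THRESHOLDS = [(17, 1), (13, 2), (10, 3)]


def grade_for_points(points: float) -> Optional[int]:
    """Return the grade for a point total, or None if it is below 10 points."""
    for minimum, grade in GRADE_THRESHOLDS:
        if points >= minimum:
            return grade
    return None  # assumption: anything below the grade 3 threshold is not a pass


def grade_for_range(lower: float, upper: float) -> Optional[int]:
    """Grade a total awarded as a range, using the middle of the range."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return grade_for_points((lower + upper) / 2)


print(grade_for_range(12, 14))  # middle of the range is 13 points, i.e. grade 2
```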

The exam covers the following topics:

  • fundamentals of computer performance (relation between execution time and clock cycles, instructions, and clock rate, Amdahl’s law),
  • instruction set architecture (what kind of instructions do we need and why, compilation of basic elements of structured programming, i.e., assignments, conditionals, loops, function calls, argument passing),
  • fundamentals of digital circuits (basic gates, concept of sequential and combinational circuits, datapath building blocks such as adders, ALUs, multiplexors, decoders),
  • processor implementation (single-cycle and multi-cycle datapath and control),
  • performance improvement techniques (pipelining datapath and control, pipeline hazards, forwarding/bypassing, branch prediction, handling of exceptions, static and dynamic multiple-issue pipelines and related techniques, such as out-of-order execution, speculation, register renaming), and
  • memory hierarchy (caches, operation of write-through and write-back caches, cache miss model, cache architectures and their impact on cache misses, cache coherency, coherency protocols such as IV, MSI and MESI, false sharing).
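
To give a feel for the first topic, here is a minimal sketch of the two standard relations it mentions: execution time as instruction count × cycles per instruction (CPI) / clock rate, and Amdahl's law. The function names and the numbers in the example are made up for illustration and are not taken from the lectures or from past exams.

```python
def execution_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instructions * cycles per instruction / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def amdahl_speedup(fraction_enhanced: float, enhancement_speedup: float) -> float:
    """Overall speedup when only a fraction of the execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if enhancement_speedup <= 0.0:
        raise ValueError("enhancement_speedup must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / enhancement_speedup)


# Hypothetical program: 1e9 instructions, average CPI of 2, 2 GHz clock.
baseline = execution_time(1e9, 2.0, 2e9)

# Hypothetical enhancement: 75% of the execution time becomes 3x faster.
speedup = amdahl_speedup(0.75, 3.0)

print(f"baseline: {baseline:.2f} s, speedup: {speedup:.2f}x, "
      f"new time: {baseline / speedup:.2f} s")
```

Note that the untouched 25% of the execution time caps the overall speedup at 4x, no matter how fast the enhanced part becomes.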

The goal of the exam is to test understanding, not the ability to memorize facts (in most cases, the necessary facts are included in the exercise). The required level of understanding roughly corresponds to the level of detail presented in the lectures and lecture slides. Going through the relevant exercises in the P&H book is highly recommended.

To register for the exam, use the Study Information System.