Computer Architecture (NSWI143)

About

Semester: Summer 2018/2019
Schedule: Tuesday, 15:30 in room S5 (česky)
Wednesday 9:00 in room S10 (english)
Note: Wednesday lecture on May 8 cancelled due to state holiday. Replacement lecture NOT scheduled!
Lecturer: Lubomír Bulej
SIS entry:NSWI143

The goal of the course is to introduce basic elements of computer organization and processor design so that students (as future professionals) do not treat computer as a black box that, in some magic way, executes programs. To this end, the course focuses on two key components of a computer – the processor and the memory subsystem. Specifically, the course covers the functional blocks and components that make up the processor and the memory hierarchy, their behavior and interaction, and their influence on the performance of a modern computer. Understanding these principles is a prerequisite for attaining a sufficient level of mechanical sympathy which helps in creating efficient programs even when using modern high-level programming languages.

Textbook

The course is based on the classic textbook by D. A. Patterson and J. L. Hennessy: Computer Organization and Design (5th edition, Morgan Kaufmann, 2013, ISBN 978-0124077263. 4th or 3rd edition also suffices). Students are assumed to be comfortable with the material in the introductory chapters (number representation, computer arithmetics, instructions of a computer) and the course focuses primarily on chapters dedicated to processor and the memory hierarchy.

Topics covered

  • Introduction to digital systems, logical expressions, boolean functions, gates, combinatorial and sequential circuits, basic building blocks, arithmetic operations.
  • Computer performance, fundamental metrics and their limitations, comparing performance of computer architectures.
  • Instruction set architecture (ISA) implementation, single-cycle and multi-cycle data path and control, hard-wired and microprogrammed controller implementation, exception handling.
  • Pipelined instruction execution, scalar pipelined data path, hazard detection and handling, branch prediction, exception handling.
  • Superscalar architectures, static and dynamic instruction scheduling, out-of-order execution, speculative execution, contemporary architectures.
  • Memory subsystem organization, latency and throughput, static and dynamic memory technology, cache organization and mapping, cache coherency.
  • Parallel processing and multiprocessor systems, Flynn's taxonomy, Amdahl's law, SIMD processing in multimedia, multi-core CPUs, GPUs.

Lectures

WeekTopicNotes
1 Agenda, Introduction Lecture was not scheduled. The material was discussed in Week 2.
2 Computer performance  
3 Instruction set architecture Make sure you are familiar with the content of Chapters 1 to 3 in the H&P book. You can find additional slides covering introductory parts in the supplementary material section below.
4 Digital circuits The H&P book covers basics of logic design in an appendix (Appendix C in 4th edition, appending B in 5th edition). You should be familiar with gates, basic (ripple carry) adder, and basic logic blocks such as multiplexors, and decoders.
5 Processor implementation Review the implementation for the basic single-cycle datapath in Sections 4.3 and 4.4 of the H&P book .
6 Review the implementation of datapath control in Section 4.4 of the H&P book.
7 Improving performance Review pipelining in Section 4.5 and 4.6 of the H&P book.
8 Review pipelining hazards in Section 4.6 and 4.7 of the H&P book.
9 Multiple issue pipeline, out-of-order and speculative execution.
10 Memory hierarchy Memory technology, direct-mapped cache.
11 Lecture cancelled (state holiday).
Replacement lecture on Thursday, May 2, at 10:40 in lecture room S1.
12 Lecture cancelled (state holiday).
Replacement lecture NOT scheduled.
13 Cache coherency, coherency protocols.
14 Parallel processing Lecture cancelled (business trip).
Replacement lecture NOT scheduled.
TýdenTémaPoznámky
1 Agenda, Úvod  
2 Výkonnost počítačů  
3 Instrukční sada  
4 Logické obvody  
5 Implementace procesoru Jednocyklová datová cesta
6 Řadič datové cesty
7 Zvyšování výkonnosti procesorů Zřetězené zpracování instrukcí (pipelining)
8 Hazardy v pipeline
9 Dynamické plánování instrukcí, spekulace
10 Paměťová hierarchie Technologie pamětí, přímo mapovaná cache
11 Množinově-asociativní a plně asociativní cache, 3C model výpadků cache
12 Koherence cache, koherenční protokoly
13 Paralelní zpracování Přednáška odpadá (rektorský den)
Náhradní přednáška není plánována.
14   Přednáška odpadá (služební cesta)
Náhradní přednáška není plánována.

Supplementary material

The following slides cover introductory material related to binary and hexadecimal numbers, logical operations, representations of numbers, storage of data in memory, storage of structure types, alignment, and the relation between pointers and memory addresses.

This material is not covered in this course — students should be familiar with the material from the Principles of Computers course.

Contact and office hours

If you have any question or comment, do not hesitate to contact me at bulej at d3s.mff.cuni.cz. E-mail is preferred for brief inquiries and it is generally OK to come to my office without an appointment. However, a consultation is not a private lecture, therefore if you need to discuss a topic at length, it is better to arrange an appointment with a clear goal. This is especially important if you have trouble understanding something, because the first step is to find out what is it that you still understand and what is the next bit which is unclear.

I can be found in office 205 (2nd floor in the Malá strana building), except when I'm not, which is on Mondays, when I'm teaching, attending the department seminar, or attending a conference or a project meeting. It is safe to assume that I will be in the office between 10am and 5pm (I often come earlier and stay longer), but if you plan to come outside those hours, it is better to contact me in advance.

Final exam

The final exam is closed-book and written-only, and consists of a set of questions/exercises covering the course topics. On average, there are 12-13 questions with a total of 20 points. 10 points (50%) are required to pass the exam with C/3 (there is no D), 13 points (65%) are needed to pass with B/2, and 17 points (85%) are required to pass with A/1.

When grading the exam, I try to point out deficiencies which influenced the points awarded. Sometimes a particular answer is awarded a range of points. The lower bound corresponds to the amount that would be awarded if I were grading the exam in a very strict manner, while the upper bound corresponds to a very benevolent grading. A large difference between the lower and upper bounds in the total indicates that many answers were too ambiguous. To determine the exam outcome, I usually take the total from the middle of the range (never below middle). You will be provided with a scanned copy of your graded exam answer sheet.

The exam covers the following topics:

  • fundamentals of computer performance (relation between execution time and clock cycles, instructions, and clock rate, Amdahl's law),
  • instruction set architecture (what kind of instructions do we need and why, compilation of basic elements of structured programming, i.e., assignments, conditionals, loops, function calls, argument passing),
  • fundamentals of digital circuits (basic gates, concept of sequential and combinational circuits, datapath building blocks such as adders, ALUs, multiplexors, decoders),
  • processor implementation (single-cycle and multi-cycle datapath and control),
  • performance improvement techniques (pipelining datapath and control, pipeline hazards, forwarding/bypassing, branch prediction, handling of exceptions, static and dynamic multiple-issues pipelines and related techniques, such as out-of-order execution, speculation, register renaming), and
  • memory hierarchy (caches, operation of write-through and write-back caches, cache miss model, cache architectures and their impact on cache misses, cache coherency, coherency protocols such as IV, MSI and MESI, false sharing).

The goal of the exam is to test understanding, not the ability to learn facts by heart. In many cases, factual information or diagrams are provided as part of the question. The required level of understanding roughly corresponds to the level of detail presented in the lectures and lecture slides. Going though relevant exercises in the H&P book is highly recommended.

To subscribe to an exam, use the Study Information System of the faculty.