[OSy] Second extended assignment

Petr Tůma petr.tuma at d3s.mff.cuni.cz
Thu Oct 25 21:31:21 CEST 2018


Dear all,

attached please find the assignment of the 2nd extended semestral task of
the Operating Systems course. This extended assignment is intended for
those who want to solve the task independently at home in order to gain
the "conditionally fulfilled" status for this topic.

Please read and try to understand the text of the extended assignment even if you plan to attend the labs 
(schedule is on the web). The assignment for the
labs will be related to the extended assignment (the topic will also be
virtual memory) and also the required knowledge and recommended
literature will be the same.

If you choose to implement the extended assignment, please respect the
deadline of November 15th 2018 AoE.

Please send your implementation via e-mail to the following address:

      nswi004 at d3s.mff.cuni.cz

You should send your implementation as an archive (zip, tar.bz2, etc.)
in the attachment of the e-mail. Please make sure to include everything
necessary to run your implementation, to run the required unit tests and
to evaluate your source code. You can also use the `make dist' target
from the main Makefile to package your solution.

We will try to evaluate your submissions as they arrive, but we cannot
guarantee a specific reply latency. Therefore please avoid (in your own
interest) sending your implementation at the last moment, because we
might not be able to point you in time to some small issues in order for
you to fix them.

If you should have any questions (either specific to the assignment or
of a generic nature), please do not hesitate to send your query to this
mailing list.

Best regards,

Petr Tuma & Vojtech Horky
-------------- next part --------------
========================================================================
                        Operating Systems Course
                             Semestral Task

                       Topic no. 2: Virtual Memory
                           Extended Assignment
========================================================================
Date of assignment:  October  27th 2018
Submission deadline: November 12th 2018
------------------------------------------------------------------------

  The goal of the 2nd semestral task is to implement the functionality
  of virtual to physical memory mapping. The task is defined by a
  functional programming interface that needs to be implemented and it
  is accompanied by a set of unit tests that the implementation must
  successfully pass.

  Use the source code of Kalisto version 0.9.11 as the basis for your
  implementation. i.e. start with the source code of Kalisto and extend
  it by the required functionality. Kalisto 0.9.11 is available for
  download from the following URL:

    http://d3s.mff.cuni.cz/osy/download/kalisto-0.9.11.tar.bz2

  If the assignment does not define an aspect of the required
  implementation specifically or precisely, then the expected behavior
  of the implementation is defined by the unit tests. If even the tests
  do not define an aspect of the required implementation specifically or
  precisely, then use your best knowledge and professional judgement to
  implement the functionality in the most reasonable way and document
  your decisions using comments in your source code.


Virtual Address Space of the MIPS R4000 CPU ----------------------------

  Depending on the current privilege mode of the MIPS R4000 CPU (user,
  supervisor, kernel), the 32-bit virtual address space is divided into
  several possible sets of segments. These segments are hard-wired on
  the MIPS R4000 CPU and their general properties cannot be altered by
  software. The virtual addresses are mapped to physical addresses
  either in a predefined manner (such is the case of the KSEG0 and KSEG1
  segments; see below) or the mapping is configurable by the operating
  system by means of altering the Translation Lookaside Buffer (TLB).

  In user mode, only the USEG segment is available, with the virtual
  address range of [0x00000000, 0x800000000). The USEG virtual addresses
  are mapped to physical addresses via the TLB. Every memory access
  outside the USEG memory range in user mode causes the Address Error
  exception.

  In supervisor mode, the following two segments are available:

  * SUSEG, with the virtual address range of [0x00000000, 0x80000000)
  * SSEG, with the virtual address range of [0xC0000000, 0xE0000000)

  The SUSEG and SSEG virtual addresses are mapped to physical addresses
  via the TLB. The SUSEG segment is an analogy of the USEG segment in
  user mode. Every memory access outside the SUSEG and SSEG memory
  ranges in supervisor mode causes the Address Error exception.

  Finally, in kernel mode, the following five segments are available:

  * KUSEG, with the virtual address range of [0x00000000, 0x80000000)
  * KSEG0, with the virtual address range of [0x80000000, 0xA0000000)
  * KSEG1, with the virtual address range of [0xA0000000, 0xC0000000)
  * KSSEG, with the virtual address range of [0xC0000000, 0xE0000000)
  * KSEG3, with the virtual address range of [0xE0000000, 0xFFFFFFFF]

  The KSSEG segment is alternatively labeled as KSEG2.

  The virtual addresses from the KUSEG, KSSEG and KSEG3 segments are
  mapped to physical addresses via the TLB. The KUSEG segment is an
  analogy of the USEG segment in user mode and the KSSEG is an analogy
  of the SSEG segment in supervisor mode.

  The virtual addresses from the KSEG0 and KSEG1 segments are mapped to
  physical addresses using a hard-wired identity mapping by subtracting
  the base address of the segment, i.e. both the virtual addresses
  0x80000000 and 0xA0000000 are mapped to the same physical address
  0x00000000. In other words, the virtual address ranges of the KSEG0
  and KSEG1 segments are mapped to the physical address range of
  [0x00000000, 0x20000000).

  On the actual MIPS R4000 CPU the memory accesses in the KSEG1 segment
  bypass the CPU cache.


Virtual Address Space as Managed by the Operating System ---------------

  The virtual address space (also often called virtual memory map [VMM]
  or simply address space [AS]) is a data structure managed by the
  operating system that stores information about areas of the virtual
  memory (whether an area is in use, how are the virtual addresses of an
  area mapped to physical addresses, etc.).

  The virtual address space is divided into areas (often called virtual
  memory areas [VMA]). These areas do not cross the boundaries of the
  MIPS R4000 segments, because the way virtual pages map to physical
  frames differs in each segment.

  Each virtual memory area comprises of a continuous sequence of virtual
  pages. Each such virtual page is mapped to a physical frame. Depending
  on the segment, this mapping is either hard-wired (in case of the
  KSEG0 and KSEG1 segments) or it is defined by the operating system by
  changing the configuration of the TLB (all other segments). The
  mapping defined by the TLB does not necessarily have to map the
  continuous sequence of virtual pages to a continuous sequence of
  physical frames (the mapping does not even have to be injective).

  The virtual address space is one particular type of resource that is
  usually shared by all threads of any given user space process. On the
  other hand, each user space process uses its own independent virtual
  address space. This is the primary form of isolation of each user
  space process from the other processes.


General Recommendations for the Implementation -------------------------

  The following recommendations capture the usual criteria of quality of
  source code that is generally expected in critical system-level
  software. While respecting these recommendations is slightly less
  binding than the functional requirements of the assignment, severely
  violating them might still lead to rejection of your implementation.

  - Avoid using non-trivial hard-wired numerical constants in the source
    code. Use properly named reusable macros.

  - Use empty lines and block comments as visual delimiters to
    consistently demark logical parts of the source code.

  - Use consistent horizontal spacing to improve the readability and
    comprehension of the source code. This includes both spaces in
    expressions and proper indentation.

  - Avoid creating overly long functions, functions with an excessive
    level of block nesting and functions that implement an excessive
    number of distinct logical operations. The function name should
    adequately capture the summary of the function logical operation.

  - Avoid creating source files that mix logically unrelated concerns or
    functionality. Avoid placing source files into unrelated
    directories.

  - Use properly named macros or inline functions as replacements for
    complex logical and arithmetic expressions.

  - Use a consistent coding style (similar constructs should use similar
    means). Match the coding style of the new code with the coding style
    of the existing code.

  - Use English for identifiers and comments. Use a consistent way of
    naming things and a consistent way of commenting.


Task Assignment --------------------------------------------------------

  The management of virtual address spaces comprises primarily of the
  means to create, destroy and potentially manipulate the virtual memory
  areas within the context of the virtual address space of the currently
  running process (thread). The virtual memory areas cannot overlap,
  their starting virtual address and size is aligned on the smallest
  possible page size (i.e. 4 KiB in case of MIPS R4000).

  There is no constant upper bound on the number of virtual address
  spaces the kernel supports. Likewise, there is no constant upper bound
  on the number of virtual memory areas within each virtual address
  space the kernel supports. The number of both entities is practically
  bounded only by the total amount of available resources (i.e. physical
  memory).

  Implement and/or use a suitable dynamic data structures for the
  bookkeeping of the VMMs, VMAs and virtual address mappings. The data
  structures should be mainly optimized for looking up the VMA in which
  a given virtual address lies (virtual address -> VMA) and for looking
  up the mapping of a given virtual address (virtual address -> physical
  address).

  A reasonable data structure for the virtual address -> VMA mapping is
  for example a balanced search tree (there is a sample red-black tree
  implementation in kernel/adt/rbtree.c) or a hash table. These data
  structures typically provide an O(log n) time complexity of the lookup
  (where n is the number of VMAs in the VMM).

  A reasonable data structure for the virtual address -> physical
  address mapping are hierarchical page tables [3] which typically
  provide an O(1) time complexity of the lookup. Various combinations of
  the two approaches are possible.

  The virtual address space management is required to implement the
  following API:

  * int vmm_create(vmm_t *pvmm)

    This function creates a new (empty) virtual address space (VMM) and
    returns its initialized data structure in the @pvmm output argument.
    The function returns EOK if the creation was successful. If there is
    not enough memory to create the new VMM, then the function returns
    ENOMEM.

  * int vma_map(void **from, const size_t size, const vm_flags_t flags)

    This function creates a new virtual memory area (VMA) within the
    virtual address space of the currently running process (thread) and
    creates a mapping of the continuous virtual pages to the (not
    necessarily continuous) physical frames acquired from the frame
    allocator.

    The size (in bytes) of the created VMA is defined by the @size
    argument. Depending on the bit flags in the @flags argument, the
    function either looks for a suitable starting virtual address of
    the VMA and returns it in the @from output argument (if @flags
    contains the VF_VA_AUTO flag), or it tries to create a VMA with a
    starting virtual address specified by the caller in the @from
    argument (if @flags contains the VF_VA_USER flag).

    Additional bit flags in the @flags argument determine the MIPS R4000
    segment where the VMA should be created. The values for the
    individual segments are VF_AT_KUSEG, VF_AT_KSEG0, VF_AT_KSEG1,
    VF_AT_KSSEG and VF_AT_KSEG3. The creation of two VMAs in the KSEG0
    and KSEG1 segments is mutually excluded if the two VMAs map to
    overlapping physical addresses.

    This function returns EOK if the physical memory was successfully
    allocated and the VMA created. If the creation of the VMA with a
    starting address specified in @from is not possible in the given
    segment or if either @from or @size are not properly aligned, then
    the function returns EINVAL. If there is not enough physical memory
    to satisfy the request, then the function returns ENOMEM.

  * int vma_unmap(const void *from)

    This function disposes the virtual memory area (VMA) within the
    virtual address space of the currently running process (thread)
    specified by the starting virtual address @from. The VMA has to be
    previously created by calling vma_map(). The function also releases
    the physical memory associated with the disposed VMA.

    The function returns EOK if the VMA was successfully disposed and
    the physical memory was successfully released. If the @from argument
    does not identify a valid VMA previously created by calling
    vma_map(), then the function returns EINVAL.

    A correct implementation of vma_unmap() must not only remove the VMA
    from the kernel data structures, but it must also initiate the "TLB
    shootdown" operation that purges all relevant mapping entries from
    the TLB.

  - Handling of TLB Exceptions

    The MIPS R4000 CPU might assert one of the following two exceptions
    while accessing a virtual memory address.

    The TLB Refill exception (see wrapped_tlb_refill()) is asserted when
    there is no suitable entry present in the TLB that would map the
    given virtual address to a physical address. The TLB Invalid
    exception (see tlb_invalid()) is asserted when there is a suitable
    entry in the TLB that maps the given virtual address to a physical
    address, but this entry is marked as invalid.

    The goal of the exceptions handlers of both exceptions is to lookup
    the physical address that corresponds to the given virtual address
    and fill in the corresponding entry into the TLB. If there is no
    valid mapping of the virtual address (it does not belong to any VMA
    of the current VMM), then the exception handler should finish the
    execution of the current (offending) thread.

    Note: Marking a TLB entry as invalid is helpful for example when
    implementing the copy-on-write mechanism or swapping. However, there
    is no use case for this feature in Kalisto.

    The TLB of the MIPS R4000 CPU can store mappings of multiple
    distinct virtual address spaces at the same time. The Address Space
    Identifier (ASID) is used to distinguish the individual virtual
    address spaces. This assignment mandates that the ASID feature is
    used to avoid unnecessary "TLB flush" operations when switching the
    current address space. On the other hand, the implementation can
    take the liberty of using the following simplifying assumption:

    * The number of virtual address spaces present in the system is
      bounded by the number of valid ASID values on the MIPS R4000 CPU.
      Therefore it is not necessary to implement a Least Recently Used
      (LRU) or a similar algorithm for the dynamic assignment of the
      ASID values to virtual address spaces.


Unit Tests -------------------------------------------------------------

  The unit tests that validate the functionality of the implementation
  can be found in the kernel/tests/vmm subtree of the source tree of
  Kalisto 0.9.11. The infrastructure for running and evaluating the
  tests can be executed using the tests-vmm.sh shell script. Your
  implementation must successfully pass all the supplied tests in a
  non-trivial way (i.e. the VMAs must be created if there is sufficient
  physical memory, there must be no memory leaks, the tests must finish
  in a reasonable time, etc.). Modification of the internal logic of the
  tests is forbidden.

  Note: The map2 test is multithreaded. Therefore make sure to access
  shared data structures only within critical sections guarded by the
  following calls (see kernel/mm/vmm.c) in order to disable interrupts
  (preemption) and force mutual exclusion of the threads on a single
  CPU:

    ipl_t state = query_and_disable_interrupts();
    /* Critical section */
    conditionally_enable_interrupts(state);

  The amount of configured physical memory can be randomly altered
  in order to stress-test your implementation.

  Replace the original trivial implementation in kernel/mm/vmm.c and
  kernel/mm/tlb.c with your custom implementation.

  To integrate your implementation with the tests, add necessary
  #include directives to the kernel/api.h header file to make sure
  all required declarations are available for the tests.


Recommended Literature -------------------------------------------------

[1] Operating Systems Course study material, Chapter 3. Memory
    Management, http://d3s.mff.cuni.cz/~ceres/sch/osy/text/ch03.html
[2] Wikipedia: Virtual memory,
    https://en.wikipedia.org/wiki/Virtual_memory
[3] Wikipedia: Page table,
    https://en.wikipedia.org/wiki/Page_table


More information about the NSWI004 mailing list