[OSy] Assignment of the 2nd semestral task
Martin Decky
decky at d3s.mff.cuni.cz
Thu Nov 10 18:39:25 CET 2016
Dear colleagues,
please see attached the assignment of the 2nd extended semestral task. I
strongly encourage you to try to comprehend it and also to study the
relevant parts of Kalisto -- the next labs assignments will be again
related to the same topic and your working knowledge of the principles
and mechanisms will greatly improve your chances of succeeding in the labs.
If you choose to implement the extended assignment, please respect the
deadline of November 24th 2016 (meaning that you should deliver your
implementation no later than this day).
Please send your implementation via email to the following email address:
nswi004 at d3s.mff.cuni.cz
You should send your implementation as an archive (zip, tar.bz2, etc.)
in the attachment of the email. Please make sure to include everything
necessary to run your implementation, to run the required unit tests and
to evaluate your source code.
If you should have any questions (either specific to the assignment or
of a generic nature), please do not hesitate to send your query to this
mailing list.
Best regards
Martin Decky
-------------- next part --------------
========================================================================
Operating Systems Course
Semestral Task
Topic no. 2: Virtual Memory
Extended Assignment
========================================================================
Date of assignment: November 10th 2016
Submission deadline: November 24th 2016
------------------------------------------------------------------------
The goal of the 2nd semestral task is to implement the functionality
of virtual to physical memory mapping. The task is defined by a
functional programming interface that needs to be implemented and it
is accompanied by a set of unit tests that the implementation must
successfully pass.
Use the source code of Kalisto version 0.9.10 as the basis for your
implementation. i.e. start with the source code of Kalisto and extend
it by the required functionality. Kalisto 0.9.10 is available for
download from the following URL:
http://d3s.mff.cuni.cz/osy/download/kalisto-0.9.10.tar.bz2
If the assignment does not define an aspect of the required
implementation specifically or precisely, then the expected behavior
of the implementation is defined by the unit tests. If even the tests
do not define an aspect of the required implementation specifically or
precisely, then use your best knowledge and professional judgement to
implement the functionality in the most reasonable way and document
your decisions using comments in your source code.
Virtual Address Space of the MIPS R4000 CPU ----------------------------
Depending on the current privilege mode of the MIPS R4000 CPU (user,
supervisor, kernel), the 32-bit virtual address space is divided into
several possible sets of segments. These segments are hard-wired on
the MIPS R4000 CPU and their general properties cannot be altered by
software. The virtual addresses are mapped to physical addresses
either in a predefined manner (such is the case of the KSEG0 and KSEG1
segments; see below) or the mapping is configurable by the operating
system by means of altering the Translation Lookaside Buffer (TLB).
In user mode, only the USEG segment is available, with the virtual
address range of [0x00000000, 0x800000000). The USEG virtual addresses
are mapped to physical addresses via the TLB. Every memory access
outside the USEG memory range in user mode causes the Address Error
exception.
In supervisor mode, the following two segments are available:
* SUSEG, with the virtual address range of [0x00000000, 0x80000000)
* SSEG, with the virtual address range of [0xC0000000, 0xE0000000)
The SUSEG and SSEG virtual addresses are mapped to physical addresses
via the TLB. The SUSEG segment is an analogy of the USEG segment in
user mode. Every memory access outside the SUSEG and SSEG memory
ranges in supervisor mode causes the Address Error exception.
Finally, in kernel mode, the following five segments are available:
* KUSEG, with the virtual address range of [0x00000000, 0x80000000)
* KSEG0, with the virtual address range of [0x80000000, 0xA0000000)
* KSEG1, with the virtual address range of [0xA0000000, 0xC0000000)
* KSSEG, with the virtual address range of [0xC0000000, 0xE0000000)
* KSEG3, with the virtual address range of [0xE0000000, 0xFFFFFFFF]
The KSSEG segment is alternatively labeled as KSEG2.
The virtual addresses from the KUSEG, KSSEG and KSEG3 segments are
mapped to physical addresses via the TLB. The KUSEG segment is an
analogy of the USEG segment in user mode and the KSSEG is an analogy
of the SSEG segment in supervisor mode.
The virtual addresses from the KSEG0 and KSEG1 segments are mapped to
physical addresses using a hard-wired identity mapping by subtracting
the base address of the segment, i.e. both the virtual addresses
0x80000000 and 0xA0000000 are mapped to the same physical address
0x00000000. In other words, the virtual address ranges of the KSEG0
and KSEG1 segments are mapped to the physical address range of
[0x00000000, 0x20000000).
On the actual MIPS R4000 CPU the memory accesses in the KSEG1 segment
bypass the CPU cache.
Virtual Address Space as Managed by the Operating System ---------------
The virtual address space (also often called virtual memory map [VMM]
or simply address space [AS]) is a data structure managed by the
operating system that stores information about areas of the virtual
memory (whether an area is in use, how are the virtual addresses of an
area mapped to physical addresses, etc.).
The virtual address space is divided into areas (often called virtual
memory areas [VMA]). These areas do not cross the boundaries of the
MIPS R4000 segments, because the way virtual pages map to physical
frames differs in each segment.
Each virtual memory area comprises of a continuous sequence of virtual
pages. Each such virtual page is mapped to a physical frame. Depending
on the segment, this mapping is either hard-wired (in case of the
KSEG0 and KSEG1 segments) or it is defined by the operating system by
changing the configuration of the TLB (all other segments). The
mapping defined by the TLB does not necessarily have to map the
continuous sequence of virtual pages to a continuous sequence of
physical frames (the mapping does not even have to be injective).
The virtual address space is one particular type of resource that is
usually shared by all threads of any given user space process. On the
other hand, each user space process uses its own independent virtual
address space. This is the primary form of isolation of each user
space process from the other processes.
General Recommendations for the Implementation -------------------------
The following recommendations capture the usual criteria of quality of
source code that is generally expected in critical system-level
software. While respecting these recommendations is slightly less
binding than the functional requirements of the assignment, severely
violating them might still lead to rejection of your implementation.
- Avoid using non-trivial hard-wired numerical constants in the source
code. Use properly named reusable macros.
- Use empty lines and block comments as visual delimiters to
consistently demark logical parts of the source code.
- Use consistent horizontal spacing to improve the readability and
comprehension of the source code. This includes both spaces in
expressions and proper indentation.
- Avoid creating overly long functions, functions with an excessive
level of block nesting and functions that implement an excessive
number of distinct logical operations. The function name should
adequately capture the summary of the function logical operation.
- Avoid creating source files that mix logically unrelated concerns or
functionality. Avoid placing source files into unrelated
directories.
- Use properly named macros or inline functions as replacements for
complex logical and arithmetic expressions.
- Use a consistent coding style (similar constructs should use similar
means). Match the coding style of the new code with the coding style
of the existing code.
- Use English for identifiers and comments. Use a consistent way of
naming things and a consistent way of commenting.
Task Assignment --------------------------------------------------------
The management of virtual address spaces comprises primarily of the
means to create, destroy and potentially manipulate the virtual memory
areas within the context of the virtual address space of the currently
running process (thread). The virtual memory areas cannot overlap,
their starting virtual address and size is aligned on the smallest
possible page size (i.e. 4 KiB in case of MIPS R4000).
There is no constant upper bound on the number of virtual address
spaces the kernel supports. Likewise, there is no constant upper bound
on the number of virtual memory areas within each virtual address
space the kernel supports. The number of both entities is practically
bounded only by the total amount of available resources (i.e. physical
memory).
Implement and/or use a suitable dynamic data structures for the
bookkeeping of the VMMs, VMAs and virtual address mappings. The data
structures should be mainly optimized for looking up the VMA in which
a given virtual address lies (virtual address -> VMA) and for looking
up the mapping of a given virtual address (virtual address -> physical
address).
A reasonable data structure for the virtual address -> VMA mapping is
for example a balanced search tree (there is a sample red-black tree
implementation in kernel/adt/rbtree.c) or a hash table. These data
structures typically provide an O(log n) time complexity of the lookup
(where n is the number of VMAs in the VMM).
A reasonable data structure for the virtual address -> physical
address mapping are hierarchical page tables [3] which typically
provide an O(1) time complexity of the lookup. Various combinations of
the two approaches are possible.
The virtual address space management is required to implement the
following API:
* int vmm_create(vmm_t *pvmm)
This function creates a new (empty) virtual address space (VMM) and
returns its initialized data structure in the @pvmm output argument.
The function returns EOK if the creation was successful. If there is
not enough memory to create the new VMM, then the function returns
ENOMEM.
* int vma_map(void **from, const size_t size, const vm_flags_t flags)
This function creates a new virtual memory area (VMA) within the
virtual address space of the currently running process (thread) and
creates a mapping of the continuous virtual pages to the (not
necessarily continuous) physical frames acquired from the frame
allocator.
The size (in bytes) of the created VMA is defined by the @size
argument. Depending on the bit flags in the @flags argument, the
function either looks for a suitable starting virtual address of
the VMA and returns it in the @from output argument (if @flags
contains the VF_VA_AUTO flag), or it tries to create a VMA with a
starting virtual address specified by the caller in the @from
argument (if @flags contains the VF_VA_USER flag).
Additional bit flags in the @flags argument determine the MIPS R4000
segment where the VMA should be created. The values for the
individual segments are VF_AT_KUSEG, VF_AT_KSEG0, VF_AT_KSEG1,
VF_AT_KSSEG and VF_AT_KSEG3. The creation of two VMAs in the KSEG0
and KSEG1 segments is mutually excluded if the two VMAs map to
overlapping physical addresses.
This function returns EOK if the physical memory was successfully
allocated and the VMA created. If the creation of the VMA with a
starting address specified in @from is not possible in the given
segment or if either @from or @size are not properly aligned, then
the function returns EINVAL. If there is not enough physical memory
to satisfy the request, then the function returns ENOMEM.
* int vma_unmap(const void *from)
This function disposes the virtual memory area (VMA) within the
virtual address space of the currently running process (thread)
specified by the starting virtual address @from. The VMA has to be
previously created by calling vma_map(). The function also releases
the physical memory associated with the disposed VMA.
The function returns EOK if the VMA was successfully disposed and
the physical memory was successfully released. If the @from argument
does not identify a valid VMA previously created by calling
vma_map(), then the function returns EINVAL.
A correct implementation of vma_unmap() must not only remove the VMA
from the kernel data structures, but it must also initiate the "TLB
shootdown" operation that purges all relevant mapping entries from
the TLB.
- Handling of TLB Exceptions
The MIPS R4000 CPU might assert one of the following two exceptions
while accessing a virtual memory address.
The TLB Refill exception (see wrapped_tlb_refill()) is asserted when
there is no suitable entry present in the TLB that would map the
given virtual address to a physical address. The TLB Invalid
exception (see tlb_invalid()) is asserted when there is a suitable
entry in the TLB that maps the given virtual address to a physical
address, but this entry is marked as invalid.
The goal of the exceptions handlers of both exceptions is to lookup
the physical address that corresponds to the given virtual address
and fill in the corresponding entry into the TLB. If there is no
valid mapping of the virtual address (it does not belong to any VMA
of the current VMM), then the exception handler should finish the
execution of the current (offending) thread.
Note: Marking a TLB entry as invalid is helpful for example when
implementing the copy-on-write mechanism or swapping. However, there
is no use case for this feature in Kalisto.
The TLB of the MIPS R4000 CPU can store mappings of multiple
distinct virtual address spaces at the same time. The Address Space
Identifier (ASID) is used to distinguish the individual virtual
address spaces. This assignment mandates that the ASID feature is
used to avoid unnecessary "TLB flush" operations when switching the
current address space. On the other hand, the implementation can
take the liberty of using the following simplifying assumption:
* The number of virtual address spaces present in the system is
bounded by the number of valid ASID values on the MIPS R4000 CPU.
Therefore it is not necessary to implement a Least Recently Used
(LRU) or a similar algorithm for the dynamic assignment of the
ASID values to virtual address spaces.
Unit Tests -------------------------------------------------------------
The unit tests that validate the functionality of the implementation
can be found in the kernel/tests/vmm subtree of the source tree of
Kalisto 0.9.10. The infrastructure for running and evaluating the
tests can be executed using the tests-vmm.sh shell script. Your
implementation must successfully pass all the supplied tests in a
non-trivial way (i.e. the VMAs must be created if there is sufficient
physical memory, there must be no memory leaks, the tests must finish
in a reasonable time, etc.). Modification of the internal logic of the
tests is forbidden.
Note: The map2 test is multithreaded. Therefore make sure to access
shared data structures only within critical sections guarded by the
following calls (see kernel/mm/vmm.c) in order to disable interrupts
(preemption) and force mutual exclusion of the threads on a single
CPU:
ipl_t state = query_and_disable_interrupts();
/* Critical section */
conditionally_enable_interrupts(state);
The amount of configured physical memory can be randomly altered
in order to stress-test your implementation.
Replace the original trivial implementation in kernel/mm/vmm.c and
kernel/mm/tlb.c with your custom implementation.
To integrate your implementation with the tests, add necessary
#include directives to the kernel/api.h header file to make sure
all required declarations are available for the tests.
Recommended Literature -------------------------------------------------
[1] Operating Systems Course study material, Chapter 3. Memory
Management, http://d3s.mff.cuni.cz/~ceres/sch/osy/text/ch03.html
[2] Wikipedia: Virtual memory,
https://en.wikipedia.org/wiki/Virtual_memory
[3] Wikipedia: Page table,
https://en.wikipedia.org/wiki/Page_table
More information about the NSWI004
mailing list