[OSy] Assignment of the 5th semestral task
Martin Decky
decky at d3s.mff.cuni.cz
Thu Dec 29 14:27:41 CET 2016
Dear all,
please see attached the assignment of the 5th extended semestral task
that deals with user space support.
Take your time to familiarize yourself with the topic, with the relevant
parts of Kalisto and with the related code constructs (how to validate
syscall arguments, how to convert user space identifiers to the pointers
to kernel structures, etc.). This will greatly improve your chances of
passing the labs assignment.
If you choose to implement the extended assignment, please respect the
deadline of January 12th 2017 (meaning that you should deliver your
implementation no later than this day).
Please send your implementation via email to the following email address:
nswi004 at d3s.mff.cuni.cz
You should send your implementation as an archive (zip, tar.bz2, etc.)
in the attachment of the email. Please make sure to include everything
necessary to run your implementation, to run the required unit tests and
to evaluate your source code.
If you should have any questions (either specific to the assignment or
of a generic nature), please do not hesitate to contact us.
Best regards
Martin Decky
-------------- next part --------------
========================================================================
Operating Systems Course
Semestral Task
Topic no. 5: System Calls
Extended Assignment
========================================================================
Date of assignment: December 29th 2016
Submission deadline: January 12th 2017
------------------------------------------------------------------------
The goal of the 5th semestral task is to extend the support for
running user space processes in Kalisto. The user space infrastructure
comprises of routines for creating and running user space processes
and threads and a system call interface that provides the essential
kernel services to the user space. The task is defined by a functional
programming interface that needs to be implemented and it is
accompanied by a set of unit tests that the implementation must
successfully pass.
Use the source code of Kalisto version 0.9.11 as the basis for your
implementation. i.e. start with the source code of Kalisto and extend
it by the required functionality. Kalisto 0.9.11 is available for
download from the following URL:
http://d3s.mff.cuni.cz/osy/download/kalisto-0.9.11.tar.bz2
If the assignment does not define an aspect of the required
implementation specifically or precisely, then the expected behavior
of the implementation is defined by the unit tests. If even the tests
do not define an aspect of the required implementation specifically or
precisely, then use your best knowledge and professional judgement to
implement the functionality in the most reasonable way and document
your decisions using comments in your source code.
System Call Interface --------------------------------------------------
The system call mechanism supplies the functionality of selected
kernel services to the user space applications. It is a function call
interface that passes a method identification and method arguments
from the user space to the kernel, dispatches the control flow to the
proper handler on the kernel side of the call and facilitates the
returning of the output arguments back to the user space. The dispatch
is usually done via a "syscall table", i.e. a data structure (mostly
just an array) that maps a method number to the method handler.
The user space is usually structured into user space processes. Each
user space process can have multiple user space threads and it is
associated with its address space and kernel resources. The address
space of an user space process is isolated from the address spaces of
other user space processes, as well as from the kernel address space.
The user space process is allowed to manipulate the kernel resources
associated with it only indirectly via the system call interface.
The system call mechanism should be reasonably efficient (since the
user space might use it fairly often in certain workloads), but it
should be also possible to extend the mechanism with new methods
and their handlers. The system call mechanism is also usually
encapsulated in a lightweight abstraction that allows to easily
replace both the caller stubs and the callee skeletons. Typically,
the routines on the caller side (stubs) are implemented as regular
library functions that internally use one or multiple system calls.
Likewise, the handlers on the callee side (skeletons) are implemented
as dedicated routines that internally use one or multiple regular
kernel functions.
The kernel should not be used as a shared library implementing
common operations for the user space (a regular user space shared
library should be used for the purpose of plain code reuse). The user
space should delegate to the kernel only those operations that are
impossible or at least hard to implement in user space. The exact
understanding of the "hard to implement in user space" phrase is the
primary distinguishing factor between the monolithic and microkernel
operating system designs.
The reason for this design prudence is the fact that any code in
kernel space is running with elevated (possibly unlimited) privileges
compared the same code in user space. Thus the same bug that is
present in the same code might have vastly larger impact when executed
in kernel space compared to user space.
The same reason is behind the requirement that the system call
handlers in the kernel must be hardened against any misuse
(intentional or random) from user space. This hardening is primarily
based on explicit and thorough testing of the correctness, validity
and consistency of the input arguments of the system calls and on
avoiding all logical errors (including, but not limited to, race
conditions) that may cause the kernel to fail while processing a
system call.
More technically, the kernel must always make sure that it operates
only on the virtual memory that is properly mapped to the caller of
the system call, i.e. memory that belongs to the address space of the
calling user space process (and not mapped to a different process or
to the kernel itself). The kernel must always make sure that it
operates only on those kernel data structures that logically belong to
the caller (and the caller has sufficient privileges to access them).
The kernel must also make sure that no unrecoverable exception or
error state is reached as a consequence of handling a system call.
If the user space requests an invalid operation or supplies invalid
arguments, it is mostly possible to just gracefully deny the offending
operation and return an error value. In special cases when returning
an error value does not make sense, it is possible to terminate the
offending thread or process (while properly releasing all the kernel
resources associated with it). However, in no case should the user
space process be able to jeopardize the stability of other user space
processes or the kernel itself.
General Recommendations for the Implementation -------------------------
The following recommendations capture the usual criteria of quality of
source code that is generally expected in critical system-level
software. While respecting these recommendations is slightly less
binding than the functional requirements of the assignment, severely
violating them might still lead to rejection of your implementation.
- Avoid using non-trivial hard-wired numerical constants in the source
code. Use properly named reusable macros.
- Use empty lines and block comments as visual delimiters to
consistently demark logical parts of the source code.
- Use consistent horizontal spacing to improve the readability and
comprehension of the source code. This includes both spaces in
expressions and proper indentation.
- Avoid creating overly long functions, functions with an excessive
level of block nesting and functions that implement an excessive
number of distinct logical operations. The function name should
adequately capture the summary of the function logical operation.
- Avoid creating source files that mix logically unrelated concerns or
functionality. Avoid placing source files into unrelated
directories.
- Use properly named macros or inline functions as replacements for
complex logical and arithmetic expressions.
- Use a consistent coding style (similar constructs should use similar
means). Match the coding style of the new code with the coding style
of the existing code.
- Use English for identifiers and comments. Use a consistent way of
naming things and a consistent way of commenting.
Task Assignment --------------------------------------------------------
After the Kalisto kernel initializes, it executes a singular user
space process whose binary image is preloaded to the physical memory.
This user space process is using the user space infrastructure
implemented in the statically linked run-time library librt.a.
The kernel switches the CPU to user mode and instructs it to start
executing the user space entry point that is implemented by the
librt.a library. The entry point sets up the run-time environment,
calls the library initialization routines and later calls the main()
function of the process. After the main() function exits, the librt.a
library finally tears down the run-time environment and asks the
kernel to terminate the user space process.
Implement or extend the implementation of the librt.a functions listed
below. Consider carefully which functions should make use of system
calls (e.g. putc()) and which functions should be mostly or completely
implemented without the direct use of any system calls in their own
body (e.g. printf(), gets(), malloc(), free()). There are some
functions (e.g. puts()) where both variants are essentially
permissible, since they both have their specific advantages and
disadvantages.
If a system call passes a pointer to the kernel, the kernel is
required to validate whether the pointer actually points to properly
mapped user space memory of the calling process. Accessing the user
memory from the kernel should never cause a memory access exception,
nor should the kernel inadvertently access any kernel memory or the
memory of a different address space.
Other system call arguments need to be validated as well. Invalid
arguments should be indicated by error return values of the system
call or by terminating the thread/process. However, when a thread or a
process is terminated, all its resources must be properly released.
* size_t putc(const char chr)
This function prints the character @chr to the dprinter device.
The return value is the number of characters printed.
* size_t puts(const char *str)
This function prints the NULL-terminated string @str to the dprinter
device. The return value is the number of characters printed.
* size_t printf(const char *format, ...)
This function prints a formatted NULL-terminated string @format to
the dprinter device. The formatted string should support a subset of
the common libc formatting directives, namely characters (%c),
strings (%s), integers in decimal (%d, %u, %i) and hexadecimal (%x)
notations, pointers (%p). It is not necessary to implement the
precision and width modifiers. The function returns the number of
characters printed.
* char getc(void)
This function reads and returns a character from the dkeyboard
device. If the keyboard buffer is empty, then the calling thread
is blocked and waits for a key press. Active polling should be
avoided.
* ssize_t gets(char *str, const size_t len)
This function reads at most @len - 1 characters from the dkeyboard
device and stores them into the @str buffer (which is required to be
able to store @len characters). The reading is finished when the
last character read (and stored) is '\n' or when the limit has been
reached. The terminating NULL character ('\0') is always stored into
the buffer. If the argument @len is == 0, then the function returns
EINVAL. Otherwise the function returns the number of characters read
and stored (not including the terminating '\0'). Active polling
should be avoided.
* void exit(int retval)
This function sets the exit code of the current user space process
to @retval and terminates the main thread of the current user space
process. This should cause the entire user space process (including
its other threads) to finish executing. All resources (both in user
space and in the kernel) that belong to the process have to be
released. The resources need to be released in a way to avoid any
deadlocks.
If there are no more user space processes running in the system, the
kernel shuts itself down, too.
The following functions provide the functionality of the user space
heap allocator. The heap allocator uses dedicated virtual memory areas
as a backing store for the heap. The virtual memory areas are
manipulated dynamically to satisfy the current memory requirements.
Since the Kalisto kernel has limited support for manipulating existing
virtual memory areas, the heap allocator should make use of fixed-size
virtual memory areas that it asks the kernel to map and unmap (i.e.
create and release).
It is possible to port and reuse the existing Kalisto kernel heap
allocator to user space, since the basic functionality is the same.
However, it is essential that the kernel and user space heap
allocators stay independent -- the user space heap allocator should
implement its business logic in user space.
* void *malloc(const size_t size)
This function allocates a block of memory from the user space heap.
The size of the allocated block should be always at least @size
bytes, but the size of the block might be slightly increased if
necessary (e.g. due to the alignment of the bookkeeping data
structures and due to other necessary overhead). The starting
address of the allocated block should be aligned on a 4-byte
boundary.
If there is no free block of a sufficient size in the currently
managed heap(s), the heap allocator should request an additional
virtual memory area from the kernel in order to create an additional
heap. Only if the kernel denies such request, then the heap
allocation fails.
If the allocation fails, then the function returns NULL. Otherwise
the function returns the starting address of the allocated block of
memory.
* void free(const void *ptr)
This function releases a block of memory starting at the address
@ptr on the user space heap that has been previously allocated by
calling malloc(). If the @ptr argument does not point to the
beginning of a valid allocated memory block, then the function might
cause a run-time fault (e.g. calling exit()), but this behavior is
not mandatory (it is the responsibility of the caller to avoid heap
corruption).
After releasing the block, this function should also optimize the
heap to minimize the external fragmentation of the heap (if
possible) and release entirely unused heaps by unmapping the unused
virtual memory area.
Unit Tests -------------------------------------------------------------
The unit tests that validate the functionality of the implementation
can be found in the user/tests/basic subtree of the source tree of
Kalisto 0.9.11. The infrastructure for running and evaluating the
tests can be executed using the tests-user.sh shell script. Your
implementation must successfully pass all the supplied tests in a
non-trivial way (i.e. the implementation must not use active polling,
the heap allocator must actually allocate some memory, it must not
leak memory, the tests must finish in a reasonable time, etc.).
Modification of the internal logic of the tests is forbidden.
Delete the original binary implementation of the input/output routines
(user/librt/stdio.obj) from the source tree of Kalisto to make sure
that the tests validate your original implementation.
To integrate your implementation with the tests, add necessary
#include directives to the user/librt/librt.h header file to make sure
all required declarations are available for the tests.
Recommended Literature -------------------------------------------------
[1] Operating Systems Course study material, Chapter 2. Process
Management, http://d3s.mff.cuni.cz/~ceres/sch/osy/text/ch02.html
[2] Wikipedia: User space,
https://en.wikipedia.org/wiki/User_space
More information about the NSWI004
mailing list