[OSy] Assignment of the 5th semestral task

Martin Decky decky at d3s.mff.cuni.cz
Thu Dec 29 14:27:41 CET 2016


Dear all,

please see attached the assignment of the 5th extended semestral task 
that deals with user space support.

Take your time to familiarize yourself with the topic, with the relevant 
parts of Kalisto and with the related code constructs (how to validate 
syscall arguments, how to convert user space identifiers to the pointers 
to kernel structures, etc.). This will greatly improve your chances of 
passing the labs assignment.

If you choose to implement the extended assignment, please respect the 
deadline of January 12th 2017 (meaning that you should deliver your 
implementation no later than this day).

Please send your implementation via email to the following email address:

   nswi004 at d3s.mff.cuni.cz

You should send your implementation as an archive (zip, tar.bz2, etc.) 
in the attachment of the email. Please make sure to include everything 
necessary to run your implementation, to run the required unit tests and 
to evaluate your source code.

If you should have any questions (either specific to the assignment or 
of a generic nature), please do not hesitate to contact us.


Best regards

Martin Decky
-------------- next part --------------
========================================================================
                        Operating Systems Course
                             Semestral Task

                       Topic no. 5: System Calls
                           Extended Assignment
========================================================================
Date of assignment:  December 29th 2016
Submission deadline: January 12th 2017
------------------------------------------------------------------------

  The goal of the 5th semestral task is to extend the support for
  running user space processes in Kalisto. The user space infrastructure
  comprises of routines for creating and running user space processes
  and threads and a system call interface that provides the essential
  kernel services to the user space. The task is defined by a functional
  programming interface that needs to be implemented and it is
  accompanied by a set of unit tests that the implementation must
  successfully pass.

  Use the source code of Kalisto version 0.9.11 as the basis for your
  implementation. i.e. start with the source code of Kalisto and extend
  it by the required functionality. Kalisto 0.9.11 is available for
  download from the following URL:

    http://d3s.mff.cuni.cz/osy/download/kalisto-0.9.11.tar.bz2

  If the assignment does not define an aspect of the required
  implementation specifically or precisely, then the expected behavior
  of the implementation is defined by the unit tests. If even the tests
  do not define an aspect of the required implementation specifically or
  precisely, then use your best knowledge and professional judgement to
  implement the functionality in the most reasonable way and document
  your decisions using comments in your source code.


System Call Interface --------------------------------------------------

  The system call mechanism supplies the functionality of selected
  kernel services to the user space applications. It is a function call
  interface that passes a method identification and method arguments
  from the user space to the kernel, dispatches the control flow to the
  proper handler on the kernel side of the call and facilitates the
  returning of the output arguments back to the user space. The dispatch
  is usually done via a "syscall table", i.e. a data structure (mostly
  just an array) that maps a method number to the method handler.

  The user space is usually structured into user space processes. Each
  user space process can have multiple user space threads and it is
  associated with its address space and kernel resources. The address
  space of an user space process is isolated from the address spaces of
  other user space processes, as well as from the kernel address space.
  The user space process is allowed to manipulate the kernel resources
  associated with it only indirectly via the system call interface.

  The system call mechanism should be reasonably efficient (since the
  user space might use it fairly often in certain workloads), but it
  should be also possible to extend the mechanism with new methods
  and their handlers. The system call mechanism is also usually
  encapsulated in a lightweight abstraction that allows to easily
  replace both the caller stubs and the callee skeletons. Typically,
  the routines on the caller side (stubs) are implemented as regular
  library functions that internally use one or multiple system calls.
  Likewise, the handlers on the callee side (skeletons) are implemented
  as dedicated routines that internally use one or multiple regular
  kernel functions.

  The kernel should not be used as a shared library implementing
  common operations for the user space (a regular user space shared
  library should be used for the purpose of plain code reuse). The user
  space should delegate to the kernel only those operations that are
  impossible or at least hard to implement in user space. The exact
  understanding of the "hard to implement in user space" phrase is the
  primary distinguishing factor between the monolithic and microkernel
  operating system designs.

  The reason for this design prudence is the fact that any code in
  kernel space is running with elevated (possibly unlimited) privileges
  compared the same code in user space. Thus the same bug that is
  present in the same code might have vastly larger impact when executed
  in kernel space compared to user space.

  The same reason is behind the requirement that the system call
  handlers in the kernel must be hardened against any misuse
  (intentional or random) from user space. This hardening is primarily
  based on explicit and thorough testing of the correctness, validity
  and consistency of the input arguments of the system calls and on
  avoiding all logical errors (including, but not limited to, race
  conditions) that may cause the kernel to fail while processing a
  system call.

  More technically, the kernel must always make sure that it operates
  only on the virtual memory that is properly mapped to the caller of
  the system call, i.e. memory that belongs to the address space of the
  calling user space process (and not mapped to a different process or
  to the kernel itself). The kernel must always make sure that it
  operates only on those kernel data structures that logically belong to
  the caller (and the caller has sufficient privileges to access them).
  The kernel must also make sure that no unrecoverable exception or
  error state is reached as a consequence of handling a system call.

  If the user space requests an invalid operation or supplies invalid
  arguments, it is mostly possible to just gracefully deny the offending
  operation and return an error value. In special cases when returning
  an error value does not make sense, it is possible to terminate the
  offending thread or process (while properly releasing all the kernel
  resources associated with it). However, in no case should the user
  space process be able to jeopardize the stability of other user space
  processes or the kernel itself.


General Recommendations for the Implementation -------------------------

  The following recommendations capture the usual criteria of quality of
  source code that is generally expected in critical system-level
  software. While respecting these recommendations is slightly less
  binding than the functional requirements of the assignment, severely
  violating them might still lead to rejection of your implementation.

  - Avoid using non-trivial hard-wired numerical constants in the source
    code. Use properly named reusable macros.

  - Use empty lines and block comments as visual delimiters to
    consistently demark logical parts of the source code.

  - Use consistent horizontal spacing to improve the readability and
    comprehension of the source code. This includes both spaces in
    expressions and proper indentation.

  - Avoid creating overly long functions, functions with an excessive
    level of block nesting and functions that implement an excessive
    number of distinct logical operations. The function name should
    adequately capture the summary of the function logical operation.

  - Avoid creating source files that mix logically unrelated concerns or
    functionality. Avoid placing source files into unrelated
    directories.

  - Use properly named macros or inline functions as replacements for
    complex logical and arithmetic expressions.

  - Use a consistent coding style (similar constructs should use similar
    means). Match the coding style of the new code with the coding style
    of the existing code.

  - Use English for identifiers and comments. Use a consistent way of
    naming things and a consistent way of commenting.


Task Assignment --------------------------------------------------------

  After the Kalisto kernel initializes, it executes a singular user
  space process whose binary image is preloaded to the physical memory.
  This user space process is using the user space infrastructure
  implemented in the statically linked run-time library librt.a.

  The kernel switches the CPU to user mode and instructs it to start
  executing the user space entry point that is implemented by the
  librt.a library. The entry point sets up the run-time environment,
  calls the library initialization routines and later calls the main()
  function of the process. After the main() function exits, the librt.a
  library finally tears down the run-time environment and asks the
  kernel to terminate the user space process.

  Implement or extend the implementation of the librt.a functions listed
  below. Consider carefully which functions should make use of system
  calls (e.g. putc()) and which functions should be mostly or completely
  implemented without the direct use of any system calls in their own
  body (e.g. printf(), gets(), malloc(), free()). There are some
  functions (e.g. puts()) where both variants are essentially
  permissible, since they both have their specific advantages and
  disadvantages.

  If a system call passes a pointer to the kernel, the kernel is
  required to validate whether the pointer actually points to properly
  mapped user space memory of the calling process. Accessing the user
  memory from the kernel should never cause a memory access exception,
  nor should the kernel inadvertently access any kernel memory or the
  memory of a different address space.

  Other system call arguments need to be validated as well. Invalid
  arguments should be indicated by error return values of the system
  call or by terminating the thread/process. However, when a thread or a
  process is terminated, all its resources must be properly released.

  * size_t putc(const char chr)

    This function prints the character @chr to the dprinter device.
    The return value is the number of characters printed.

  * size_t puts(const char *str)

    This function prints the NULL-terminated string @str to the dprinter
    device. The return value is the number of characters printed.

  * size_t printf(const char *format, ...)

    This function prints a formatted NULL-terminated string @format to
    the dprinter device. The formatted string should support a subset of
    the common libc formatting directives, namely characters (%c),
    strings (%s), integers in decimal (%d, %u, %i) and hexadecimal (%x)
    notations, pointers (%p). It is not necessary to implement the
    precision and width modifiers. The function returns the number of
    characters printed.

  * char getc(void)

    This function reads and returns a character from the dkeyboard
    device. If the keyboard buffer is empty, then the calling thread
    is blocked and waits for a key press. Active polling should be
    avoided.

  * ssize_t gets(char *str, const size_t len)

    This function reads at most @len - 1 characters from the dkeyboard
    device and stores them into the @str buffer (which is required to be
    able to store @len characters). The reading is finished when the
    last character read (and stored) is '\n' or when the limit has been
    reached. The terminating NULL character ('\0') is always stored into
    the buffer. If the argument @len is == 0, then the function returns
    EINVAL. Otherwise the function returns the number of characters read
    and stored (not including the terminating '\0'). Active polling
    should be avoided.

  * void exit(int retval)

    This function sets the exit code of the current user space process
    to @retval and terminates the main thread of the current user space
    process. This should cause the entire user space process (including
    its other threads) to finish executing. All resources (both in user
    space and in the kernel) that belong to the process have to be
    released. The resources need to be released in a way to avoid any
    deadlocks.

    If there are no more user space processes running in the system, the
    kernel shuts itself down, too.

  The following functions provide the functionality of the user space
  heap allocator. The heap allocator uses dedicated virtual memory areas
  as a backing store for the heap. The virtual memory areas are
  manipulated dynamically to satisfy the current memory requirements.

  Since the Kalisto kernel has limited support for manipulating existing
  virtual memory areas, the heap allocator should make use of fixed-size
  virtual memory areas that it asks the kernel to map and unmap (i.e.
  create and release).

  It is possible to port and reuse the existing Kalisto kernel heap
  allocator to user space, since the basic functionality is the same.
  However, it is essential that the kernel and user space heap
  allocators stay independent -- the user space heap allocator should
  implement its business logic in user space.

  * void *malloc(const size_t size)

    This function allocates a block of memory from the user space heap.
    The size of the allocated block should be always at least @size
    bytes, but the size of the block might be slightly increased if
    necessary (e.g. due to the alignment of the bookkeeping data
    structures and due to other necessary overhead). The starting
    address of the allocated block should be aligned on a 4-byte
    boundary.

    If there is no free block of a sufficient size in the currently
    managed heap(s), the heap allocator should request an additional
    virtual memory area from the kernel in order to create an additional
    heap. Only if the kernel denies such request, then the heap
    allocation fails.

    If the allocation fails, then the function returns NULL. Otherwise
    the function returns the starting address of the allocated block of
    memory.

  * void free(const void *ptr)

    This function releases a block of memory starting at the address
    @ptr on the user space heap that has been previously allocated by
    calling malloc(). If the @ptr argument does not point to the
    beginning of a valid allocated memory block, then the function might
    cause a run-time fault (e.g. calling exit()), but this behavior is
    not mandatory (it is the responsibility of the caller to avoid heap
    corruption).

    After releasing the block, this function should also optimize the
    heap to minimize the external fragmentation of the heap (if
    possible) and release entirely unused heaps by unmapping the unused
    virtual memory area.


Unit Tests -------------------------------------------------------------

  The unit tests that validate the functionality of the implementation
  can be found in the user/tests/basic subtree of the source tree of
  Kalisto 0.9.11. The infrastructure for running and evaluating the
  tests can be executed using the tests-user.sh shell script. Your
  implementation must successfully pass all the supplied tests in a
  non-trivial way (i.e. the implementation must not use active polling,
  the heap allocator must actually allocate some memory, it must not
  leak memory, the tests must finish in a reasonable time, etc.).
  Modification of the internal logic of the tests is forbidden.

  Delete the original binary implementation of the input/output routines
  (user/librt/stdio.obj) from the source tree of Kalisto to make sure
  that the tests validate your original implementation.

  To integrate your implementation with the tests, add necessary
  #include directives to the user/librt/librt.h header file to make sure
  all required declarations are available for the tests.


Recommended Literature -------------------------------------------------

[1] Operating Systems Course study material, Chapter 2. Process
    Management, http://d3s.mff.cuni.cz/~ceres/sch/osy/text/ch02.html
[2] Wikipedia: User space,
    https://en.wikipedia.org/wiki/User_space


More information about the NSWI004 mailing list