Code style | NSWI200

Throughout this course we will enforce a common coding style. We are perfectly aware that every developer has different preferences, but it is a common practice on team projects to require that all team members adhere to a single style.

In this particular case, because you will be integrating your solution into provided skeleton code, this prevents harboring two (or more) different coding styles in a single code base, which is generally considered a bad practice. Because you will be required to merge changes to the skeleton code from the upstream repository in subsequent milestones, reformatting the skeleton code to suit your preferences is not really an option.

This page summarizes the style requirements for your code. We try to demonstrate various issues on actual code examples, always showing how the code should be transformed into code that is more readable and easier to maintain.

We have reformatted some of the code examples to better fit on one page but we try to use descriptive identifiers and proper naming in all good code examples.

Strive for consistency
Keep it simple, …
Avoid code duplication
Use proper abstraction layers
Check function arguments and “impossible” situations
Avoid resource leaks
Protect shared structures (but not local state)

Strive for consistency

Is your source code formatted the same in all your source files?

There are special targets in your Makefiles to check that your code is well formatted. Use them before committing or setup your editor to reformat the code for you. (It is fine, everyone sometimes forgets some spaces etc.)

Please, do not change the style configuration you were provided with.

Do you use the same naming conventions for all of your identifiers (variables, functions, types)?

Are type names terminated with _t?
Are initialization functions named as _init?

The code should not be a Christmas present full of surprises. Make it easy for any readers to digest the code without being distracted by irrelevancies. And the confused reader will be probably you anyway (in a few weeks).

Do not comment things that can be expressed in code. Use your IDEs and text editors to rename all instances of an identifier if you come up with a better name. Such renames are usually just a few keystrokes away and they will save a lot of time in the long run.

On the other hand, do not overdo things. If identifier is local to a certain context, short name is usually sufficient. That is why i is perfectly fine for a two line loop.

// first available block
static heap_block_t* heap_fst;
// last modified block
static heap_block_t* heap_ptr;

static heap_block_t* first_available_block;
static heap_block_t* last_modified_block;

Keep it simple, …

Are conditions for branching and looping easy to read? Are complex expressions split into multiple expressions, with intermediate results in properly named variables?

Replacing conditions with properly named variables simplifies reading. The reader sees the intent, not the implementation.

Do not force the reader to think simultaneously about the problem and about language rules at the same time: not everyone knows the operator priority by heart. Using operators that have close (or even same) priorities without explicit parenthesis or splitting into multiple variables is discouraged. Note that compiler will optimize the intermediate variables away as it sees fit regardless of the original code in most cases.

putchar(*--current);

current--;
putchar(*current);

Avoid code duplication

Is duplicate code extracted into a single function?

Before you hit Ctrl-C and Ctrl-V, think whether creating a function would not create more maintainable code.

The example below illustrates how splitting the operation into two parts can reduce code duplicity and also improve readability of the sources at the same time.

int freq_add(freq_t* freq, const char* str) {
   list_foreach(/* ... */, it) {
        if (str_equal(it->word, str)) {
            it->n++;
            return 0;
        }
    }
    // inserting a new item into the list
}
int freq_get(freq_t* freq, const char* str) {
    list_foreach(/* ... */, it) {
        if (str_equal(it->word, str)) {
            return it->n;
        }
    }
    return 0;
}

int freq_add(freq_t* freq, const char* str) {
    item_t* item = freq_find(freq, str);
    if (item != NULL) {
        item->n++;
    } else {
        // insert a new item into the list
    }
}
int freq_get(freq_t* freq, const char* str) {
    item_t* item = freq_find(freq, str);
    return (item != NULL) ? item->n : 0;
}
item_t* freq_find(freq_t* freq, const char* str) {
    list_foreach(/* ... */, it) {
        if (str_equal(it->word, str)) {
            return it;
        }
    }
    return NULL;
}

Do you use symbollic names for constant literals?

Symbollic names convey additional meaning, especially when it comes to numbers. In general, numbers other than 0 (for initialization), 1 (for incrementing counters) and 2 (for halving) should be only used through their symbollic name.

In C, symbollic names for constant literals are usually defined in one place using preprocessor macros or (in case of integers) as enum members.

char dec2hexchar(int digit, bool capitals) {
    if (digit < 10) {
        return digit + 48;
    }
    if (digit < 16) {
        return digit + (capitals ? 55 : 87);
    }
    return 0;
}

char get_hex_digit(int digit, bool capitals) {
    const char *alphabet = capitals
        ? "0123456789ABCDEF"
        : "0123456789abcdef";
    assert(digit >= 0 && digit < 16);
    return alphabet[digit];
}

Use proper abstraction layers

Does implementation of each function use the same abstraction level through the whole body?

This is arguably one of the most difficult aspects. Doing this right requires patience and experience.

if (next_block != NULL) {
    new_block->link.next = &next_block->link;
    new_block->link.prev = next_block->link.prev;
    next_block->link.prev->next = &new_block->link;
    next_block->link.prev = &new_block->link;
} else {
    list_append(&free_blocks, &new_block->link);
}

if (next_block != NULL) {
    list_append(&next_block, &new_block->link);
} else {
    list_append(&free_blocks, &new_block->link);
}

Sometime the difference might look negligible but remember that anything that helps the reader focus on the important parts is worth the effort. Especially in a team.

size_t rounded_size =
    size % 4 == 0
    ? size
    : size + 4 - (size % 4);

size_t size_aligned = align_up(size, 4);

We do not require any low-level optimizations that are present in production systems. If you want to try them, feel free to do so but even with these one can maintain worse and better readability.

typedef struct {
    // Size is always even:
    // Lowest bit used as a boolean
    // (0 -> free block, 1 -> used block)
    size_t size;
} block_t;
...
while ((block->size & 1) || block->size < size) {
    ...
}

...
while (!block_is_free(block)
        || block_get_size(block) < size) {
    ...
}

As a general advice: do not be afraid to create functions even if you call them from one place only. The function allows you to capture the intent of a block of code.

thread->registers = registers;
thread->entry_func = entry;
thread->data = data;
for (int i = 0; name[i] != '\0'; i++) {
    thread->name[i] = name[i];
}

thread->registers = registers;
thread->entry_func = entry;
thread->data = data;
str_copy(thread->name, name, THREAD_NAME_SIZE);

Check function arguments and “impossible” situations

Do your functions check that their arguments are valid (and that pointers are not NULL)? Does your code report running into unexpected situations?

Module functions that are visible to other modules should check their arguments, because they represent module entry points and as such they are responsible for enforcing contractual requirements. In addition, the code calling such functions is either not available or not immediately visible.

Internal modules calling each other within a single project can usually assume that the caller is well-behaved, so their checks are mostly intended to catch programmer errors and omissions that should never make it to production. Consequently, these checks can usually afford to stop code execution (crash the system) if an error is detected and should be implemented using assertions. Note that standard C assert macro can be coerced to provide a helpful message using a simple trick.

if (name == NULL)
    return -1;
...

assert(name != NULL && "thread_create: name cannot be NULL");

In contrast, library functions usually cannot stop program execution (crash the system), which is why they need to indicate the problem to the caller using error codes or exceptions.

Functions that are private to a module (static fuctions in C, which often do the actual work) should not need to check their arguments, because passing valid arguments is the responsibility of the “nearby” (higher-level) caller. Arguments passed from callers outside the module should have been already checked in the entry point function.

It is also much easier to use assertions than returning error codes when encountering an unexpected situation or an invalid system state.

thread_t thread = scheduler_get_next_thread();
if (thread != NULL) {
    // do something with thread
}
...

thread_t thread = scheduler_get_next_thread();
assert(thread != NULL && "scheduler: nothing to schedule")
// do something with thread

Avoid resource leaks

Do you release previously allocated resources in early returns from functions?

C does not have a garbage collector that would free unused memory for you. Memory leaks often happen when a function fails in the middle of resource allocation and previous (succesfully allocated) resources are not returned.

In C you will mostly deal with memory leaks. But the rule applies to any resource, ranging from open file to a database cursor.

void* stack = kmalloc(THREAD_STACK_SIZE);
thread_t* new_thread = kmalloc(sizeof(thread_t));
if (stack == NULL || new_thread == NULL) {
    *thread_out = NULL;
    return ENOMEM;
}
...

void* stack = kmalloc(THREAD_STACK_SIZE);
if (stack == NULL) {
    return ENOMEM;
}
thread_t* new_thread = kmalloc(sizeof(thread_t));
if (new_thread == NULL) {
    kfree(stack);
    return ENOMEM;
}
...

Protect shared structures (but not local state)

Is any access to shared data protected by mutex or is inside a section with disabled interrupts?

Once your kernel runs with interrupts enabled (and preemptive scheduling), a thread can be rescheduled virtually at any time. Thus any code that might leave any data in inconsistent state (when rescheduled) must either use locking or must be inside a section where interrupts are disabled (and hence rescheduling will not happen).

void *allocate(size_t size) {
    uintptr_t heap_next_free = heap_first_free + size;
    if (heap_next_free > heap_end) {
        return NULL;
    }
    void* result = (void*)heap_first_free;
    heap_first_free = heap_next_free;
    return result;
}

void *allocate(size_t size) {
    bool ipl = interrupts_disable();
    uintptr_t heap_next_free = heap_first_free + size;
    if (heap_next_free > heap_end) {
        interrupts_restore(ipl);
        return NULL;
    }
    void* result = (void*)heap_first_free;
    heap_first_free = heap_next_free;
    interrupts_restore(ipl);
    return result;
}

Are mutexes and disable-interrupt-blocks as short as possible?

While protecting shared structure is required, it is not necessary to guard sections where interrupts are harmless. Preventing rescheduling reduces throughput and is generally a bad practice (and it can also raise questions whether the author understands why the protection is needed).

errno_t thread_create(...) {
    bool ipl = interrupts_disable();
    thread_t* thread = kmalloc(...);
    if (thread == NULL) {
        interrupts_restore(ipl);
        return ENOMEM;
    }
    initialize_thread_context(&thread->context);
    scheduler_add_ready_thread(thread);
    list_append(&all_threads, &thread->link);
    interrupts_restore(ipl);
    return EOK;
}

errno_t thread_create(...) {
    thread_t* thread = kmalloc(...);
    if (thread == NULL) {
        return ENOMEM;
    }
    initialize_thread_context(&thread->context);
    // Add to list of all threads and
    // to scheduler atomically.
    bool ipl = interrupts_disable();
    scheduler_add_ready_thread(thread);
    list_append(&all_threads, &thread->link);
    interrupts_restore(ipl);
    return EOK;
}