Multiprocessing on machines with a single processor, or uniprocessors, is based on the ability of the operating system to suspend a process by setting its state aside and later resume the process by picking its state up from where it was set aside. Processes can be suspended and resumed frequently enough to achieve an illusion of multiple processes running in parallel.
In multiprocessing terminology, the state of a process is called process context, with the act of setting the process context aside and later picking it up denoted as context switching. Note that process context is not defined to be strictly equal to the process state, but instead vaguely incorporates those parts of the process state that are most relevant to context switching.
The individual parts of the process state and the related means of context switching are discussed next.
The part of the process state that is associated with the processor consists of the processor registers accessible to the process. On most processors, this includes the general purpose registers and flags, the stack pointer (a register that stores the address of the top of the stack), and the program counter (a register that stores the address of the next instruction to be executed).
The very first step of a context switch is passing control from the executing process to the operating system. As this changes the value of the program counter, the original value of the program counter must be saved simultaneously. Typically, the processor saves the original value of the program counter on the stack of the process whose context is being saved. The operating system typically proceeds by saving the original values of the remaining registers on the same stack. Finally, the operating system switches to the stack of the process whose context will be restored and restores the original values of the registers in an inverse procedure.
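The sequence described above can be sketched as a small simulation. This is a minimal model under stated assumptions, not code from any real kernel: the `cpu` and `process` structures, the four general purpose registers, and the helper names `process_init`, `cpu_start` and `context_switch` are all hypothetical, and the processor's automatic save of the program counter is modelled as an ordinary push.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64
#define NUM_GPRS 4

/* Simulated processor state (hypothetical layout). */
struct cpu {
    uint32_t pc;             /* program counter */
    uint32_t *sp;            /* points at the top element of the current stack */
    uint32_t gpr[NUM_GPRS];  /* general purpose registers */
};

/* Per-process record: a private stack and the stack pointer saved on it. */
struct process {
    uint32_t stack[STACK_WORDS];
    uint32_t *saved_sp;
};

static void push(struct cpu *cpu, uint32_t value) { *--cpu->sp = value; }
static uint32_t pop(struct cpu *cpu) { return *cpu->sp++; }

/* Prepare a process that has never run by building the frame
   that a context switch would have left on its stack. */
void process_init(struct process *p, uint32_t entry_pc)
{
    struct cpu tmp = { .pc = 0, .sp = p->stack + STACK_WORDS, .gpr = {0} };
    push(&tmp, entry_pc);
    for (int i = 0; i < NUM_GPRS; i++)
        push(&tmp, 0);
    p->saved_sp = tmp.sp;
}

/* Start executing a process directly on its empty stack. */
void cpu_start(struct cpu *cpu, struct process *p, uint32_t entry_pc)
{
    cpu->pc = entry_pc;
    cpu->sp = p->stack + STACK_WORDS;
    for (int i = 0; i < NUM_GPRS; i++)
        cpu->gpr[i] = 0;
}

void context_switch(struct cpu *cpu, struct process *from, struct process *to)
{
    /* There must be room for the whole frame on the outgoing stack. */
    assert(cpu->sp - from->stack >= NUM_GPRS + 1);

    /* 1. The program counter goes on the stack of the outgoing process. */
    push(cpu, cpu->pc);
    /* 2. The remaining registers follow on the same stack. */
    for (int i = 0; i < NUM_GPRS; i++)
        push(cpu, cpu->gpr[i]);
    /* 3. Switch to the stack of the incoming process. */
    from->saved_sp = cpu->sp;
    cpu->sp = to->saved_sp;
    /* 4. Restore the registers in the inverse order. */
    for (int i = NUM_GPRS - 1; i >= 0; i--)
        cpu->gpr[i] = pop(cpu);
    cpu->pc = pop(cpu);
}
```

Note that the only per-process value kept outside the stack is the saved stack pointer. Everything else travels on the stack of the process it belongs to.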
When separate notions of processes and threads are considered, the processor context is typically associated with the thread, rather than the process. Exceptions to this rule include special purpose registers whose content does not concern the execution of the thread but rather the execution of the process.
Context switching and similar operations that involve saving and restoring the processor context, especially interrupt and exception handling and system calls, happen very frequently. Processors therefore often include special support for these operations.
The Intel 80x86 line of processors provides multiple mechanisms to support context switching. The simplest of those is the ability to switch to a different stack when switching to a different privilege level. This mechanism makes it possible to switch the processor context without using the stack of the executing process. Although not essential, this ability can be useful when the stack of the executing process must not be used, for example to avoid overflowing it or to mask debugging.
Another context switching support mechanism is the ability to save and restore the entire processor context of the executing process to and from the TSS (Task State Segment) structure as a part of a control transfer. One issue associated with this ability is efficiency. On Intel 80486, a control transfer using the CALL instruction with TSS takes 170 to 180 clock cycles. A control transfer using the CALL instruction without TSS takes only 20 clock cycles, and the processor context can then be switched more quickly using common instructions. Specifically, PUSHAD saves all general purpose registers in 11 clock cycles, PUSH saves each of the six segment registers in 3 clock cycles, and PUSHF saves the flags in 4 clock cycles. Inversely, POPF restores the flags in 9 clock cycles, POP restores each of the six segment registers in 3 clock cycles, and POPAD restores all general purpose registers in 9 clock cycles.
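To make the comparison concrete, the cycle counts quoted above can simply be added up. The tally below is only arithmetic on those figures. The constant and function names are made up for this illustration, and the sum ignores any other instructions a real context switch would execute, such as the instructions that switch the stack.

```c
#include <assert.h>

/* Clock cycle counts for Intel 80486 as quoted in the text. */
enum {
    CYCLES_CALL_TSS_MIN = 170,
    CYCLES_CALL_TSS_MAX = 180,
    CYCLES_CALL_PLAIN   = 20,
    CYCLES_PUSHAD       = 11,
    CYCLES_PUSH_SEG     = 3,
    CYCLES_PUSHF        = 4,
    CYCLES_POPF         = 9,
    CYCLES_POP_SEG      = 3,
    CYCLES_POPAD        = 9,
    SEGMENT_REGISTERS   = 6
};

/* PUSHAD, six segment register pushes, PUSHF. */
int software_save_cycles(void)
{
    return CYCLES_PUSHAD + SEGMENT_REGISTERS * CYCLES_PUSH_SEG + CYCLES_PUSHF;
}

/* POPF, six segment register pops, POPAD. */
int software_restore_cycles(void)
{
    return CYCLES_POPF + SEGMENT_REGISTERS * CYCLES_POP_SEG + CYCLES_POPAD;
}

/* Plain CALL followed by a save and a restore using common instructions. */
int software_switch_cycles(void)
{
    return CYCLES_CALL_PLAIN + software_save_cycles() + software_restore_cycles();
}
```

With these figures, saving takes 33 cycles and restoring takes 36 cycles, so the plain control transfer with a software save and restore adds up to 89 cycles, roughly half of the 170 to 180 cycles of the transfer through TSS.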
An additional context switching support mechanism takes care of saving and restoring the state of processor extensions such as FPU (Floating Point Unit), MMX (Multimedia Extensions), and SIMD (Single Instruction Multiple Data). These extensions denote specialized parts of the processor that are only present in some processor models and only used by some executing processes, thus requiring special handling:
The processor supports the FXSAVE and FXRSTOR instructions, which save and restore the state of all the extensions to and from memory. This support makes it possible to use the same context switch code regardless of which extensions are present.
The processor keeps track of whether the extensions context has been switched after the processor context. If not, an exception is raised whenever an attempt to use the extensions is made, making it possible to only switch the extensions context when it is actually necessary.
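The lazy switching of the extensions context described above can be sketched as a simulation. The sketch assumes a simplified model: the `task_switched` field stands in for the processor flag that records a context switch (on the 80x86 this is the TS flag in the CR0 register), the `fpu_trap` function stands in for the exception handler, and plain structure copies stand in for FXSAVE and FXRSTOR. All structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FPU_REGS 8

struct fpu_state { double regs[FPU_REGS]; };

/* Each task keeps a memory copy of its extensions context. */
struct task { struct fpu_state fpu; };

struct cpu {
    struct fpu_state fpu;    /* live extension registers */
    int task_switched;       /* set on every context switch */
    struct task *current;    /* task that is executing now */
    struct task *fpu_owner;  /* task whose state is in the live registers */
    int fpu_switches;        /* how often the extensions context was switched */
};

/* The ordinary context switch only raises the flag. */
void switch_task(struct cpu *cpu, struct task *next)
{
    cpu->current = next;
    cpu->task_switched = 1;
}

/* Exception handler: switch the extensions context only if needed. */
static void fpu_trap(struct cpu *cpu)
{
    if (cpu->fpu_owner != cpu->current) {
        if (cpu->fpu_owner != NULL)
            cpu->fpu_owner->fpu = cpu->fpu;  /* stands in for FXSAVE */
        cpu->fpu = cpu->current->fpu;        /* stands in for FXRSTOR */
        cpu->fpu_owner = cpu->current;
        cpu->fpu_switches++;
    }
    cpu->task_switched = 0;
}

/* Every use of the extensions checks the flag first, as the processor would. */
double *fpu_reg(struct cpu *cpu, int index)
{
    assert(index >= 0 && index < FPU_REGS);
    if (cpu->task_switched)
        fpu_trap(cpu);
    return &cpu->fpu.regs[index];
}
```

In this model, a task that never touches the extensions never causes their state to be saved or restored, no matter how often it is scheduled.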
For examples of real processor context switching code for many different processor architectures, check out the sources of Linux. Each supported architecture has an extra subdirectory in the arch directory, and an extra asm subdirectory in the include directory. The processor context switching code is usually stored in file arch/*/kernel/entry.S.
The following fragment contains the code for saving and restoring processor context on the Intel 80x86 line of processors from the Linux kernel, before the changes that merged the support for 32-bit and 64-bit processors and made the code more complicated.
The __SAVE_ALL and __RESTORE_ALL macros save and restore the processor registers to and from stack.
The fixup sections handle situations where segment registers contain invalid values that need to be zeroed out.
Kalisto processor context switching code is stored in the head.S file. The SAVE_REGISTERS and LOAD_REGISTERS macros are used to save and load processor registers to and from memory, typically stack. The switch_cpu_context function uses these two macros to implement the context switch.
In principle, the memory accessible to a process can be saved and restored to and from external storage, such as disk. For typical sizes of memory accessible to a process, however, saving and restoring would take a considerable amount of time, making it impossible to switch the context very often. This solution is therefore used only when context switching is rare, for example when triggered manually as in DOS or CTSS.
When frequent context switching is required, the memory accessible to a process is not saved and restored, but only made inaccessible to other processes. This requires the presence of memory protection and memory virtualisation mechanisms, such as paging. Switching of the memory context is then reduced to switching of the paging tables and flushing of the associated caches.
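A memory context switch of this kind can be illustrated with a toy model of paging. The sketch assumes a single-level page table and a tiny direct-mapped translation cache. The `mmu` structure, the sizes, and the function names are invented for this example and do not describe any particular processor.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   4096u
#define NUM_PAGES   16
#define TLB_ENTRIES 4
#define NOT_MAPPED  UINT32_MAX

/* One page table per process: virtual page number to physical frame number. */
struct page_table { uint32_t frame[NUM_PAGES]; };

struct tlb_entry { int valid; uint32_t page; uint32_t frame; };

struct mmu {
    const struct page_table *root;      /* paging table in use */
    struct tlb_entry tlb[TLB_ENTRIES];  /* cache of recent translations */
    unsigned tlb_misses;
};

void page_table_init(struct page_table *pt)
{
    for (int i = 0; i < NUM_PAGES; i++)
        pt->frame[i] = NOT_MAPPED;
}

/* The whole memory context switch: a new table and an empty cache. */
void switch_memory_context(struct mmu *mmu, const struct page_table *next)
{
    mmu->root = next;
    for (int i = 0; i < TLB_ENTRIES; i++)
        mmu->tlb[i].valid = 0;
}

/* Translate a virtual address, returning NOT_MAPPED for inaccessible pages. */
uint32_t translate(struct mmu *mmu, uint32_t vaddr)
{
    uint32_t page = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;
    assert(page < NUM_PAGES);

    struct tlb_entry *entry = &mmu->tlb[page % TLB_ENTRIES];
    if (!entry->valid || entry->page != page) {
        uint32_t frame = mmu->root->frame[page];
        if (frame == NOT_MAPPED)
            return NOT_MAPPED;
        mmu->tlb_misses++;
        entry->valid = 1;
        entry->page = page;
        entry->frame = frame;
    }
    return entry->frame * PAGE_SIZE + offset;
}
```

No memory content is copied during the switch. The same virtual address simply resolves to a different physical frame afterwards, and the flush guarantees that no stale translation of the previous process survives.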
When separate notions of processes and threads are considered, the memory state is typically associated with the process, rather than the thread. The need for separate stacks is covered by keeping the pointer to the top of the stack associated with the thread rather than the process, often as a part of the processor state rather than the memory state. Exceptions to this rule include thread local storage, whose content does not concern the execution of a process but rather the execution of a thread.
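The division of state between processes and threads can be sketched with a pair of structures. The layout below is an assumption made for illustration only: a single identifier stands in for the paging tables of a process, and the thread carries just a stack pointer and a base address of its thread local storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Memory state belongs to the process. */
struct process { uint32_t page_table_id; };

/* Stack pointer and thread local storage belong to the thread. */
struct thread {
    struct process *owner;
    uint32_t stack_pointer;
    uint32_t tls_base;
};

struct cpu {
    struct thread *current;
    uint32_t stack_pointer;
    uint32_t tls_base;
    uint32_t page_table_id;
    unsigned memory_switches;  /* counts memory context switches */
};

void switch_thread(struct cpu *cpu, struct thread *next)
{
    /* The thread part of the state is always switched. */
    if (cpu->current != NULL)
        cpu->current->stack_pointer = cpu->stack_pointer;
    cpu->stack_pointer = next->stack_pointer;
    cpu->tls_base = next->tls_base;

    /* The process part is switched only when the process changes. */
    if (cpu->page_table_id != next->owner->page_table_id) {
        cpu->page_table_id = next->owner->page_table_id;
        cpu->memory_switches++;
    }
    cpu->current = next;
}
```

Switching between two threads of the same process therefore leaves the memory context untouched, which is one reason why such a switch tends to be cheaper than a switch between processes.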
The process state can contain other parts besides the processor state and the memory state. Typically, these parts of the process state are associated with the devices that the process accesses, and the manner in which they are saved and restored depends on the manner in which the devices are accessed.
Most often, a process accesses a device through the operating system rather than directly. The operating system provides an abstract interface that simplifies the device state visible to the process, and keeps track of this abstract device state for each process. It is not necessary to save and restore the abstract device state, since the operating system decides which state to associate with which process.
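As an illustration, consider a device that is read sequentially, with the read position as the abstract device state. In the hypothetical sketch below, the `kernel` structure keeps one position per process, and the invented `sys_read` call picks the position that belongs to the current process. The device content and all names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROCESSES 4

/* Content of the simulated device. */
static const char device_data[] = "abcdefghij";

/* Abstract device state visible to one process: just a read position. */
struct open_file { size_t offset; };

struct kernel {
    struct open_file files[MAX_PROCESSES];  /* one abstract state per process */
    int current;                            /* index of the executing process */
};

/* The context switch does not touch the device state at all. */
void kernel_switch(struct kernel *k, int pid)
{
    assert(pid >= 0 && pid < MAX_PROCESSES);
    k->current = pid;
}

/* Read from the device at the position of the current process. */
size_t sys_read(struct kernel *k, char *buf, size_t count)
{
    struct open_file *file = &k->files[k->current];
    size_t available = sizeof device_data - 1 - file->offset;
    if (count > available)
        count = available;
    memcpy(buf, device_data + file->offset, count);
    file->offset += count;
    return count;
}
```

Each process continues reading where it left off, although nothing was saved or restored when the processes were switched. The operating system merely selects the right entry on every call.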
In some cases, a process might need to access a device directly. In such a situation, the operating system either has to save and restore the device state or has to guarantee exclusive access to the device.