Rust’s safety model is built for parallel programming. The language works great with operating system threads. But do you really want to explicitly maintain a set of threads?

There are a few reasons you might opt for concurrency:

  • You want to speed up your computations by running code on multiple CPU cores.

  • Your application needs to handle I/O events from multiple different sources.

  • More subtle performance considerations that we’re not going to discuss.

Parallel computation

Parallel computation is best handled by a set of communicating pieces of code running in parallel. These are the most common approaches:

  • Multiple running programs talking to each other via inter-process communication mechanisms provided by the operating system.

  • A multiprocessing scenario where a single program spawns a few copies of itself that then use inter-process communication.

  • A multithreading scenario where lightweight threads are created within a single program, sharing the same memory space and other resources.

Multiprocessing is popular in languages like C (or maybe even C++) where keeping multithreaded code correct may prove difficult. Some people also use it to work around the global interpreter lock, which prevents thread-based parallelism in the CPython interpreter.

Rust provides thread safety in the form of core language features. In all code that is not marked unsafe, correct access to program data is required and checked by the compiler. More fine-grained synchronization tools are provided by the standard library and third-party crates. Multithreading is therefore a natural choice when working with Rust.
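
As a minimal sketch of what the compiler enforces (the counter example is invented for illustration), consider sharing a mutable counter between standard library threads. Remove either the Arc or the Mutex and the program no longer compiles:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared state must be wrapped before the compiler accepts cross-thread access.
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The Mutex guarantees exclusive access to the counter.
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    println!("{}", *counter.lock().unwrap());
}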

A naive implementation starts new threads or processes whenever they are needed. High-performance applications tend to avoid operating system overhead by creating a fixed number of threads or processes in advance.

Rust supports coroutines (or asynchronous functions) that can be safely executed in different threads. The standard library doesn’t provide a coroutine scheduler, but the popular Tokio library does exactly that.

Futures and coroutines

Writing all parallel code as coroutines and letting the Tokio scheduler do the planning is by far the easiest way to perform parallel computation. But that’s not the whole story.

Applications usually spend much more time waiting for I/O than performing heavy computations. In general, those two cases can be split and handled separately, but asynchronous functions and the Tokio library can be used to solve both. We will focus on the I/O case.

For comparison, an application that uses blocking calls would ask the operating system for new data and sleep until the data is available. A non-blocking application would usually wait for data from multiple sources and only sleep when there’s nothing to do.

Futures are essentially results that may not be available yet. An example of a future is the contents of a website we haven’t downloaded yet. An asynchronous function doesn’t run its code when called. Instead it returns a future that will run the code on demand.
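
A minimal sketch of that laziness (greet() is invented for illustration and uses the Tokio runtime we set up below):

async fn greet() -> String {
    String::from("hello")
}

#[tokio::main]
async fn main() {
    // Calling greet() runs none of its code; it merely constructs a future.
    let pending = greet();
    // The function body only executes here, when the future is awaited.
    println!("{}", pending.await);
}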

Simple I/O example

First, let’s set up Cargo.toml. The full feature set gives you all the I/O tools as well as the macros. If you omit it, #[tokio::main] won’t work.

[package]
name = "example"
version = "0.1.0"
edition = "2021"

[dependencies]
tokio = { version = "1", features = ["full"] }

Then let’s put some example code into src/main.rs to simulate a simplified HTTP client communication. We will explicitly use a single-threaded Tokio runtime.

use tokio::net::TcpStream;
use tokio::io::{AsyncReadExt, AsyncWriteExt};

#[tokio::main(flavor = "current_thread")]
async fn main() {
    let target = ("example.net", 80);
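    // First await point: the task is suspended until the connection is established.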
    let mut stream = TcpStream::connect(target)
        .await
        .expect("Connection failed.");

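    // Second await point: suspended until the whole request has been written.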
    stream.write_all(b"GET / HTTP/1.0\r\n\r\n")
        .await
        .expect("Write failed.");

    let mut content = Vec::new();
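    // Third await point: an HTTP/1.0 server closes the connection, so read until EOF.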
    stream.read_to_end(&mut content)
        .await
        .expect("Read failed.");

    let text = String::from_utf8(content).expect("UTF-8 conversion failed.");
    println!("{:?}", text);
}

This isn’t all that different from code using the connect(), write() and read() system calls, is it? You can see three await points that mark where the code might wait for events.

However, the fact that you can see where the waiting points are in the function is already a huge difference. Another difference is that each of the await points waits for a future or coroutine provided by Tokio. Please note that the I/O layer and the scheduler depend on each other: you can only use I/O tools compatible with Tokio in a Tokio-based application.

Just like the scheduler is hidden from your eyes, so is the event waiting mechanism. Whenever you wait for an I/O future, an event source is registered with Tokio that will later deliver an event and resume execution of the respective coroutine code. This provides the necessary blocking and resuming framework for your coroutines.

Concurrent and parallel execution

Tokio strictly distinguishes between concurrency and parallelism. You can run multiple concurrent functions in a single thread using the tokio::join! macro, but only the tokio::spawn() function provides parallel execution.
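
For illustration, here is a minimal sketch of concurrency without parallelism (the two task functions are invented for this example). Both futures make progress interleaved on a single thread, so the program sleeps for roughly 100 ms in total rather than 200 ms:

use std::time::Duration;
use tokio::time::sleep;

// Invented example tasks: each one just sleeps and returns a number.
async fn task_a() -> u32 {
    sleep(Duration::from_millis(100)).await;
    1
}

async fn task_b() -> u32 {
    sleep(Duration::from_millis(100)).await;
    2
}

#[tokio::main(flavor = "current_thread")]
async fn main() {
    // join! polls both futures concurrently on the current thread.
    let (a, b) = tokio::join!(task_a(), task_b());
    println!("{} {}", a, b);
}

The full example below uses tokio::spawn() instead, so the two downloads can run on different operating system threads in parallel.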

use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn download(host: &str) -> Result<String, std::io::Error> {
    let target = (host, 80);
    let mut stream = TcpStream::connect(target).await?;

    stream.write_all(b"GET / HTTP/1.0\r\n\r\n").await?;

    let mut content = Vec::new();
    stream.read_to_end(&mut content).await?;

    Ok(String::from_utf8(content).expect("UTF-8 conversion failed."))
}

#[tokio::main]
async fn main() {
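    // tokio::spawn() starts each download running in the background right away.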
    let download1 = tokio::spawn(download("example.com"));
    let download2 = tokio::spawn(download("example.net"));

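    // Awaiting a JoinHandle yields a nested Result: the outer one reports
    // whether the task panicked, the inner one is download()'s own Result.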
    let result1 = download1
        .await
        .expect("First download crashed.")
        .expect("First download failed.");
    let result2 = download2
        .await
        .expect("Second download crashed.")
        .expect("Second download failed.");

    println!("{:?}, {:?}", result1, result2);
}

As you can see, using asynchronous functions as Tokio tasks closely resembles how threads are used in general. Tasks are created and joined just like threads, they communicate just like threads, and they are distributed by Tokio over actual operating system threads. In the general case, working with Tokio tasks in I/O applications is easier and more convenient.

The borrow checker

Your best friend and worst enemy is the borrow checker. Whether you’re writing client or server code, you often need to communicate in both directions simultaneously. You might want to enclose the TcpStream in a BufReader for reading but still keep it around for writing. That is not possible with a single stream object.

Tokio, just like the standard library, provides the option to .split() the TcpStream, or to convert it with .into_split(), and get separate reader and writer objects. The reader can then be buffered and read line by line using AsyncBufReadExt while you can .write_all() your data to the writer. Using .into_split() rather than .split() creates two linked owned objects that are completely independent as far as the borrow checker is concerned.
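
Because .into_split() yields owned halves, each half can even move into its own task. A minimal sketch (the function and its payload are invented for illustration):

use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpStream;

async fn split_into_tasks(stream: TcpStream) {
    let (mut rx, mut tx) = stream.into_split();

    // The owned read half moves into one task...
    let reader = tokio::spawn(async move {
        let mut buf = [0u8; 1024];
        while let Ok(n) = rx.read(&mut buf).await {
            if n == 0 {
                break;
            }
            // ...process buf[..n] here...
        }
    });

    // ...and the owned write half into another.
    let writer = tokio::spawn(async move {
        tx.write_all(b"hello\r\n").await.ok();
    });

    reader.await.ok();
    writer.await.ok();
}

The borrowed .split() variant is enough when both halves stay in a single function, as in the following example: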

use tokio::io::{BufReader, AsyncBufReadExt, AsyncWriteExt};
use tokio::net::{TcpListener, TcpStream};

async fn communicate(mut stream: TcpStream) -> Result<(), std::io::Error> {
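    // Split the stream into read and write halves that can be borrowed independently.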
    let (rx, mut tx) = stream.split();
    let mut lines = BufReader::new(rx).lines();
    tx.write_all(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n").await?;
    while let Some(line) = lines.next_line().await? {
        if line.is_empty() {
            break;
        }
        tx.write_all(format!("{}\n", line).as_bytes()).await?;
    }
    Ok(())
}

async fn serve_forever() -> Result<(), std::io::Error> {
    let listener = TcpListener::bind(("localhost", 8080)).await?;
    loop {
        let (stream, _addr) = listener.accept().await?;
        communicate(stream).await.ok();
    }
}

#[tokio::main]
async fn main() {
    serve_forever().await.unwrap();
}

Note that the above code is single-threaded in Tokio even though it runs on the multi-threaded Tokio runtime. That is usually a mistake. What you need is to run communicate() as a separate task, just like download() in the previous example. Then each client gets its own task (effectively an application-level thread) that can be scheduled onto the pool of operating system threads managed by Tokio.
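
One possible fix, reusing communicate() from above, is to spawn each connection as its own task so the accept loop stays free for other clients:

async fn serve_forever() -> Result<(), std::io::Error> {
    let listener = TcpListener::bind(("localhost", 8080)).await?;
    loop {
        let (stream, _addr) = listener.accept().await?;
        // Each client gets its own task; a slow client no longer blocks
        // the accept loop or the other connections.
        tokio::spawn(async move {
            communicate(stream).await.ok();
        });
    }
}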

This is how you create a threaded server with a fixed-size thread pool out of just a bunch of asynchronous functions or methods. You can pass any data, owned or borrowed, into your asynchronous tasks as long as you understand that the data is held by a future object that exists from the moment it is created by calling the coroutine function until it is .await-ed. Tasks are just packaged, separately scheduled futures.

You shouldn’t create structures so complex that you cannot make them work with the borrow checker. If the object interdependence and structure become too complex, you can always split the work into multiple tasks that communicate via tokio::sync::mpsc. That is very often a better choice than holding shared data in a common tokio::sync::Mutex or std::sync::Mutex.
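
A minimal sketch of that pattern (the message type and channel capacity are arbitrary): one task produces messages, another consumes them, and no shared Mutex is needed:

use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    // A bounded channel: send() waits whenever the buffer of 16 messages is full.
    let (tx, mut rx) = mpsc::channel::<String>(16);

    tokio::spawn(async move {
        for i in 0..3 {
            tx.send(format!("message {}", i)).await.ok();
        }
        // Dropping tx here closes the channel.
    });

    // recv() yields None once all senders are gone.
    while let Some(message) = rx.recv().await {
        println!("{}", message);
    }
}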

Notes

Bring your own questions to the next lessons as usual. Homework assignments will start appearing as soon as I get familiar with ReCodEx. We’re going to use it this semester.