This post will hopefully be shorter than the last two. I just wanted to dig into the fourth function color brought up in Which "Red" is your function?
I'm not going to get into the details of function coloring here. If you want to know more, you can read the linked article, and the articles it links to.
Note: if you want to talk about io_uring, IoRing, or IOCP, that's an entirely different discussion that plays into this problem but will not be covered here.
How do you do IO?
This is really the crux of the issue: how do you do IO? In the standard library, you can have a `TcpStream` and issue a `.read()` on it to get bytes. This works by passing the `TcpStream`'s file descriptor to the `read` syscall. Calling this function can have a couple of results:
- Bytes were read from the provided descriptor and placed into the provided buffer. The function's return value indicates how many bytes were read.
- No bytes were read because there are no more bytes to read. Nothing is placed into the buffer and the return value is 0, which indicates that the `read` call should not be retried.
- No bytes were read because some sort of error occurred. Nothing is placed into the buffer and the return value is less than 0; each possible sub-zero value corresponds to a specific error.
This is all well and good, but it doesn't work well in contexts where we aren't supposed to block.
Consider the case where we're talking with a very slow computer: it is sending us bytes, but we don't receive them in a reasonable timeframe. Using the `read` syscall as-is doesn't allow our program to do anything else while we're waiting on the slow computer.
Luckily there's a very common workaround for this specific issue. We can put our `TcpStream` into non-blocking mode, which slightly alters the behavior of the `read` syscall. Now, instead of waiting for bytes to put into the provided buffer, if there are no bytes available the `read` call returns immediately with a new error value (EWOULDBLOCK). Our program can notice this new error value and realize that although we didn't read anything this time, in the future we will be able to read something.
But how will we know when to try reading again? That is the core problem that creates our Crimson
functions.
How will they know? (They're gonna know)
In Rust there are a number of async runtimes you can choose from. tokio is by and large the one people will use or tell you to use, but there's also smol, async-std, embassy, glommio, monoio, actix-rt, jive (I wrote this one), and I'm sure others.
The primary motivating factor for having multiple runtimes like this is to play with different ideas
of IO. I said up front that I'm not going to talk about completion-based IO, embassy is another
special case, actix-rt is just tokio with extra steps, and smol and async-std are basically the same
runtime, so let's pretend for now that the only runtimes that exist are tokio and async-std.
The real distinction between tokio and async-std is how they attempt to figure out when to try reading more bytes. In both cases, after failing to read bytes from the `TcpStream`, the runtime will register the `TcpStream`'s file descriptor with an event mechanism (epoll on Linux, kqueue on macOS and the BSDs). This mechanism is backed by a blocking IO operation, just like a blocking `read`, but unlike a blocking `read`, the event mechanism does not itself read any bytes, and it is capable of waiting for available bytes on many file descriptors simultaneously. This means that if your program has 30 `TcpStream`s, you can block waiting for any one of them to become readable on a single thread. Since this operation does block, tokio and async-std need to strike a balance between waiting for events on these registered file descriptors and making progress on other asynchronous tasks.
And that's why my crimson functions explode?
Yeah, basically. tokio's `TcpStream` type knows how to register itself with tokio's event mechanism, and is designed to play nicely with tokio's scheduling. Underneath, the bytes are still read with the `read` syscall we talked about earlier, but all the surrounding machinery for deciding when to read makes it unique.
async-std's `TcpStream` type is the same story: it knows how to register itself with async-std's event mechanism, and plays as nicely as it can* with async-std's scheduling. The difference between these two runtimes really comes down to that registration step. It is feasible that the `Future` trait's `Context` argument could be extended with a method such as `.register(fd: &BorrowedFd<'_>)`. This could potentially allow any type with access to a file descriptor to register with any runtime's event mechanism. But it's not enough to handle every scenario.
*tokio has a cooperation mechanism that its `TcpStream` integrates with to reduce task starvation; async-std does not.
Why not?
Well, for one, not every runtime has an event mechanism like async-std and tokio do. I left out runtimes using completion-based IO because they work in an entirely different way. I left out embassy because it's designed for microcontrollers and doesn't have access to an operating system that could provide `read`. Building an abstraction around an event mechanism and file descriptors leaves out runtime implementations that don't include one or both of these concepts. These problems are why we don't have a unified API for dealing with runtimes beyond the `Future` trait.
Unification is hard. It might not be possible. But there's a bunch of smart folks thinking about it,
so maybe things will get better.