This post will hopefully be shorter than the last two. I just wanted to dig into the fourth function color brought up in Which "Red" is your function?
I'm not going to get into the details of function coloring here. If you want to know more, you can read the linked article, and the articles it links to.
Note: if you want to talk about io_uring, IoRing, or IOCP, that's an entirely different discussion that plays into this problem but will not be covered here.
How do you do IO?
This is really the crux of the issue: how do you do IO? In the standard library, you can have a `TcpStream` and issue a `.read()` on it to get bytes. This works by passing the `TcpStream`'s file descriptor to the `read` syscall. Calling this function can have a couple of results:
- Bytes were read from the provided descriptor and placed into the provided buffer. The function's return value indicates how many bytes were read.
- No bytes were read because there are no more bytes to read. Nothing is placed into the buffer and the return value is 0, which indicates that the `read` call should not be retried.
- No bytes were read because some sort of error occurred. Nothing is placed into the buffer and the return value is less than 0; each possible sub-zero value corresponds to a specific error.
This is all well and good, but it doesn't work well in contexts where we aren't supposed to block.
Consider the case where we're talking with a very slow computer: it is sending us bytes, but we don't receive them in a reasonable timeframe. Using the `read` syscall as-is doesn't allow our program to do anything else while we're waiting on the slow computer.
Luckily there's a very common workaround for this specific issue. We can put our `TcpStream` into non-blocking mode, which slightly alters the behavior of the `read` syscall. Now, instead of waiting for bytes to put into the provided buffer, if there are no bytes available the `read` call returns immediately with a new error value (EWOULDBLOCK). Our program can notice this new error value and realize that although we didn't read anything this time, in the future we will be able to read something.
But how will we know when to try reading again? That is the core problem that creates our Crimson
functions.
How will they know? (They're gonna know)
In Rust there are a number of async runtimes you can choose from. tokio is by and large the one people will use or tell you to use, but there's also smol, async-std, embassy, glommio, monoio, actix-rt, jive (I wrote this one), and I'm sure others.
The primary motivating factor for having multiple runtimes like this is to play with different ideas
of IO. I said up front that I'm not going to talk about completion-based IO, embassy is another
special case, actix-rt is just tokio with extra steps, and smol and async-std are basically the same
runtime, so let's pretend for now that the only runtimes that exist are tokio and async-std.
The real distinction between tokio and async-std is how they attempt to figure out when to try reading more bytes. In both cases, after failing to read bytes from the `TcpStream`, the runtime will register the `TcpStream`'s file descriptor with an event mechanism (epoll on Linux, kqueue on macOS and the BSDs). This mechanism is backed by a blocking IO operation, just like a blocking `read`, but unlike a blocking `read`, the event mechanism does not itself read any bytes, and it is capable of waiting for available bytes on many file descriptors simultaneously. This means that if your program has 30 `TcpStream`s, you can block waiting for any one of them to become readable on a single thread. Since this operation does block, tokio and async-std need to strike a balance between waiting for events on these registered file descriptors and making progress on other asynchronous tasks.
And that's why my crimson functions explode?
Yeah, basically. tokio's `TcpStream` type knows how to register itself with tokio's event mechanism, and is designed to play nicely with tokio's scheduling. Underneath, the bytes are still read with the `read` syscall we talked about earlier, but all the surrounding machinery for deciding when to read makes it unique.
async-std's `TcpStream` type is the same story: it knows how to register itself with async-std's event mechanism, and plays as nicely as it can* with async-std's scheduling. The difference between these two runtimes really comes down to that registration step. It is feasible that the `Future` trait's `Context` argument could be extended with a method such as `.register(fd: &BorrowedFd<'_>)`. This could potentially allow any type with access to a file descriptor to register with any runtime's event mechanism. But it's not enough to handle every scenario.
*tokio has a cooperation mechanism that its `TcpStream` integrates with to reduce task starvation; async-std does not.
Why not?
Well, for one, not every runtime has an event mechanism like async-std and tokio do. I left out runtimes using completion-based IO because they work in an entirely different way. I left out embassy because it's designed for microcontrollers and doesn't have access to an operating system that could provide `read`. Building an abstraction around an event mechanism and file descriptors leaves out runtime implementations that don't include one or both of these concepts. These problems are why we don't have a unified API for dealing with runtimes beyond the `Future` trait.
Unification is hard. It might not be possible. But there's a bunch of smart folks thinking about it,
so maybe things will get better.