One of the issues we're working on with async functions in traits for Rust is how to attach Send bounds to futures returned by async methods. Niko Matsakis has been writing on this subject recently, so if you haven't seen his posts, definitely check them out! His first post outlines the problem, while the second post introduces a possible solution: Return Type Notation (RTN). I'm going to write this post assuming you're familiar with those.

I'm mostly a fan of the proposed return type notation. It's a very powerful feature that gives a solution to the Send bound problem but is also generally useful in other cases. There are some significant shortcomings though, so I don't think it should be the only or even primary solution people reach for to solve the Send bound question.

In this post I'd like to explore some more implicit approaches that address the issue while using much lighter syntax.

Return Type Notation is not without shortcomings

As Niko points out in his conclusion, this notation is not without its drawbacks. In particular, it can easily become quite verbose. Consider a common trait like Read. It has twelve methods! Presumably once we have AsyncRead or similar, that trait will also be widely used and have a similar number of methods. As a user this isn't a problem, and as a trait implementer it's not bad because only one of the methods is required; the rest have default implementations. But consider if we wanted to write something like:

fn read_file_on_other_thread<R>(reader: R)
where
    R: Read,
{ ... }

To add all the Send bounds on an async version using RTN, we'd end up with something like this:

fn read_file_on_other_thread<R>(reader: R)
where
    R: AsyncRead,
    R::read(..): Send,
    R::read_vectored(..): Send,
    R::read_to_end(..): Send,
    R::read_to_string(..): Send,
    R::read_exact(..): Send,
    R::read_buf(..): Send,
    R::read_buf_exact(..): Send,
    R::by_ref(..): Send,
    R::bytes(..): Send,
    R::chain(..): Send,
    R::take(..): Send,
{ ... }

Now, I'm perhaps being a little unfair. You don't have to list all the methods, only the ones you use. I don't think I've ever used all twelve methods. Usually I just use read_to_string or maybe read_to_end. But these methods exist, and someone uses them, perhaps code in some high performance library that I've pulled in.

These bounds also end up being viral. I have to add the bounds needed by any function I call, or any function the callee might call, and so on. Trait aliases can help, but those aren't currently implemented and it'd be nice if we could solve the issue without relying on them.

What if we could infer the bounds?

While the precision that comes with that verbosity can be useful, I suspect the cases that need it will be somewhat niche and advanced. It would be nice to be able to do something lighter weight that works in the common cases, while still keeping the more precise options when needed.

Perhaps one approach that will be fruitful is to take inspiration from the way auto traits are inferred.

A small detour: auto traits

Auto traits are traits where Rust automatically generates an implementation for you in most cases. The Send trait is perhaps the most common example. While you can implement Send yourself, in most cases the compiler implements it for you if it can.

For structs, enums, and similar types the rules are pretty simple. They implement an auto trait if all their constituent fields also implement the trait. For closures, things are a little more subtle. Closures implement an auto trait if all of the things the closure captures also implement the auto trait.¹ Async blocks are trickier still. They work like closures in that you have to consider what they capture. However, because async blocks can suspend at an await point, we must also consider all of the things that are live across an await point.² We find these values in the generator interior analysis step in the compiler (because async functions and blocks desugar into generators).
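To make those three rules concrete, here is a minimal sketch on stable Rust. The names `require_send`, `Request`, `make_counter`, `make_future`, and the tiny `block_on` executor are all made up for illustration; the only point is which lines the compiler accepts.

```rust
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Compiles only when `T: Send`; the check happens at compile time.
fn require_send<T: Send>(value: T) -> T {
    value
}

/// A struct is `Send` when all of its fields are `Send`.
struct Request {
    id: usize,
    path: String,
}

/// A closure is `Send` when everything it captures is `Send`.
fn make_counter(items: Vec<u32>) -> impl Fn() -> usize {
    require_send(move || items.len())
}

/// An async block is `Send` when its captures, and everything live across
/// an await point, are `Send`.
fn make_future(request: Request) -> impl Future<Output = usize> {
    require_send(async move {
        let path_len = {
            // `Rc` is not `Send`, but it goes out of scope before the await,
            // so it never has to be stored inside the future.
            let shared = Rc::new(request.path.len());
            *shared
        };
        std::future::ready(()).await;
        // Holding `shared` until this point instead would make the
        // `require_send` call above fail to compile.
        path_len + request.id
    })
}

/// Waker that does nothing; enough for futures that never actually block.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Nothing in the signatures of `make_counter` or `make_future` mentions Send; the compiler works it out from the bodies, which is the behavior the rest of this post tries to build on.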

Anyway, the point of this little detour is that we already have an analysis pass in the compiler that could be helpful for inferring Send bounds.

Idea: Inferred Send Bounds on `async fn`

I was talking with Nick Cameron about this last week and we came up with an idea that I'd like to describe and flesh out here. The idea is that you would somehow annotate an async function to indicate you want to guarantee it returns a Send future and then the compiler infers whatever bounds are necessary to make this happen.

Let's look at an example inspired by Niko's post.

async fn do_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
{
    info!("doing health check");
    health_check.check(server).await
}

This example is admittedly kind of pointless because you could just call .check directly rather than calling do_health_check. I added the info! line to make this seem a little more plausible, because maybe you want to make sure you have uniform logging. Still, I admit it's contrived.

Anyway, suppose we wanted to ensure that, no matter what type you use for H, the future returned by do_health_check is Send. With this proposal, we'd write something like this:³

async<Send> fn do_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
{
    info!("doing health check");
    health_check.check(server).await
}

The compiler already has to infer whether the future returned by do_health_check is Send. Since Send is an auto trait, the compiler looks at what's live across an await point (in this case the future returned by check(..), since that's the thing we're awaiting) and decides whether do_health_check returns a Send future depending on whether check(..) does.
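You can watch that existing inference happen on stable Rust once H is a concrete type. The sketch below is only an approximation of the example: the `HealthCheck` signature, the `Server` fields, `PingCheck`, `checked_future`, and the small `block_on` helper are all my assumptions, and the info! line is left out to avoid pulling in a logging crate.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct Server {
    name: String,
}

// Hypothetical trait modeled on the example; the exact signature is assumed.
trait HealthCheck {
    async fn check(&self, server: Server) -> bool;
}

struct PingCheck;

impl HealthCheck for PingCheck {
    async fn check(&self, server: Server) -> bool {
        !server.name.is_empty()
    }
}

async fn do_health_check<H>(health_check: H, server: Server) -> bool
where
    H: HealthCheck + Send + 'static,
{
    health_check.check(server).await
}

/// Compiles because, with `H = PingCheck`, the compiler can look through to
/// the concrete future returned by `PingCheck::check` and see that it is `Send`.
fn checked_future() -> impl Future<Output = bool> + Send {
    do_health_check(PingCheck, Server { name: String::from("db-1") })
}

// The generic version is rejected today, because no where clause says that
// the future returned by `H::check` is `Send`:
//
// fn checked_generic<H>(h: H, s: Server) -> impl Future<Output = bool> + Send
// where
//     H: HealthCheck + Send + 'static,
// {
//     do_health_check(h, s)
// }

/// Waker that does nothing; enough for futures that never actually block.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

The commented-out generic function is the case the rest of this section is about: the information is there for any particular H, but there is no way to state it for all H without extra bounds.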

For this proposal, instead of just checking whether the resulting future is Send, the compiler would add any bounds necessary to make it so that do_health_check always returns a Send future.

In our example, it would be as if you had written:

async fn do_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
    <H as HealthCheck>::check(..): Send,
{
    info!("doing health check");
    health_check.check(server).await
}

This is exactly what we wanted!

Does it work?

Partially. I was really excited by this idea at first, but in thinking it through to write this post I've realized that at best it solves a very tiny piece of the issue. The contrived example I used to introduce the feature makes this fairly apparent. But in looking at its shortcomings, maybe we can come up with something that would work better.

The biggest issue is that it only works for cases where you await something, but as we saw in Niko's post, we often care that a future is Send even if we don't await it.

Let's try and fill in a possible body for the example in Niko's post:

fn start_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
{
    let task = async move {
        health_check.check(server).await;
    };
    spawn(task);
}

Here spawn would have a signature like fn spawn(task: impl Future<Output = ()> + Send + 'static).

The problem is, start_health_check is where we need to add the bounds, but it is not async, so we can't just change it to async<Send>. We could try using our do_health_check function with the async<Send> annotation, and we'd end up with something like:

fn start_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
{
    let task = do_health_check(health_check, server);
    spawn(task);
}

We'd probably get a different error, but it's essentially the same. We still have no way to guarantee that any H will meet the requirements of this function.
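For contrast, here is roughly what you can do on stable Rust today, and it is a different technique from anything proposed in this post: bake the Send bound into the trait itself. Everything here is an assumption made for illustration, including the `SendHealthCheck` name, the `Server` field, the extra `results` channel, the `JoinHandle` return values (the spawn above returns nothing), and the thread-backed `spawn` and `block_on` stand-ins.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::mpsc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

struct Server {
    healthy: bool,
}

// Hypothetical variant of the trait with the `Send` bound written into the
// method signature, so every implementation must return a `Send` future.
trait SendHealthCheck {
    fn check(&self, server: Server) -> impl Future<Output = bool> + Send;
}

struct FlagCheck;

impl SendHealthCheck for FlagCheck {
    async fn check(&self, server: Server) -> bool {
        server.healthy
    }
}

/// Stand-in for an executor's `spawn`: runs the task to completion on a new
/// thread. Returning the `JoinHandle` is only so callers can wait for it.
fn spawn(task: impl Future<Output = ()> + Send + 'static) -> thread::JoinHandle<()> {
    thread::spawn(move || block_on(task))
}

fn start_health_check<H>(
    health_check: H,
    server: Server,
    results: mpsc::Sender<bool>,
) -> thread::JoinHandle<()>
where
    // `Sync` is requested conservatively, because the future returned by
    // `check` borrows `health_check`.
    H: SendHealthCheck + Send + Sync + 'static,
{
    let task = async move {
        let healthy = health_check.check(server).await;
        // Ignore the error if the receiver has already gone away.
        let _ = results.send(healthy);
    };
    spawn(task)
}

/// Waker that does nothing; enough for futures that never actually block.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

The cost is that the decision moves from the function that needs Send to the trait, so implementations that can't produce Send futures are locked out entirely. That's the trade-off that makes bounds at the use site attractive in the first place.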

But maybe this can still help. A lot of futures that need to be Send will be composed of lots of calls to other futures. These all will need to be Send as well. Adding async<Send> to those inner futures can still save a lot of boilerplate. We might still have to be more explicit at the boundary where a task is spawned, but we can avoid restating the bounds explicitly throughout the whole call tree. In the Read example, this can be significant.

Furthermore, looking at some internal async code we have at work, it looks like we already have code that would benefit from exactly this async<Send> inference proposal.

Another thing I like about it is that it allows us to partially avoid a semver hazard. Auto traits leak into public types, and because of the way inference works for async functions, a function could always return Send futures in one release and then, due to a subtle change in its body, stop doing so in a later release. The async<Send> syntax proposed here allows you to explicitly promise that a function will always return Send futures. Of course, that Sendness may still depend on which trait methods the function uses. For example, maybe I switch from calling Read::read_to_string to calling Read::read_buf. Still, this will catch some accidental semver breakages, and the ones that remain are more likely to be within the caller's power to fix.
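Here's a small sketch of that hazard on stable Rust, along with the closest thing to an explicit promise you can write today. The functions `load_count_v1`, `load_count_v2`, and `load_count_send` are hypothetical stand-ins for two releases of a library function, and `block_on` is a toy executor used only to run them.

```rust
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// "Release 1": nothing non-`Send` is live across the await, so the returned
/// future is `Send`, even though the signature does not say so.
async fn load_count_v1() -> usize {
    let counts = vec![1usize, 2, 3];
    std::future::ready(()).await;
    counts.len()
}

/// "Release 2": same signature, but an `Rc` is now held across the await,
/// so the returned future silently stops being `Send`.
async fn load_count_v2() -> usize {
    let counts = Rc::new(vec![1usize, 2, 3]);
    std::future::ready(()).await;
    counts.len()
}

/// A promise that can be written today: the `+ Send` in the return type
/// makes the compiler check the body on every build.
fn load_count_send() -> impl Future<Output = usize> + Send {
    // Swapping in `load_count_v2()` here is rejected by the compiler.
    load_count_v1()
}

/// Waker that does nothing; enough for futures that never actually block.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Both versions compile and behave the same when awaited locally. The difference only shows up for a downstream caller that needs Send, which is why this kind of breakage is so easy to ship by accident.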

What if we inferred even more?

So let's go back to the issue of start_health_check. What if we could carry this idea of lifting bounds out of the body even further? It might look something like this:

#[infer_bounds(Send)]
fn start_health_check<H>(health_check: H, server: Server)
where
    H: HealthCheck + Send + 'static,
{
    let task = do_health_check(health_check, server);
    spawn(task);
}

This is just strawman syntax. We may want to stick the annotation on H instead, or not use attributes and use real syntax instead. There's room to experiment with syntax, but the important thing is that somehow we opt in to a new inference behavior that I'll describe here.

For the sake of argument I'm going to assume do_health_check is still written as async<Send> as described above, but I think these two proposals can stand alone.

Without the #[infer_bounds(Send)] attribute (i.e. Rust's current behavior), we can't compile start_health_check because we get a trait error. The type checker / trait system treat type parameters parametrically, meaning we can only use facts about H that are stated in the where clauses. There are currently no facts that say the future returned by H::check is Send, which the compiler will inform us of in the error report. Using RTN, we can add any extra where clauses needed to make this compile. With #[infer_bounds], I'm proposing to let the compiler add these clauses for us.

We'll probably need to define a clear set of rules, but the gist of it is that if any missing bound would give us a type error, we instead add an implicit bound to the function. In this case, because do_health_check requires H::check(..): Send, we'd add this bound to start_health_check as well.

Does this idea work?

I'm not sure. One issue is that I think this could become a global inference problem, which for the most part Rust avoids doing. For example, consider the following:

#[infer_bounds(Send)]
fn foo(x: impl MyAsyncFuture) {
    spawn(x.some_async_fn());
}

#[infer_bounds(Send)]
fn bar(x: impl MyAsyncFuture) {
    foo(x);
}

In this short program, we need to figure out foo's inferred where clauses before we can figure out bar's. This means we can't check each function independently. It's even tougher if we end up with a recursive cycle.

This might not be a problem though. Auto traits are coinductive, which seems like it exists to solve exactly these kinds of problems. I'll need to learn more about the trait solver to know whether this is a problem and whether it's solvable.

Of course, another downside is that this introduces additional semver hazards. On the bright side, they are opt in, and arguably not much worse than the existing auto trait leakage hazards. Still, rather than creating new semver hazards using existing ones as precedent, it'd be better to go the other direction and remove old hazards across an edition because we made being explicit about the requirements ergonomic enough that we don't need auto traits on async functions. I don't think this proposal does this, but I'm hopeful that maybe this will spark some ideas that will lead to something much better.

Conclusion

This post has been a bit of a roller coaster to write. I started out thinking we had come up with a really tidy solution to most of our problems. Then there was a moment of despair where it seemed like maybe the idea wasn't workable or didn't solve a useful part of the problem. Now that I'm at the conclusion, I'm cautiously optimistic. I think between async<Send> and #[infer_bounds(Send)] we might have something ergonomic that covers the most common cases.

Let's sum up by looking at some pros and cons of the two proposals together.

Pros:

Doesn't require listing each method in type signatures
Generalizes to other auto traits like Sync
Reduces semver risks by making Sendness and other auto traits explicit in the type signature

Cons:

Introduces more inference based on function bodies, which means more semver hazards
Might require global inference
Rust generally prefers to be explicit over implicit, so inferring new implicit bounds may obscure key details and make it harder to tell what's going on in the program

My main conclusion is that we should be thinking more about approaches based on implicit or inferred bounds. I think between RTN and implicit bounds there's a nice symmetry with functions that return -> impl Future and async functions. In the cases where one wants more control, we provide a more verbose, explicit way of doing it. On the other hand, if that's not needed, there's a simpler, more concise notation that lets the compiler handle a lot of details for you.
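The existing half of that symmetry looks like this on stable Rust. The function names and the toy `block_on` executor are made up for the example; the two functions do the same work and differ only in how much the signature says.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Concise form: whether the returned future is `Send` is inferred from the
/// body and is not visible in the signature.
async fn total_len_concise(parts: Vec<String>) -> usize {
    std::future::ready(()).await;
    parts.iter().map(String::len).sum::<usize>()
}

/// Explicit form: the future type and its `Send` bound are spelled out, and
/// the compiler checks the body against them.
fn total_len_explicit(parts: Vec<String>) -> impl Future<Output = usize> + Send {
    async move {
        std::future::ready(()).await;
        parts.iter().map(String::len).sum::<usize>()
    }
}

/// Waker that does nothing; enough for futures that never actually block.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical minimal executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
    }
}
```

Most code reaches for the concise form and only drops down to the explicit one when it needs the extra control, which is the relationship I'd like inferred bounds and RTN to have.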

I think this is a promising direction, and I look forward to hearing how others can improve on the idea!

¹ You can think of a closure as a struct that holds all of the captured fields along with an impl of one of the function traits like FnOnce and FnMut, so in that sense the rules for structs and closures are the same.

² While not completely accurate, I think of async blocks and generators as being like an enum where there is a variant for the beginning of the function, each await point, and the end. The fields of each variant then store the captures and anything live across the corresponding await point. So in this sense, the rules for auto traits on async blocks are similar to those for enums.

³ There are lots of other syntaxes we could use, like async(Send) or an attribute like #[require(Send)]. The point is to illustrate the idea and we can bikeshed on syntax if we decide this is worth pursuing.