One of the major goals for the Rust Async Working Group is to allow async fn everywhere fn is allowed, especially in traits. In this post, I’d like to distill some of the proposed designs and show how async functions in traits could be implemented. We’ll look at one possible way this could work, although I’d like to emphasize that this is not the only way, and many details of the design that we’ll ultimately adopt are still being worked out.

Impact

We want the following program to work.

use std::sync::Arc;

trait AsyncTrait {
    async fn get_number(&self) -> i32;
}

impl AsyncTrait for i32 {
    async fn get_number(&self) -> i32 {
        *self
    }
}

async fn print_the_number(from: Arc<dyn AsyncTrait>) {
    let number = from.get_number().await;
    println!("The number is {number}");
}

#[tokio::main]
async fn main() {
    let number_getter = Arc::new(42);
    print_the_number(number_getter).await;
}

This short program demonstrates nearly all the features for async functions in traits. In particular, we start with AsyncTrait, which is a trait with a single async method in it. This trait could be used in a static context, such as in:

async fn print_the_number_static(from: impl AsyncTrait) {
    let number = from.get_number().await;
    println!("The number is {number}");
}

We’ll see how to do this in the section on Static Traits.

In this example, we instead call get_number dynamically in the print_the_number function. We’ll explore this in the section where we talk about dynamic traits.

Current Status

The above example works in synchronous Rust today (i.e. if we erase all the async and .await in the program):

use std::sync::Arc;

trait Trait {
    fn get_number(&self) -> i32;
}

impl Trait for i32 {
    fn get_number(&self) -> i32 {
        *self
    }
}

fn print_the_number(from: Arc<dyn Trait>) {
    let number = from.get_number();
    println!("The number is {number}");
}

fn main() {
    let number_getter = Arc::new(42);
    print_the_number(number_getter);
}

Playground

We can do this in async Rust as well, but we need to use the async_trait crate:

use std::sync::Arc;
use async_trait::async_trait;

#[async_trait]
trait AsyncTrait {
    async fn get_number(&self) -> i32;
}

#[async_trait]
impl AsyncTrait for i32 {
    async fn get_number(&self) -> i32 {
        *self
    }
}

async fn print_the_number(from: Arc<dyn AsyncTrait>) {
    let number = from.get_number().await;
    println!("The number is {number}");
}

#[tokio::main]
async fn main() {
    let number_getter = Arc::new(42);
    print_the_number(number_getter).await;
}

Playground

We’d like to be able to do this without needing any extra crates. Also, async_trait works by boxing the resulting future from any async methods, which means we incur an extra allocation. It’d be nice to be able to avoid that as well.
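To make the extra allocation concrete, here is a hand-written approximation of the boxed shape. It is a sketch, not the macro's exact expansion (which differs in details such as lifetime names and Send bounds). The names BoxedAsyncTrait, BoxFuture, get_the_number, NoopWake, and block_on are introduced here for illustration only, and block_on is a toy executor so the example needs no extra crates.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Illustrative alias: a heap-allocated, type-erased future that may borrow for 'a.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

trait BoxedAsyncTrait {
    fn get_number(&self) -> BoxFuture<'_, i32>;
}

impl BoxedAsyncTrait for i32 {
    fn get_number(&self) -> BoxFuture<'_, i32> {
        // Box::pin allocates on every call; this is the cost we would like to avoid.
        Box::pin(async move { *self })
    }
}

// Dynamic dispatch works because the boxed return type has a known size.
async fn get_the_number(from: Arc<dyn BoxedAsyncTrait>) -> i32 {
    from.get_number().await
}

// Toy executor: polls in a loop with a waker that does nothing. This is only
// appropriate for futures that complete without waiting on real I/O.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Calling block_on(get_the_number(Arc::new(42_i32))) drives the boxed future to completion and yields the stored number.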

For the rest of this post, we’ll look at how this might happen, starting with async functions in static traits.

Async functions in static traits

Step 1: Desugaring with Type Alias Impl Trait (TAIT)

Currently, for a top-level async function, the compiler desugars the function as follows:

Before:

async fn foo() -> i32 {
    42
}

After:

fn foo() -> impl Future<Output = i32> {
    async {
        42
    }
}

Basically, async functions are functions that return a future. That future has a concrete type, but it is not nameable because it is hidden behind the impl Future<Output = i32>. All we know is that it is something that implements the Future trait.
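As a quick sanity check of this equivalence, the sketch below writes both forms by hand and drives each to completion. The names foo_sugared, foo_desugared, NoopWake, and block_on are illustrative; block_on is a toy executor written only so the example does not depend on an async runtime.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The form we normally write.
async fn foo_sugared() -> i32 {
    42
}

// Roughly what it desugars to: an ordinary function returning an opaque future.
fn foo_desugared() -> impl Future<Output = i32> {
    async { 42 }
}

// Toy executor: polls in a loop with a waker that does nothing.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Both functions can be passed to the same generic block_on, because all it needs to know is that the result implements Future.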

Now let’s try to do the same thing for our trait.

trait AsyncTrait {
    async fn get_number(&self) -> i32;
}

impl AsyncTrait for i32 {
    async fn get_number(&self) -> i32 {
        *self
    }
}

We can try to do the same conversion we did before:

trait AsyncTrait {
    fn get_number(&self) -> impl Future<Output = i32>;
}

impl AsyncTrait for i32 {
    fn get_number(&self) -> impl Future<Output = i32> {
        async {
            *self
        }
    }
}

There are a couple of problems though.

First, what type should we use for the return type of get_number? The compiler needs to find some concrete type for it, even if we can’t see the type. We could imagine that there is one concrete return type for AsyncTrait::get_number. If we did that, the desugaring would look something like this:

type SecretGetNumberReturn = impl Future<Output = i32>;

trait AsyncTrait {
    fn get_number(&self) -> SecretGetNumberReturn;
}

impl AsyncTrait for i32 {
    fn get_number(&self) -> SecretGetNumberReturn {
        async {
            *self
        }
    }
}

Here we’ve prefixed the type name with Secret to show that this isn’t a type that’s nameable; it would be generated and internal to the compiler.

One obvious problem is that type SecretGetNumberReturn = impl Future<Output = i32> is not allowed yet. The in-development feature that makes this possible is called type_alias_impl_trait, or TAIT. Although it is not yet stable, it seems to work reasonably well at this point, and we could use it internally in the compiler to desugar async functions.

The more subtle problem is that each async { ... } block has a different, unnameable type. So to pick one concrete impl Future type for the whole program, we would only be able to implement AsyncTrait once. This kind of defeats the purpose of traits!
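We can observe this directly by comparing type IDs. In the sketch below, first and second contain textually identical async blocks, yet the futures they return have different types. The helper type_id_of and the other function names are made up for this illustration.

```rust
use std::any::TypeId;
use std::future::Future;

// Returns the TypeId of the concrete type hiding behind a value.
fn type_id_of<T: 'static>(_value: &T) -> TypeId {
    TypeId::of::<T>()
}

fn first() -> impl Future<Output = i32> + 'static {
    async { 42 }
}

fn second() -> impl Future<Output = i32> + 'static {
    async { 42 }
}

// Two different async blocks never share a type, even with identical bodies.
fn different_blocks_share_a_type() -> bool {
    type_id_of(&first()) == type_id_of(&second())
}

// The same block always produces the same type.
fn same_block_shares_a_type() -> bool {
    type_id_of(&first()) == type_id_of(&first())
}
```

A single program-wide SecretGetNumberReturn could therefore name the future of at most one impl.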

So instead we need to make SecretGetNumberReturn an associated type for the trait, like this:[1]

trait AsyncTrait {
    type SecretGetNumberReturn: Future<Output = i32>;

    fn get_number(&self) -> Self::SecretGetNumberReturn;
}

impl AsyncTrait for i32 {
    type SecretGetNumberReturn = impl Future<Output = i32>;

    fn get_number(&self) -> Self::SecretGetNumberReturn {
        async {
            *self
        }
    }
}

So we’re most of the way there, but there’s still a problem. At the moment, we can return a specialized future for each impl of a trait for a given type. The problem is that the returned future will usually need to borrow from self (we’ll see why in the next section), meaning the returned future’s type will need a lifetime bound. For this to work, we need Generic Associated Types (GATs).
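One way to see where the plain associated type is enough: if the future copies what it needs out of self instead of borrowing from it, this shape already compiles on stable Rust, as long as we pick a nameable future type in place of the compiler's hidden one. The sketch below uses std::future::Ready as that stand-in. The names GetNumberFuture, get_the_number_static, NoopWake, and block_on are illustrative.

```rust
use std::future::{ready, Future, Ready};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncTrait {
    // No lifetime parameter: the future must not borrow from `self`.
    type GetNumberFuture: Future<Output = i32>;

    fn get_number(&self) -> Self::GetNumberFuture;
}

impl AsyncTrait for i32 {
    type GetNumberFuture = Ready<i32>;

    fn get_number(&self) -> Self::GetNumberFuture {
        // Copy the integer out of `self`, so nothing is borrowed.
        ready(*self)
    }
}

// Static dispatch: the concrete future type is known at compile time.
async fn get_the_number_static(from: impl AsyncTrait) -> i32 {
    from.get_number().await
}

// Toy executor: polls in a loop with a waker that does nothing.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

An async block that mentions self by reference cannot be expressed this way, which is the gap the lifetime parameter is meant to fill.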

Step 2: Desugaring with GATs

One detail we ignored in the last section was what the async { *self } block does in the body of get_number. Much like when compiling closures, the compiler converts that block into a struct and an impl of Future for that struct. It would look something like this:

trait AsyncTrait {
    type SecretGetNumberReturn: Future<Output = i32>;

    fn get_number(&self) -> Self::SecretGetNumberReturn;
}

struct SecretGetNumberForI32Future {
    this: &i32,
}

impl Future for SecretGetNumberForI32Future {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, _context: &mut Context<'_>) -> Poll<Self::Output> {
        Poll::Ready(*self.as_ref().this)
    }
}

impl AsyncTrait for i32 {
    type SecretGetNumberReturn = SecretGetNumberForI32Future;

    fn get_number(&self) -> Self::SecretGetNumberReturn {
        SecretGetNumberForI32Future {
            this: self,
        }
    }
}

Basically, we’ve copied the self parameter into the SecretGetNumberForI32Future struct, but renamed to this so that it’s a legal field name. Then we generate a poll function that dereferences this and returns the resulting integer. There’s a big problem though, which is that to have a reference in a struct, the struct needs to have a lifetime.

So to do this correctly, we need to add a lifetime parameter to SecretGetNumberForI32Future and thread it to all the other places where it is used:

trait AsyncTrait {
    type SecretGetNumberReturn<'a>: Future<Output = i32> + 'a
    where
        Self: 'a;

    fn get_number(&self) -> Self::SecretGetNumberReturn<'_>;
}

struct SecretGetNumberForI32Future<'a> {
    this: &'a i32,
}

impl<'a> Future for SecretGetNumberForI32Future<'a> {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, _context: &mut Context<'_>) -> Poll<Self::Output> {
        Poll::Ready(*self.as_ref().this)
    }
}

impl AsyncTrait for i32 {
    type SecretGetNumberReturn<'a> = SecretGetNumberForI32Future<'a>;

    fn get_number(&self) -> Self::SecretGetNumberReturn<'_> {
        SecretGetNumberForI32Future {
            this: self,
        }
    }
}

The problem is, associated types are currently not allowed to have parameters. To do so would make this a generic associated type (GAT).

GATs are not yet stable, but they work reasonably well, which means Rust could use them in the desugaring.
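For readers who want to try this desugaring by hand, here is a self-contained sketch that assumes a compiler with generic associated types available. The Secret prefix is dropped because we are writing the types ourselves; GetNumberFuture, GetNumberForI32Future, get_the_number_static, NoopWake, and block_on are illustrative names, and block_on is a toy executor.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait AsyncTrait {
    // The compiler asks for `where Self: 'a` on a GAT used this way.
    type GetNumberFuture<'a>: Future<Output = i32> + 'a
    where
        Self: 'a;

    fn get_number(&self) -> Self::GetNumberFuture<'_>;
}

// Hand-written version of the future an `async { *self }` block would become.
struct GetNumberForI32Future<'a> {
    this: &'a i32,
}

impl<'a> Future for GetNumberForI32Future<'a> {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<i32> {
        // Nothing to wait for: read through the borrowed `self` and finish.
        Poll::Ready(*self.this)
    }
}

impl AsyncTrait for i32 {
    type GetNumberFuture<'a> = GetNumberForI32Future<'a> where Self: 'a;

    fn get_number(&self) -> Self::GetNumberFuture<'_> {
        GetNumberForI32Future { this: self }
    }
}

async fn get_the_number_static(from: impl AsyncTrait) -> i32 {
    from.get_number().await
}

// Toy executor: polls in a loop with a waker that does nothing.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The lifetime parameter on GetNumberFuture is what lets the returned future hold a reference to the i32 it was created from.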

Dynamic Traits with dyn*

Step 3: dyn*

What we have so far is enough to let us use async functions in traits in a static dispatch context. For example, we can write the following:

async fn print_the_number_static_dispatch(from: impl AsyncTrait) {
    let number = from.get_number().await;
    println!("The number is {number}");
}

However, this requires us to know at compile time the concrete type of from. In our starting example, we had from: Arc<dyn AsyncTrait>, which allows the concrete type behind the dyn AsyncTrait to change at runtime.

One of the main challenges is how to store the future returned from get_number(). In the static case, the compiler simply stores the future in print_the_number_static_dispatch’s stack frame. But this requires the compiler to know the size of the future, which cannot be known in the dyn AsyncTrait case.

This problem exists generally for dyn Trait objects. The way this is solved in the non-async case is by making it so that code normally cannot refer directly to a dyn Trait object. Instead, code interacts with dyn Traits through pointers such as &dyn Trait and Box<dyn Trait>. These pointers are fat pointers made up of a pair of a pointer to the object itself and a pointer to the object’s vtable for Trait. The key thing here is that using a pointer instead of the raw trait object gives the type a known size.
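Both halves of this can be checked with size_of. In the sketch below, two futures have very different sizes depending on what they capture, while pointers to trait objects are always two words. Exact future sizes are a compiler implementation detail, so the example only relies on coarse bounds; the function names are invented for this illustration.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

// A future that captures almost nothing.
fn small_future_size() -> usize {
    let future = async { 1_u8 };
    size_of_val(&future)
}

// A future that owns a 1 KiB buffer, so it must be at least that large.
fn large_future_size() -> usize {
    let buffer = [0_u8; 1024];
    let future = async move {
        std::future::ready(()).await;
        buffer.len()
    };
    size_of_val(&future)
}

// Size of `&dyn Debug` and `Box<dyn Debug>`, measured in machine words.
fn fat_pointer_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<&dyn Debug>() / word,
        size_of::<Box<dyn Debug>>() / word,
    )
}
```

The caller of a dyn method cannot know which of these future sizes it will get back, but it always knows the size of a fat pointer.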

So for async functions in traits, we need a way to coerce a returned future into a pointer. We could try returning an &dyn Future like this:

trait AsyncTrait {
    type SecretGetNumberReturn<'a>: Future<Output = i32> + 'a;

    // return value is a reference to a SecretGetNumberReturn
    fn get_number(&self) -> &dyn Future<Output = i32>;
}

When we go to implement this, we’d end up with something like this:

impl AsyncTrait for i32 {
    type SecretGetNumberReturn<'a> = SecretGetNumberForI32Future<'a>;

    fn get_number(&self) -> &dyn Future<Output = i32> {
        &SecretGetNumberForI32Future {
            this: self,
        }
    }
}

As you might expect, there are problems with this. Most importantly, it’s horribly unsound (although the compiler would give you an error if you tried to write this). The problem is that we are returning a reference to a temporary. The temporary will be dropped when get_number returns, which means the reference has outlived its referent.

The other problem is that the trait is now aware of what kind of pointer it’s returning. With existing dyn-safe traits, you write one impl of a trait and that impl works regardless of whether it is used as a &dyn Trait, Box<dyn Trait>, Arc<dyn Trait>, or some pointer type that has not been invented yet.

The plan is to solve this by introducing a new dyn*, which is a pointer to a dyn Trait that is agnostic to its underlying pointer type.[2] I’ll summarize the way dyn* works here, but for more information, see Niko’s post, “dyn*: can we make dyn sized?”. Whereas now we must be explicit about whether we have a &dyn Trait, a Box<dyn Trait>, or something else, a dyn* Trait would be able to refer to any of these.

A dyn* Trait would still be a pair of a pointer-sized thing and a pointer to a vtable. The pointer-sized thing will most often actually be a pointer to the trait object’s data, but in cases where the object itself is pointer-sized, we have the option of storing the object itself inline.

We can control how the dyn* Trait object is created through lang-item helper traits to facilitate coercing an object into a pointer and back. The trait will look something like this, which we will use as our example:

trait IntoDynPointer {
    type Raw: PointerSized;

    fn into_dyn(self) -> Self::Raw;
    fn from_dyn(raw: Self::Raw) -> Self;
    fn from_dyn_pin_mut(raw: Pin<Self::Raw>) -> Pin<&'static mut Self>;
}

unsafe trait PointerSized {}
unsafe impl PointerSized for usize {}
unsafe impl<T> PointerSized for *const T {}
unsafe impl<T> PointerSized for Box<T> {}
// etc.

When the compiler sees an x as dyn* Future<Output = ()> cast, it will be desugared to something like (mem::transmute::<_, usize>(x.into_dyn()), VTABLE_FOR_X_AS_FUTURE). This will be possible if x impls Future<Output = ()> and IntoDynPointer.

The compiler will also generate a vtable that looks something like this:

const VTABLE_FOR_SECRET_NUMBER_RETURN_AS_FUTURE: FutureVtable = FutureVtable {
    poll_fn: |this, cx| {
        // Recover the typed raw value from the type-erased data word.
        let this: <SecretNumberReturn as IntoDynPointer>::Raw =
            unsafe { mem::transmute(this) };
        let this = <SecretNumberReturn as IntoDynPointer>::from_dyn_pin_mut(this);
        this.poll(cx)
    },
    drop_fn: |this| {
        let this: <SecretNumberReturn as IntoDynPointer>::Raw =
            unsafe { mem::transmute(this) };
        // Rebuild the owned future; it is dropped at the end of this statement.
        SecretNumberReturn::from_dyn(this);
    }
};

The implementation of IntoDynPointer will determine whether the future is boxed or some other allocation strategy is used. Here is what the impl would look like with the default boxing strategy:

impl IntoDynPointer for SecretNumberReturn {
    type Raw = *const ();

    fn into_dyn(self) -> Self::Raw {
        Box::into_raw(Box::new(self)) as *const _
    }

    fn from_dyn(raw: Self::Raw) -> Self {
        unsafe { *Box::from_raw(raw as *mut _) }
    }

    fn from_dyn_pin_mut(raw: Pin<Self::Raw>) -> Pin<&'static mut Self> {
        unsafe { mem::transmute(raw) }
    }
}

There are a couple of things to note so far. First, we added a new unsafe trait PointerSized. Although it’s marked as unsafe, it’s not really unsafe. Since the compiler knows the actual types, it can check whether the type claimed to be pointer-sized is actually pointer-sized. (Pointer-sized can’t be inferred though because type layouts are computed too late in the compilation.) Second, we will also want versions of from_dyn like from_ref, from_ref_mut, from_pin, etc. to support the different self types that different methods in the trait might have. Finally, it is expected that the compiler will be able to generate an IntoDynPointer impl for compiler-generated futures (i.e. the results of async blocks and return types of async fns) such as the one shown above, although there will need to be an escape hatch for cases where boxing is not possible or desirable.
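To illustrate the two storage strategies on today's Rust, here is a simplified stand-in for IntoDynPointer. It is an assumption-laden sketch rather than the proposed lang item: Raw is fixed to *const (), from_dyn is marked unsafe, and the Large type and round_trip helper are invented for the example. A usize is stored inline in the data word, while a larger value is boxed.

```rust
/// Simplified stand-in for the `IntoDynPointer` idea, with `Raw` fixed to `*const ()`.
trait IntoDynPointer: Sized {
    fn into_dyn(self) -> *const ();

    /// # Safety
    /// `raw` must come from `into_dyn` on the same type and be consumed only once.
    unsafe fn from_dyn(raw: *const ()) -> Self;
}

// A pointer-sized value can live directly in the data word: no allocation.
impl IntoDynPointer for usize {
    fn into_dyn(self) -> *const () {
        self as *const ()
    }

    unsafe fn from_dyn(raw: *const ()) -> Self {
        raw as usize
    }
}

#[derive(Debug, PartialEq)]
struct Large([u64; 4]);

// A value larger than a pointer is boxed, and the box's address is stored.
impl IntoDynPointer for Large {
    fn into_dyn(self) -> *const () {
        Box::into_raw(Box::new(self)) as *const ()
    }

    unsafe fn from_dyn(raw: *const ()) -> Self {
        // SAFETY: per the trait contract, `raw` came from `Box::into_raw` above.
        unsafe { *Box::from_raw(raw as *mut Large) }
    }
}

fn round_trip<T: IntoDynPointer>(value: T) -> T {
    let raw = value.into_dyn();
    // SAFETY: `raw` was just produced by `into_dyn` for `T` and is used exactly once.
    unsafe { T::from_dyn(raw) }
}
```

The caller of round_trip cannot tell which strategy was used, which is the property that lets the impl choose the allocation strategy.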

Next, we need to create the dyn* object and generate an impl of Future for the dyn*.

The compiler will represent a dyn* Future as a pair of some data and a vtable pointer.[3] We’ll use the following struct to represent this:

struct DynStarFuture {
    data: *const (),
    vtable: &'static FutureVtable,
}

Now we need to generate an impl of Future for DynStarFuture so we have dyn* Future: Future:

impl Future for DynStarFuture {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        (self.vtable.poll_fn)(self.data, cx)
    }
}

We’ll also need to implement Drop:

impl Drop for DynStarFuture {
    fn drop(&mut self) {
        (self.vtable.drop_fn)(self.data)
    }
}
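The pieces above can be emulated in today's Rust with raw pointers and a hand-written vtable. The sketch below is only an approximation of what the compiler might generate: it always boxes, it requires the wrapped future to be 'static to keep the example short, and the helpers poll_boxed, drop_boxed, DynStarFuture::new, poll_erased_future, NoopWake, and block_on are names made up for this illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hand-written stand-in for the vtable the compiler would generate.
struct FutureVtable {
    poll_fn: unsafe fn(*mut (), &mut Context<'_>) -> Poll<i32>,
    drop_fn: unsafe fn(*mut ()),
}

unsafe fn poll_boxed<F: Future<Output = i32>>(data: *mut (), cx: &mut Context<'_>) -> Poll<i32> {
    // SAFETY: `data` came from `Box::into_raw` on a `Box<F>`, and the heap value is
    // never moved before `drop_boxed` runs, so pinning it in place is sound.
    let future = unsafe { Pin::new_unchecked(&mut *(data as *mut F)) };
    future.poll(cx)
}

unsafe fn drop_boxed<F>(data: *mut ()) {
    // SAFETY: rebuilds the original `Box<F>` exactly once, then drops it.
    drop(unsafe { Box::from_raw(data as *mut F) });
}

// The pair of a data word and a vtable pointer. This sketch always boxes.
struct DynStarFuture {
    data: *mut (),
    vtable: &'static FutureVtable,
}

impl DynStarFuture {
    fn new<F: Future<Output = i32> + 'static>(future: F) -> Self {
        DynStarFuture {
            data: Box::into_raw(Box::new(future)) as *mut (),
            // A struct literal of function pointers is promoted to a static.
            vtable: &FutureVtable {
                poll_fn: poll_boxed::<F>,
                drop_fn: drop_boxed::<F>,
            },
        }
    }
}

impl Future for DynStarFuture {
    type Output = i32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i32> {
        // SAFETY: `data` and `vtable` were created together in `new` for the same `F`.
        unsafe { (self.vtable.poll_fn)(self.data, cx) }
    }
}

impl Drop for DynStarFuture {
    fn drop(&mut self) {
        // SAFETY: `drop` runs at most once, so the box is freed exactly once.
        unsafe { (self.vtable.drop_fn)(self.data) }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

// Toy executor: polls in a loop with a waker that does nothing.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

fn poll_erased_future() -> i32 {
    block_on(DynStarFuture::new(async { 42 }))
}
```

Because DynStarFuture has a fixed size regardless of the future it wraps, it can be returned from a dyn-safe method.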

The last missing piece is to generate an adapter trait that is used for a dyn AsyncTrait. The adapter trait, which we will call DynAsyncTrait, is the same as AsyncTrait but with the impl Future types replaced with dyn* Future:[4]

trait DynAsyncTrait {
    fn get_number(&self) -> dyn* Future<Output = i32>;
}

We can write a blanket implementation for any T that impls AsyncTrait, which allows any implementation to be used in a dyn context:

impl<T: AsyncTrait> DynAsyncTrait for T {
    fn get_number(&self) -> dyn* Future<Output = i32> {
        let future = <Self as AsyncTrait>::get_number(self);
        DynStarFuture {
            data: unsafe { mem::transmute(future.into_dyn()) },
            vtable: &VTABLE_FOR_SECRET_NUMBER_RETURN_AS_FUTURE,
        }
    }
}

So now we are finally able to rewrite print_the_number to dynamically dispatch to an async function. By way of reminder, here is the original function:

async fn print_the_number(from: Arc<dyn AsyncTrait>) {
    let number = from.get_number().await;
    println!("The number is {number}");
}

The compiler would then rewrite this as something like:

async fn print_the_number(from: Arc<dyn DynAsyncTrait>) {
    let number_future = from.get_number();
    let number = number_future.await;
    println!("The number is {number}");
}

The only substantive change between these functions is that the type of from changed from Arc<dyn AsyncTrait> to Arc<dyn DynAsyncTrait>. The magic actually happens at the call site. Since DynAsyncTrait is now dyn-safe and we have a blanket impl of DynAsyncTrait for any type that impls AsyncTrait, the compiler can coerce the thing that impls AsyncTrait (i32 in our case) into a DynAsyncTrait just like any other dyn trait.
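The adapter-trait step can be tried out today if we substitute a boxed future for dyn* Future. The sketch below makes that substitution explicit through the DynFuture alias, so it always allocates, unlike the proposed dyn*. It also assumes a compiler with generic associated types; GetNumberFuture, get_the_number, NoopWake, and block_on are illustrative names.

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Stand-in for `dyn* Future<Output = i32>`. Unlike dyn*, this always boxes.
type DynFuture<'a> = Pin<Box<dyn Future<Output = i32> + 'a>>;

// The statically dispatched trait, desugared by hand.
trait AsyncTrait {
    type GetNumberFuture<'a>: Future<Output = i32> + 'a
    where
        Self: 'a;

    fn get_number(&self) -> Self::GetNumberFuture<'_>;
}

// The dyn-safe adapter: same method, but the future type is erased.
trait DynAsyncTrait {
    fn get_number(&self) -> DynFuture<'_>;
}

// Every AsyncTrait impl gets the adapter for free.
impl<T: AsyncTrait> DynAsyncTrait for T {
    fn get_number(&self) -> DynFuture<'_> {
        Box::pin(<T as AsyncTrait>::get_number(self))
    }
}

impl AsyncTrait for i32 {
    type GetNumberFuture<'a> = Ready<i32> where Self: 'a;

    fn get_number(&self) -> Self::GetNumberFuture<'_> {
        ready(*self)
    }
}

// Dynamic dispatch through the adapter trait.
async fn get_the_number(from: Arc<dyn DynAsyncTrait>) -> i32 {
    from.get_number().await
}

// Toy executor: polls in a loop with a waker that does nothing.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

Passing Arc::new(42_i32) to get_the_number coerces the i32 into an Arc<dyn DynAsyncTrait> through the blanket impl, mirroring the coercion described above.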

Conclusion

We’ve just walked through the rather lengthy set of transformations rustc needs to make to implement async functions in dyn trait objects. We started by showing how to support async functions in static dispatch traits using generic associated types. Next we showed how to make these traits dyn-safe using the dyn* concept.

If you’d like to see the whole transformed program in one place, check out this Rust Playground link.

Note that these ideas are still under development, and I may have represented something wrong in this post. My goal was to summarize the challenges all in one place and show how some of the proposed solutions meet these challenges. The later parts of the design around dyn* are still rather new and may undergo significant change, but we seem to have something fleshed out well enough that we can start experimenting with implementation.

Thanks to Nick Cameron for his feedback on an earlier draft of this post.

  1. I’ve prefixed type names with Secret to indicate that these would be unnameable types generated by the compiler. Essentially the compiler already does the same thing for the return type of any -> impl Trait function. 

  2. Another possible approach to solving this is using unsized Rvalues. Unsized Rvalues allow some values of unsized types to be stored on the stack using alloca. Unfortunately, supporting alloca within an async fn is non-trivial at best and impossible at worst.

  3. Does it make sense to store the vtable inline? It’d still be sized and we would avoid an indirection. The dyn* types for large traits would get large though, and since dyn* is passed by value this would lead to a lot of time spent copying memory around. 

  4. Interestingly, this isn’t actually a new trait; it’s equivalent to the trait AsyncTrait<SecretGetNumberReturn = dyn* Future<Output = i32>>.