IO Delegation for Arc
2020-03-03
When Stjepan and I were designing async-std, one of the core design principles was to make it resemble the standard library as closely as possible. This meant the same names, APIs, and traits as the standard library so that the same patterns of sync Rust could be used in async Rust as well.
One of those patterns is perhaps lesser known but integral to std’s
functioning: impl Read/Write for &Type. What this means is that if you have
a reference to an IO type, such as File or TcpStream, you’re still able
to call Read and Write methods thanks to some interior mutability tricks.
The implication of this is also that if you want to share a std::fs::File
between multiple threads you don’t need to use an expensive
Arc<Mutex<File>> because an Arc<File> suffices. And because of how we
designed async-std, this works for async_std::fs::File as well!
For simplicity, this post uses std::io::* examples instead of the futures-io
variants because they’re shorter. But since the async and sync IO traits are
closely related, everything in this post applies to both.
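To make the Arc<File> claim above concrete, here is a minimal sketch using only std. The helper name `write_from_threads` and the "line N" payload are made up for illustration; the only thing it relies on is that std implements Write for &File.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;
use std::sync::Arc;
use std::thread;

/// Writes one line per thread into the same file through a plain `Arc<File>`,
/// then returns the file's contents. `write_from_threads` is a hypothetical
/// helper used only for this illustration.
fn write_from_threads(path: &Path, threads: usize) -> io::Result<String> {
    let file = Arc::new(File::create(path)?);

    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let file = Arc::clone(&file);
            thread::spawn(move || -> io::Result<()> {
                // `Write` is implemented for `&File`, so a shared reference
                // is all we need; no `Mutex` is involved.
                let mut handle: &File = &file;
                handle.write_all(format!("line {}\n", i).as_bytes())
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("writer thread panicked")?;
    }
    fs::read_to_string(path)
}
```

The order of the lines in the file depends on thread scheduling; the point is only that every thread writes through its own clone of the Arc without any locking on our side.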
Problems with delegation
A simplified version of async-h1’s accept function has the following bounds:
fn accept<IO>(url: http_types::Url, io: IO) -> http_types::Result<()>
where
    IO: AsyncRead + AsyncWrite + Clone;
Each call to accept takes a url and some IO object that implements
Clone and async versions of Read and Write. The Clone bound is
important because it enables freely copying owned handles of the same value
within the function. This was the only way to enable this API under Rust’s
current borrowing rules [1].
[1] Once we get “streamable streams” / “non-overlapping streams” we might be able to express this without cloning at all because we can keep references alive in more cases. But that’s not scheduled to land anytime soon.
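To show why the Clone bound is convenient, here is a sync sketch with the same kind of bounds as accept (Read + Write + Clone instead of the async traits). Both `MemoryIo` and `echo_line` are hypothetical names invented for this example; they are not part of async-h1.

```rust
use std::io::{self, BufRead, BufReader, Cursor, Read, Write};
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory connection: reads come from `input`, writes are
/// appended to `output`. Cloning yields another handle to the same buffers.
#[derive(Clone)]
struct MemoryIo {
    input: Arc<Mutex<Cursor<Vec<u8>>>>,
    output: Arc<Mutex<Vec<u8>>>,
}

impl MemoryIo {
    fn with_input(bytes: &[u8]) -> Self {
        Self {
            input: Arc::new(Mutex::new(Cursor::new(bytes.to_vec()))),
            output: Arc::new(Mutex::new(Vec::new())),
        }
    }
}

impl Read for MemoryIo {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.input.lock().unwrap().read(buf)
    }
}

impl Write for MemoryIo {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.output.lock().unwrap().write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Reads one line and echoes it back, using two owned handles to the same IO.
fn echo_line<IO>(io: IO) -> io::Result<String>
where
    IO: Read + Write + Clone,
{
    // `BufReader::new` takes ownership of its reader, so we hand it a clone
    // and keep the original handle for writing.
    let mut reader = BufReader::new(io.clone());
    let mut writer = io;

    let mut line = String::new();
    reader.read_line(&mut line)?;
    writer.write_all(line.as_bytes())?;
    writer.flush()?;
    Ok(line)
}
```

Without Clone, the reader would own the only handle, and writing would have to go through the BufReader (for example via get_mut) rather than through a second owned handle.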
You might expect that if we wrap an IO type T in an Arc, the result would
implement Clone + Read + Write. But in reality it only implements
Clone + Deref<Target = T>. This means that if we want to access any of the
Read or Write functions we must first borrow the inner value as &T using &*:
let stream = TcpStream::connect("localhost:8080")?;
let stream = Arc::new(stream);
(&*stream).write(b"hello world")?; // OK: Write is implemented for &TcpStream.
stream.write(b"hello world")?;     // Error: Write is not implemented for Arc<TcpStream>.
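For a version of the snippet above that runs on its own, here is a sketch that talks to a loopback listener instead of localhost:8080. The function name `send_via_arc` and the loopback setup are assumptions made for the example.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::Arc;

/// Sends `msg` over a loopback connection through an `Arc<TcpStream>` and
/// returns the bytes the other side received. Hypothetical helper.
fn send_via_arc(msg: &[u8]) -> io::Result<Vec<u8>> {
    // Port 0 asks the OS for any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let stream = Arc::new(TcpStream::connect(listener.local_addr()?)?);
    let (mut peer, _addr) = listener.accept()?;

    // `*stream` is a `TcpStream`, `&*stream` is a `&TcpStream`, and
    // `Write` is implemented for `&TcpStream`.
    (&*stream).write_all(msg)?;

    // As described in the text, `stream.write_all(msg)` is the form that
    // gets rejected: it needs `&mut TcpStream`, which an `Arc` cannot hand
    // out.

    let mut received = vec![0u8; msg.len()];
    peer.read_exact(&mut received)?;
    Ok(received)
}
```

The parentheses matter: `&stream.write_all(msg)` would parse as `&(stream.write_all(msg))`, which borrows the result of the call rather than the stream.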
However, there’s an escape hatch here: we can create a wrapper type around
Arc<TcpStream> that implements Read + Write by dereferencing &T internally:
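use std::io::{self, Read, Write};
use std::net::TcpStream;
use std::sync::Arc;

/// Newtype around `Arc<TcpStream>` so that we can implement the IO traits on it.
#[derive(Clone)]
struct Stream(Arc<TcpStream>);

impl Read for Stream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // `&*self.0` is a `&TcpStream`, which implements `Read`.
        (&mut &*self.0).read(buf)
    }
}

impl Write for Stream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&mut &*self.0).write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        (&mut &*self.0).flush()
    }
}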
There are a few shortcomings here though: it only works for TcpStream instead
of being generic over the IO type, and it’s yet another wrapper type around
Arc. Wouldn’t it be much nicer if Arc exposed Read and Write without the need
for another wrapper type?
A better way forward
We implemented a proof of concept of conditional support for Read,
AsyncRead, Write, and AsyncWrite on Arc<T> as the io-arc
crate. The way it’s implemented is as follows:
use std::io::{self, Read, Write};
use std::sync::Arc;

/// A variant of `Arc` that delegates IO traits if available on `&T`.
#[derive(Debug)]
pub struct IoArc<T>(Arc<T>);

impl<T> IoArc<T> {
    /// Create a new instance of IoArc.
    pub fn new(data: T) -> Self {
        Self(Arc::new(data))
    }
}

impl<T> Read for IoArc<T>
where
    for<'a> &'a T: Read,
{
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        (&mut &*self.0).read(buf)
    }
}

impl<T> Write for IoArc<T>
where
    for<'a> &'a T: Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&mut &*self.0).write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        (&mut &*self.0).flush()
    }
}
Using higher-ranked trait bounds we’re able to implement Read for IoArc<T> if
&T: Read. This removes the need entirely for intermediate structs, and
allows working directly with an Arc-like construct using any IO type. This
should be possible to implement directly on Arc, which would remove the
need for wrapper types altogether.
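To see how this plays out at a call site, here is a self-contained sketch that passes an IoArc<TcpStream> to a function with Read + Write + Clone bounds. It redefines IoArc so it compiles on its own. The manual Clone impl, the `ping` function, and the loopback echo peer are assumptions made for the example, not a description of the published crate.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::Arc;
use std::thread;

/// Same shape as the `IoArc` above; redefined so this sketch stands alone.
pub struct IoArc<T>(Arc<T>);

impl<T> IoArc<T> {
    pub fn new(data: T) -> Self {
        Self(Arc::new(data))
    }
}

// Assumption: a manual `Clone` impl, so cloning never requires `T: Clone`.
impl<T> Clone for IoArc<T> {
    fn clone(&self) -> Self {
        Self(Arc::clone(&self.0))
    }
}

impl<T> Read for IoArc<T>
where
    for<'a> &'a T: Read,
{
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        (&mut &*self.0).read(buf)
    }
}

impl<T> Write for IoArc<T>
where
    for<'a> &'a T: Write,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        (&mut &*self.0).write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        (&mut &*self.0).flush()
    }
}

/// Hypothetical consumer with the same kind of bounds as `accept`.
fn ping<IO>(io: IO) -> io::Result<Vec<u8>>
where
    IO: Read + Write + Clone,
{
    let mut writer = io.clone();
    let mut reader = io;
    writer.write_all(b"ping")?;
    let mut reply = vec![0u8; 4];
    reader.read_exact(&mut reply)?;
    Ok(reply)
}

/// Runs `ping` against a loopback peer that echoes four bytes back.
fn ping_over_loopback() -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let echo = thread::spawn(move || -> io::Result<()> {
        let (mut peer, _) = listener.accept()?;
        let mut buf = [0u8; 4];
        peer.read_exact(&mut buf)?;
        peer.write_all(&buf)
    });
    let reply = ping(IoArc::new(TcpStream::connect(addr)?))?;
    echo.join().expect("echo thread panicked")?;
    Ok(reply)
}
```

Note that `ping` never mentions TcpStream or Arc: any T where &T implements Read and Write can be wrapped in IoArc and passed in.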
Are there any alternatives?
The way we solved this in async-std was to implement Clone for TcpStream
directly. This shouldn’t be a bottleneck in practice, but we haven’t done
this for other types such as File yet. Similarly, we can’t expect std to be
able to make the same tradeoffs. Having Arc<T> expose Read + Write where
available is superior in all regards.
What are the downsides?
All of the bounds in this post are entirely conditional, don’t introduce any
extra overhead, and don’t make use of unsafe anywhere. This should make
them a fairly uncontroversial candidate for inclusion in std – all they do
is make an existing pattern easier to use.
What’s next?
It’d be great if these bounds could be made part of std. I haven’t
contributed much to std yet, and am somewhat daunted by making my first
contribution. But maybe this could be a first contribution?
If these bounds get accepted in std for the std::io traits, they’d
make an excellent addition for the futures::io traits as well. And finally
it’d be great if we could support BufRead and AsyncBufRead. There’s an
open issue for it on the
repo, but we haven’t quite
figured it out yet.
Conclusion
In this post we’ve shown how Arc interacts with &T: Read + Write today, and
explained existing ways of working around its shortcomings. We’ve introduced
a novel approach to work around this and published
io-arc as a proof of concept of how these bounds
could be implemented as part of std.
All in all this seems like exactly the kind of quality of life improvement
that would make people’s lives easier. The trait bounds for conditional
detection of &T: Read / Write were incredibly tricky to write, but the
resulting usage is quite straightforward!
Thanks to llogiq for helping with the post and the library, and to stjepang for helping with the post and coming up with most of the ideas shared here.