# Uninit Read/Write

2021-12-07
The `AsyncRead` and `AsyncWrite` traits are async versions of the `Read` and `Write` traits in Rust. They’re core to async Rust, providing the interface to read and write bytes from, for example, the filesystem and the network. But both the async and non-async variants have an open issue: how can we use these traits to write data into uninitialized memory?
In this post we look at the problem of reading into uninitialized memory using `Read`/`Write`, and at a possible solution we could introduce to solve it.
## Showing the problem
Both async and non-async Rust share the same issue for the `Read` and `Write` traits, so let’s use the non-async `Read` trait as our example. It’s defined as follows:
```rust
pub trait Read {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
}
```
Calling `Read::read` will read bytes into a mutable slice of memory, and return an `io::Result` containing either the number of bytes read, or an error. Usage typically looks something like this:
```rust
// open a file and init the buffer
let mut f = File::open("foo.txt")?;
let mut buffer = vec![0; 1024];

// read up to 1024 bytes
let read = f.read(&mut buffer)?;
let data = &buffer[..read];
```
This will read up to 1024 bytes of the file into the buffer. But it comes with a slight inefficiency: we’re writing zeroes into the buffer, and then immediately writing more data over them. This means we’re doing a double-write, which is not as efficient as it could be (report). Ideally we’d be able to do the following:
```rust
// open a file and reserve capacity for the buffer, but don't initialize it
let mut f = File::open("foo.txt")?;
let mut buf = Vec::with_capacity(1024);

// read up to 1024 bytes
let read = f.read(&mut buf)?;
let data = &buf[..read];
```
But this doesn’t work. Even though the capacity is 1024, we haven’t initialized the vector, so it’s as if we passed an empty slice, and the following assertion will always hold:
```rust
assert_eq!(data.len(), 0);
```
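This behavior is easy to verify on stable Rust; here’s a minimal sketch, with the file read replaced by direct inspection of the vector, since only the `Deref` behavior matters:

```rust
fn main() {
    let buf: Vec<u8> = Vec::with_capacity(1024);

    // `with_capacity` reserves at least the requested space...
    assert!(buf.capacity() >= 1024);

    // ...but `Deref<Target = [u8]>` only exposes the initialized prefix,
    // which is `len` bytes long -- and `len` is still 0. Any call to
    // `read(&mut buf)` therefore sees an empty slice.
    assert_eq!(buf.len(), 0);
    assert!(buf.as_slice().is_empty());

    println!("len = {}, capacity = {}", buf.len(), buf.capacity());
}
```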
And this is the problem this post is about: our slices can’t contain uninitialized memory, and our vectors, which can contain uninitialized memory, are always dereferenced into slices first. In order to support writing into uninitialized memory through `Read` bounds we need to resolve this.
## Solution 1: uninitialized slices
The current thinking is that we can solve this issue by introducing “slices of uninitialized memory” and extending the IO traits with methods dedicated to taking these slices. This was proposed and subsequently merged as part of RFC 2930: read_buf (tracking issue).
The RFC is worth a read if you want a closer look at the problem space. But the gist of the solution it proposes is as follows:
- Add a new `ReadBuf` type which wraps `[MaybeUninit::<u8>::uninit(); N]`.
- Add an optional `read_buf` method which takes `&mut ReadBuf<'_>` instead of `&mut [u8]`.
- Users who want to read into uninitialized memory can implement and use `read_buf`.
The RFC provides the following usage example:
```rust
let mut buf = [MaybeUninit::<u8>::uninit(); 8192];
let mut buf = ReadBuf::uninit(&mut buf);

loop {
    some_reader.read_buf(&mut buf)?;
    if buf.filled().is_empty() {
        break;
    }
    process_data(buf.filled());
    buf.clear();
}
```
With the following default implementation of `Read::read_buf`, intended to be overridden by implementers:
```rust
impl Read for MyReader {
    fn read_buf(&mut self, buf: &mut ReadBuf<'_>) -> io::Result<()> {
        let n = self.read(buf.initialize_unfilled())?;
        buf.add_filled(n);
        Ok(())
    }
}
```
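To make the shape of this API concrete, here’s a minimal sketch of a `ReadBuf`-like type on stable Rust. The name `ReadBufSketch` is mine; the real `ReadBuf` additionally tracks an “initialized” watermark so re-zeroing can be skipped, while this sketch always zero-fills for simplicity:

```rust
use std::mem::MaybeUninit;

/// Minimal sketch of an RFC 2930-style buffer: wraps possibly-uninitialized
/// memory and tracks how much of it has been filled with data.
struct ReadBufSketch<'a> {
    buf: &'a mut [MaybeUninit<u8>],
    filled: usize,
}

impl<'a> ReadBufSketch<'a> {
    fn uninit(buf: &'a mut [MaybeUninit<u8>]) -> Self {
        Self { buf, filled: 0 }
    }

    /// Zero-initialize the unfilled part and hand it out as `&mut [u8]`.
    fn initialize_unfilled(&mut self) -> &mut [u8] {
        for slot in self.buf[self.filled..].iter_mut() {
            slot.write(0);
        }
        // SAFETY: every byte past `filled` was just zeroed above.
        unsafe { &mut *(&mut self.buf[self.filled..] as *mut [MaybeUninit<u8>] as *mut [u8]) }
    }

    /// Record that `n` more bytes have been written into the buffer.
    fn add_filled(&mut self, n: usize) {
        assert!(self.filled + n <= self.buf.len());
        self.filled += n;
    }

    /// The initialized, filled prefix.
    fn filled(&self) -> &[u8] {
        // SAFETY: bytes up to `filled` have been initialized.
        unsafe { &*(&self.buf[..self.filled] as *const [MaybeUninit<u8>] as *const [u8]) }
    }
}

fn main() {
    let mut storage = [MaybeUninit::<u8>::uninit(); 8];
    let mut buf = ReadBufSketch::uninit(&mut storage);

    let dst = buf.initialize_unfilled();
    dst[..5].copy_from_slice(b"hello");
    buf.add_filled(5);

    assert_eq!(buf.filled(), b"hello");
}
```

A reader implementation would then call `initialize_unfilled`, write into the returned slice, and record progress with `add_filled`, just like the RFC’s default implementation above does.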
This achieves the goal it sets out to achieve: we can successfully read data into uninitialized memory. However, for users it requires a little more work to set up:
```rust
// Read into uninit memory.
let mut buf = [MaybeUninit::<u8>::uninit(); 1024];
let mut buf = ReadBuf::uninit(&mut buf);
file.read_buf(&mut buf)?;
let data = buf.filled();

// Read into init memory.
let mut buf = [0; 1024];
let read = file.read(&mut buf)?;
let data = &buf[..read];
```
Overall this is not too bad, and moves us in the right direction. In order to write bytes to the heap instead of the stack I believe you can wrap the buffer in a `Box::new` [^1] [^2]. The main downsides are that it requires a dedicated type just for this purpose, and it’s slightly more verbose.

[^1]: I believe that with placement new (the `box` keyword) this might save yet another copy.

[^2]: Or as maybewaffle pointed out: this could be used with `Vec::spare_capacity_mut` as well.
## Solution 2: uninitialized vectors
Since August 2020, the `Vec::spare_capacity_mut` method has been available on nightly. This returns the remaining spare capacity of the vector as a slice of `MaybeUninit<T>`. The docs show the following usage example:
```rust
let mut v = Vec::with_capacity(10);
let uninit = v.spare_capacity_mut();
uninit[0].write(0);
uninit[1].write(1);
uninit[2].write(2);

// Mark the first 3 elements of the vector as being initialized.
unsafe {
    v.set_len(3);
}
assert_eq!(&v, &[0, 1, 2]);
```
This API fills much the same role as the `ReadBuf` type did in the first solution, but without requiring a new type to be introduced. Usage of uninitialized vectors becomes a lot closer to the current `Read::read` behavior:
```rust
// Read into uninit memory.
let mut buf = Vec::with_capacity(1024);
let read = file.read_buf(&mut buf)?;
let data = &buf[..read];

// Read into init memory.
let mut buf = [0; 1024];
let read = file.read(&mut buf)?;
let data = &buf[..read];
```
Implementers of “leaf” IO types (e.g. `File`) could use this method to implement a reader which reads directly into uninitialized memory. The default implementation of `read_buf` could become something like this:
```rust
impl Read for MyReader {
    fn read_buf(&mut self, buf: &mut Vec<u8>) -> io::Result<usize> {
        // zero-fill the uninitialized spare capacity
        for byte in buf.spare_capacity_mut() {
            byte.write(0);
        }
        // SAFETY: all bytes up to `capacity` have just been initialized.
        unsafe { buf.set_len(buf.capacity()) };

        // read the data into the buffer
        self.read(buf)
    }
}
```
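As a sanity check that this shape works end to end, here’s a runnable sketch of the same pattern as a free function. The name `read_to_capacity` is hypothetical, `io::Cursor` stands in for a file, and `Vec::spare_capacity_mut` is assumed to be available (it’s nightly-only at the time of writing):

```rust
use std::io::{self, Read};

/// Hypothetical helper: read from `reader` into the spare capacity of
/// `buf`, zero-filling it first (the safe double-write fallback).
fn read_to_capacity<R: Read>(reader: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    let start = buf.len();
    // zero-fill the uninitialized spare capacity
    for slot in buf.spare_capacity_mut() {
        slot.write(0);
    }
    // SAFETY: every byte up to `capacity` was just initialized.
    unsafe { buf.set_len(buf.capacity()) };

    let n = reader.read(&mut buf[start..])?;
    // shrink back down so `len` only covers the bytes actually read
    buf.truncate(start + n);
    Ok(n)
}

fn main() -> io::Result<()> {
    let mut reader = io::Cursor::new(b"hello world".to_vec());
    let mut buf = Vec::with_capacity(1024);
    let n = read_to_capacity(&mut reader, &mut buf)?;
    assert_eq!(&buf[..n], b"hello world");
    println!("read {} bytes", n);
    Ok(())
}
```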
It’s worth pointing out that at this point the name `read_buf` doesn’t convey the intent particularly well anymore. A name such as `read_to_capacity` might more closely match our intent. To my knowledge, “read into the uninitialized bytes of a `Vec`, but don’t grow beyond its capacity” doesn’t have any precedent in the stdlib. The closest I can think of is `Vec::fill`, but it’s not quite it. If we choose to use this API, method naming and documentation is something we should take a closer look at.
Even reading into the stack becomes possible if we use `Vec::with_capacity_in` backed by a stack allocator. The ergonomics of that aren’t ideal yet, but that’s expected to significantly improve over time [^3].

[^3]: I know folks are actively looking at the ergonomics of this, and am excited for some of the designs I’ve seen.
As we mentioned, the main benefit of this approach is that it doesn’t introduce a new type and as such maps nicely to existing patterns. Compared to the first solution we have better ergonomics when reading into the heap, and slightly worse ergonomics when reading into the stack. Overall I think this approach is promising, and I prefer it over the first solution.
## Solution 3: specialization
Disclaimer: I’m by no stretch an authority on dynamic dispatch or specialization. It might be that I got details wrong, or misunderstood limitations. This section is “best effort”, and I’m happy to update it if it turns out I got something wrong.
Going back to our earlier example, an ideal solution would be if instead of defining a new `read_buf` method we could keep using the `read` method:
```rust
// Read into uninit memory.
let mut buf = Vec::with_capacity(1024);
let read = file.read(&mut buf)?; // note: `read`, not `read_buf`
let data = &buf[..read];
```
The way to do this in Rust would be using specialization. Specialization is still considered “unstable”, and has not yet been fully formed. So don’t expect the code below to work anytime soon, if ever. But if we squint a little, we could imagine a trait could be defined as follows:
```rust
pub trait Read {
    // The "default" implementation.
    default fn read(&mut self, buf: &mut [u8]) -> Result<usize>;

    // Implementation specialized for `&mut Vec<u8>`.
    fn read(&mut self, buf: &mut Vec<u8>) -> Result<usize> {
        // zero-fill the uninitialized spare capacity
        for byte in buf.spare_capacity_mut() {
            byte.write(0);
        }
        // SAFETY: all bytes up to `capacity` have just been initialized.
        unsafe { buf.set_len(buf.capacity()) };

        // read the data into the buffer
        self.read(buf.as_mut_slice())
    }
}
```
In this example we define the trait `Read` with a “default” implementation for `&mut [u8]`, and a built-in specialization for `&mut Vec<u8>`. The specialization needs to be built directly into the trait definition, to ensure that `&mut Vec<u8>` will never default to being interpreted as a zero-length slice. Trait implementers would be expected to provide their own specialization for `&mut Vec<u8>`, to make use of the ability to write directly into uninitialized memory.
Another thing that’s unclear about specialization is its interaction with semver. In our example we’re now using `Vec::capacity` rather than `Vec::len` to determine how many bytes to write into the vector. All observable changes in behavior have the potential to be breaking, so any modification to this behavior should always use crater to determine impact. But it doesn’t seem like we would break any intended use of the APIs.
At first glance this interface might seem ideal. It’s the same interface that just works depending on what’s passed. But as remarked in RFC 2930: read_buf, this approach has issues:
- It must be compatible with `dyn Read`. Trait objects are used pervasively in IO code, so a solution can’t depend on monomorphization or specialization.
Further elaboration on this point exists in this
document
and
this interview.
My interpretation of these materials is: “specialization is not dyn safe, so if we add a specialization, `dyn Read` would no longer be possible”. That would mean that if `Read` specialized as we showed, the following code would fail to compile:
```rust
// This alias would cause an error because `Read` is not dyn safe.
type MyType = Box<dyn Read>;
```
Intuitively I would’ve expected this to be possible if we wanted it to be, even if it’s currently not implemented in the compiler. But my understanding of `dyn` is limited compared to that of the folks who’ve written the docs, so I trust there are good reasons why they marked it as a non-option.
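For contrast, here’s a quick sketch of the shape that does stay dyn safe: a method taking the concrete `&mut Vec<u8>` type rather than a generic or specialized parameter. The trait and type names here are hypothetical:

```rust
use std::io;

// Because `read_to_capacity` mentions no generics and no specialization,
// this trait remains object safe and can be used as a trait object.
trait ReadToCapacity {
    fn read_to_capacity(&mut self, buf: &mut Vec<u8>) -> io::Result<usize>;
}

/// A toy reader which produces an endless stream of zeroes.
struct Zeroes;

impl ReadToCapacity for Zeroes {
    fn read_to_capacity(&mut self, buf: &mut Vec<u8>) -> io::Result<usize> {
        let n = buf.capacity() - buf.len();
        buf.resize(buf.capacity(), 0);
        Ok(n)
    }
}

fn main() -> io::Result<()> {
    // This is exactly the boxing that a specialization-based design
    // would rule out.
    let mut reader: Box<dyn ReadToCapacity> = Box::new(Zeroes);
    let mut buf: Vec<u8> = Vec::with_capacity(4);
    let n = reader.read_to_capacity(&mut buf)?;
    assert!(n >= 4); // `with_capacity` may over-allocate
    assert!(buf.iter().all(|&b| b == 0));
    Ok(())
}
```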
That said, the constraints around `dyn` in Rust are currently being reworked in order to enable things like an `AsyncIterator` with an `async fn next` to work. Which makes me wonder: is the interaction between `dyn` and specialization something which can be reworked as well?
I asked my colleague Sy whether combining dynamic dispatch with specialization is possible in C++, and it appears it is! They walked me through this C++ example, which combines dynamic dispatch based on the class it’s called on with overloading on whether a `vec` or a `span` (slice) is passed. Even though the code is very different from Rust, the functionality matches how we would expect it to work.
If we think about the wider implications of this, it would not be great if specialization and dynamic dispatch inherently remained mutually exclusive. Both are incredibly useful, and if they can’t be used together that seems incredibly limiting.
## Summary
In this post we’ve shown three approaches to reading into uninitialized memory in Rust. As I mentioned throughout the post: it’d be great if we could make reading into uninitialized memory as close as possible to reading into initialized memory. And I think the second and third approaches in particular have a lot of potential.
I was motivated to write this post after conversations with nrc last week. I remembered reading Amanieu’s comment about `spare_capacity_mut` on the tracking issue for RFC 2930, and started thinking of what it could look like if we used that instead of `ReadBuf`.
In terms of timelines, I don’t believe we’re under great pressure to land the features described in this post in the stdlib. The performance benefits can be significant under certain workloads; but as we’re collectively moving towards completion-based IO, traits such as `AsyncBufRead` will become more relevant. Because these traits manage buffers internally, uninitialized memory never needs to cross trait boundaries, and this post doesn’t apply to them.
I hope I was able to provide some insight on how we can enable reading into uninitialized memory for (async) IO traits. If you liked this post and would like to see my cats, you can follow me on Twitter.
Thanks to Sy Brand for showing me how dynamic dispatch and specialization interact in C++, and nrc for helping review this post.