Nine Patches
2020-04-07
Contributing to rust-lang/rust
has been a long-standing desire of mine.
It’s a central piece of software that has enabled a lot of my work, and being able
to contribute directly back to it felt like something I wanted to do [1].
[1] Contributing in this context leans mostly towards
“contributing code”. That’s not the only kind of contribution possible
though; the Scuttlebutt community has actively
been trying to shift the thinking around “contributions” to include everyone
in the process: people drafting issues, managing issues, organizing meetings,
reviewing, and writing code. I believe they keep a spreadsheet somewhere to
properly credit folks for their contributions. But yeah in my case, I wanted
to contribute patches to rust-lang/rust
because it seemed like fun.
Because rust-lang/rust
is such a big project (100,000+ commits, 2500+
contributors) I felt I wanted to be super careful about any contributions. So
my first attempt to contribute was over a year ago and I tried to contribute
the Num::leading_ones
and Num::trailing_ones
APIs. I drafted an
RFC,
then tried submitting an
implementation, but life got
in the way and I didn’t see it through to completion.
In January of this year someone else went ahead and landed a PR for the same feature and that inspired me. So last weekend I had some time to spare, and figured I’d go ahead and try sending in my first patch! It’s been 4 days since, and I’ve now sent through 9 patches. Not all are merged, but I figured it’d be fun to talk about them still!
1. Fix a broken link
#70730. I noticed a broken link in the docs, and figured fixing it would both be easy and guaranteed to land. A perfect first patch! So I set out to check out the compiler, got it to build, fixed the docs and submitted a patch. And lo and behold: it got merged in under a day!
2. Arc::{incr,decr}_strong_count
#70733.
std::sync::Arc
is an
“Atomic Reference Counter” that’s provided by the stdlib. A common way to use
it is to wrap some structure you want to share between threads in it, and call
clone
on it to get owned copies that point to the same structure.
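As a minimal sketch of that common pattern (the function name `sum_on_threads` and the sample data are made up for illustration), each thread gets its own clone of the `Arc`, and all clones point at the same vector:

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: sums a shared vector on `workers` threads and
/// returns the per-thread sums plus the strong count left afterwards.
fn sum_on_threads(workers: usize) -> (Vec<u32>, usize) {
    let shared = Arc::new(vec![1_u32, 2, 3]);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the Arc bumps the reference count; the data is not copied.
            let local = Arc::clone(&shared);
            thread::spawn(move || local.iter().sum::<u32>())
        })
        .collect();

    let sums = handles.into_iter().map(|h| h.join().unwrap()).collect();

    // Every clone was dropped when its thread finished, so only `shared` remains.
    (sums, Arc::strong_count(&shared))
}
```

Calling `sum_on_threads(4)` runs four threads over the same allocation.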
However there’s another use for it: to create an atomic reference counter for
use in other structures. One place where this comes in useful is
when manually constructing
Waker
s: one of the
core building blocks used in async/await
. Each waker needs to implement
Clone
, and encapsulates a
RawWaker
.
Because of reasons the best way to use an Arc
for these kinds of cases is
to access it through a pointer. But unfortunately Arc
doesn’t provide
manual ways to access its internal counters so a lot of code uses tricks to
increment / decrement counters through pointers:
// Increment the strong reference count
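// ManuallyDrop keeps `arc` from decrementing the count again when it goes out of scope
let arc = mem::ManuallyDrop::new(Arc::<W>::from_raw(ptr));
mem::forget(Arc::clone(&*arc));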
// Decrement the strong reference count
mem::drop(Arc::from_raw(ptr));
This is not ideal for many reasons: it’s easy to introduce bugs such as double frees, which lead to memory unsafety. Having methods that change the reference counts without those risks would be desirable. And @mark-simulacrum came up with a really nice design for this:
// Increment the strong reference count
Arc::incr_strong_count(ptr);
// Decrement the strong reference count
Arc::decr_strong_count(ptr);
This allows incrementing / decrementing the Arc refcounts through pointers, which is a big win over manually trying to do things with atomics, or manually trying to mutate the refcounts.
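For readers on a recent toolchain: a version of this API later shipped in std under the longer names `Arc::increment_strong_count` and `Arc::decrement_strong_count` (both `unsafe`). The sketch below uses those names rather than the `incr`/`decr` spelling from the patch, and the function name `refcount_roundtrip` is made up for illustration:

```rust
use std::sync::Arc;

/// Hypothetical helper: bumps and lowers the strong count through a raw
/// pointer and reports the count observed after each step.
fn refcount_roundtrip() -> (usize, usize) {
    let five = Arc::new(5);
    // Hand ownership over to a raw pointer; the strong count stays at 1.
    let ptr = Arc::into_raw(five);

    // Safety: `ptr` came from `Arc::into_raw` and the count never reaches
    // zero while we still use it.
    unsafe {
        Arc::increment_strong_count(ptr);

        // Reclaim one owned handle so the count can be observed.
        let five = Arc::from_raw(ptr);
        let after_incr = Arc::strong_count(&five);

        Arc::decrement_strong_count(ptr);
        let after_decr = Arc::strong_count(&five);

        (after_incr, after_decr)
    }
}
```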
3. Add slice::fill
#70752. This patch adds a way
to uniformly assign a value to a region of memory in Rust. A while ago I was
working with Ryan Levick on remem
and
during one of the experiments we noticed that there was no convenient API to
write zeroes into a memory region. What we ended up doing was a for
loop
and writing values one-by-one [2].
[2] There are many more considerations around zeroing memory. I
implore you to check out zeroize
to see some of the other considerations beyond basic convenience and
performance. I do hope someday we’ll see parts of it come over into std.
This patch was merged a few days ago, and is now available on nightly. So basically what this provides is that instead of doing this:
let mut buf = vec![0; 10];
for val in &mut buf {
*val = 1;
}
assert_eq!(buf, vec![1; 10]);
You can now write this:
let mut buf = vec![0; 10];
buf.fill(1);
assert_eq!(buf, vec![1; 10]);
It comes with the added benefit of using
Clone::clone_from
under the hood which enables memory reuse in certain cases. Also in the case
of writing u8
or u16
it optimizes down to a single
memset(3)
call!
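Two more uses, sketched under the assumption of a toolchain where `slice::fill` is available (the function name `fill_demo` is made up): filling only part of a buffer through a sub-slice, and filling with a `Clone` type such as `String`.

```rust
/// Hypothetical helper showing `fill` on a sub-slice and on a non-Copy type.
fn fill_demo() -> (Vec<u8>, Vec<String>) {
    // Only the middle four bytes are overwritten.
    let mut bytes = vec![0_u8; 8];
    bytes[2..6].fill(0xFF);

    // `fill` works for any `T: Clone`, not just integers.
    let mut names = vec![String::new(); 3];
    names.fill(String::from("rust"));

    (bytes, names)
}
```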
4. Add conversion from Arc for Waker.
The idea was that using the new Wake
trait, instead of having to write this:
// With the Wake trait
use std::future::Future;
use std::sync::Arc;
use std::task::{Wake, Waker};
use crossbeam::sync::Unparker;
struct BlockOnWaker {
unparker: Unparker,
}
impl Wake for BlockOnWaker {
fn wake(self: Arc<Self>) {
self.unparker.unpark();
}
}
fn block_on<F: Future>(future: F) -> F::Output {
let unparker = ...;
let waker = Waker::from(Arc::new(BlockOnWaker { unparker }));
...
}
In basic cases you could just write this:
// With this API
use std::task::Waker;
fn block_on<F: Future>(future: F) -> F::Output {
let unparker = ...;
let waker = Waker::from(Arc::new(move || unparker.unpark()));
...
}
But for real-world use this is not great, and so we decided it was probably better to not merge it.
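For reference, here is a runnable sketch of the longer `Wake`-trait version using only std. It swaps crossbeam's `Unparker` for `std::thread::park` / `unpark`, which is an assumption made to keep the example self-contained; `ThreadWaker` is a made-up name, and this is not a production-grade executor.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker that unparks the thread that is blocking on the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drives a future to completion on the current thread.
fn block_on<F: Future>(future: F) -> F::Output {
    // Box::pin gives us a pinned future without any unsafe code.
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);

    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Sleep until the waker unparks us, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}
```

With this in place, `block_on(async { 1 + 2 })` evaluates the async block on the calling thread.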
5. Add BufRead::read_while
#70772. I recently got
interested in parsers, and wrote about my
experience. One
of the APIs that came out of this was BufRead::read_while
, which forms a
counterpart to BufRead::read_until
. The idea is that we can read bytes into
a buffer as long as a “predicate” (a closure) returns true:
use std::io::{self, BufRead};
let mut cursor = io::Cursor::new(b"lorem-ipsum");
let mut buf = vec![];
cursor.read_while(&mut buf, |b| b != b'-')?;
assert_eq!(buf, b"lorem");
This patch hasn’t been merged yet, and there’s a question about whether
Read::take_while
could perhaps be a better fit. But overall I’m happy we’re
talking about ways to make reading byte streams easier, which in turn will
make it easier to write streaming parsers.
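To make the idea concrete, here is a rough sketch of the same behavior as a free function built on `BufRead::fill_buf` and `BufRead::consume`. The signature is an assumption for illustration and is not necessarily what the patch implements; it also skips the retry-on-interrupt handling a std implementation would likely want.

```rust
use std::io::{self, BufRead};

/// Hypothetical free-function version: appends bytes to `buf` for as long as
/// `predicate` returns true, and returns how many bytes were read.
/// The first byte that fails the predicate is left unread.
fn read_while<R, P>(reader: &mut R, buf: &mut Vec<u8>, mut predicate: P) -> io::Result<usize>
where
    R: BufRead,
    P: FnMut(u8) -> bool,
{
    let mut read = 0;
    loop {
        let available = reader.fill_buf()?;
        if available.is_empty() {
            // End of input.
            return Ok(read);
        }

        // Length of the prefix that still satisfies the predicate.
        let take = available
            .iter()
            .position(|&b| !predicate(b))
            .unwrap_or(available.len());
        let done = take < available.len();

        buf.extend_from_slice(&available[..take]);
        reader.consume(take);
        read += take;

        if done {
            return Ok(read);
        }
    }
}
```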
6. Add ready! macro
When writing futures by hand (or streams, or async read and write impls) one
of the core building blocks is
Poll
. This enum
behaves the same as Option
, but instead of talking about “some” and
“none” we talk about “ready” and “pending”.
Because of reasons Try
isn’t implemented directly on Poll
, so simply
doing ?
on it doesn’t work to return early. Instead the whole ecosystem
relies on a Poll
equivalent of the try!
macro: ready!
. This 5-line macro
is relied on by every single project that implements futures, and is even
part of futures-core
.
This patch introduces it to stdlib, so after that the only API left to
include from futures-core
would be Stream
.
use core::task::{Context, Poll};
use core::future::Future;
use core::pin::Pin;
async fn get_num() -> usize {
42
}
pub fn do_poll(cx: &mut Context<'_>) -> Poll<()> {
let mut f = get_num();
let f = unsafe { Pin::new_unchecked(&mut f) };
let num = ready!(f.poll(cx));
// ... use num
Poll::Ready(())
}
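For a sense of how small the macro is, here is a sketch of a locally defined `ready!` that mirrors the shape used in the ecosystem, along with a safe way to try it out. The helper names `poll_doubled`, `NoopWaker` and `noop_waker` are made up for illustration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Local stand-in for the macro: unwrap `Poll::Ready`, return early on `Poll::Pending`.
macro_rules! ready {
    ($e:expr $(,)?) => {
        match $e {
            Poll::Ready(value) => value,
            Poll::Pending => return Poll::Pending,
        }
    };
}

/// Hypothetical helper: polls `fut` once and doubles its output when ready.
fn poll_doubled<F>(fut: Pin<&mut F>, cx: &mut Context<'_>) -> Poll<u32>
where
    F: Future<Output = u32>,
{
    let n = ready!(fut.poll(cx));
    Poll::Ready(n * 2)
}

/// A waker that does nothing; just enough to build a `Context` by hand.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWaker))
}
```

Polling a future that is already complete yields `Poll::Ready` with the doubled value, while a future that never completes makes `poll_doubled` return `Poll::Pending` straight from the macro.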
7. Removed outdated comments
#70824. While investigating
how to add ready!
to std I found some outdated comments that had likely
been moved around by some tooling. This patch removed them since they
didn’t really serve any purpose.
8. Add future::{pending, ready}
#70834. This patch adds two
APIs to the future
submodule: future::ready
and future::pending
. Much
like std::iter::once
and
std::iter::empty
they
allow quickly constructing a type with a certain behavior. future::ready
immediately returns a value. And future::pending
never returns a value.
This is particularly useful for tests. Especially since both types implement
Unpin
which means we don’t need to worry about Pin
in examples. So the
example we saw in the last section about the ready
macro could be written
without using unsafe
:
use core::task::{Context, Poll};
use core::future::{self, Future};
use core::pin::Pin;
pub fn do_poll(cx: &mut Context<'_>) -> Poll<()> {
let mut fut = future::ready(42_u8);
let num = ready!(Pin::new(&mut fut).poll(cx));
// ... use num
Poll::Ready(())
}
9. Add Integer::{log, log2, log10}
#70835. This was a particularly
fun patch set because I got to work with a friend! A while back Substack
tweeted they wish
they had Integer::log2
in the stdlib because implementing it by hand
requires a lot of care.
Now that I felt somewhat confident submitting patches to rust-lang/rust
I
figured this would be a fun, somewhat out-there PR to attempt. We ended up
working together, with me handling submitting the PR and integrating it into
Rust, and Substack making the implementation work.
What this patch enables is to be able to do the following:
assert_eq!(1_000_u16.log10(), 3);
Which doesn’t look like much; but if you look at the implementation there is
a lot going on to make this work. This is useful because it allows us to
calculate logarithms for much higher numbers than if we had to convert ints
to floats and back, and by providing an arbitrary-base log
method it also
allows bypassing subtle precision bugs.
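To illustrate why staying in integer arithmetic avoids float rounding, here is a deliberately naive sketch that computes the floored logarithm by repeated division. The name `int_log` is made up, and this is not the algorithm from the patch, which does considerably more work. Later versions of std ship similar functionality under the names `ilog`, `ilog2` and `ilog10`.

```rust
/// Hypothetical helper: floor(log_base(n)) using only integer division.
/// Returns None where the logarithm is undefined (n == 0 or base < 2).
fn int_log(mut n: u64, base: u64) -> Option<u32> {
    if n == 0 || base < 2 {
        return None;
    }
    let mut exponent = 0;
    // Each division by `base` strips one power of the base.
    while n >= base {
        n /= base;
        exponent += 1;
    }
    Some(exponent)
}
```

Because no floats are involved, values right at a power of the base (such as 1000 in base 10) cannot be rounded to the wrong side, and the full `u64` range is supported.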
Conclusion
The past 4 days have been a lot of fun. I’m happy I got to implement the patches that I did, and excited some of them have even landed on nightly.
Generally the experience submitting patches was positive. Getting up and running took some work (hi cmake, python, x.py). But once I was a few patches in I mostly knew how to go about it.
This has been an overview of me submitting a few patches. It was fun!