The concurrency features that are included in the Rust standard library are quite similar to what was available in C++11: threads, atomics, mutexes, condition variables, and so on. In the past few years, however, C++ has gained quite a few new concurrency-related features as part of C++17 and C++20, with more proposals still coming in for future versions.
Let’s take some time to review C++ concurrency features, discuss what their Rust equivalent could look like, and what it’d take to get there.
C++20 introduced std::atomic_ref.
It’s a type that allows you to use a non-atomic object as an atomic one.
For example, you can create an atomic_ref<int> that references a regular int,
allowing you the same functionality as if it were an atomic<int>.
While in C++ this needed a whole new type that duplicates most of the atomic interface,
the equivalent Rust feature is a single function: Atomic*::from_mut.
This function allows you to convert, for example, a &mut u32 to a &AtomicU32,
which is a form of aliasing that’s perfectly sound in Rust.
In C++, the atomic_ref type comes with safety requirements that you need to uphold manually.
As long as you’re using an atomic_ref to access an object, all access to that object must be through an atomic_ref.
Accessing it directly when there’s still an atomic_ref results in undefined behavior.
In Rust, however, this is already fully taken care of by the borrow checker.
The compiler understands that by borrowing the u32 mutably,
nothing is allowed to access that u32 directly until that borrow ends.
The lifetime of the &mut u32 that goes into the from_mut function is preserved as part of the &AtomicU32 you get out of it.
You can make as many copies of that &AtomicU32 as you want,
but the original borrow only ends once all copies of that reference are gone.
The from_mut function is currently unstable, but perhaps it’s time we stabilize it.
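To make the borrowing argument concrete, here is a minimal sketch for stable Rust. The helper `atomic_from_mut` is hypothetical: it only imitates the conversion described above with an unsafe cast, and it is not the standard library’s implementation or signature. It relies on the documented guarantee that `AtomicU32` has the same size and alignment as `u32`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Hypothetical stand-in for the conversion described in the text.
/// The returned reference borrows from `value`, so the `u32` cannot be
/// accessed directly until every copy of the `&AtomicU32` is gone.
fn atomic_from_mut(value: &mut u32) -> &AtomicU32 {
    // SAFETY: `AtomicU32` has the same size and alignment as `u32`, and the
    // exclusive borrow rules out non-atomic access while the result is alive.
    unsafe { &*(value as *mut u32 as *const AtomicU32) }
}

fn count_in_parallel() -> u32 {
    let mut counter: u32 = 0;
    let atomic = atomic_from_mut(&mut counter);
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..1000 {
                    atomic.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // The borrow has ended here, so `counter` is a plain `u32` again.
    counter
}
```

Reading `counter` while `atomic` is still in use afterwards would be rejected by the borrow checker, which is the manual rule that C++ leaves to the programmer.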
Generic atomic type
In C++, std::atomic is generic: you can have an atomic<int>, but also an atomic of a user-defined type.
In Rust, on the other hand, we only have specific atomic types, such as AtomicU32 and AtomicU64.
C++’s atomic type supports objects of any size, regardless of what the platform supports.
It automatically falls back to a lock-based implementation for objects of a size that is not supported by the platform’s native atomic operations.
On the other hand, Rust only provides the types that are natively supported by the platform.
If you’re compiling for a platform that does not have 64 bit atomics,
AtomicU64 does not exist.
This has advantages and disadvantages.
It means Rust code using
AtomicU64 might fail to compile for certain platforms,
but it also means no performance related surprises when some types silently fall back to a very different implementation.
It also means we can assume an AtomicU64 is represented exactly the same as a u64 in memory,
allowing for functions like AtomicU64::from_mut.
Having a generic Atomic<T> in Rust that works for types of any size can be tricky.
Without specialization, we can’t make Atomic<LargeThing> include a Mutex while leaving it out of an Atomic<T> for a small T.
What we could do, however, is to store the mutexes in a global HashMap, indexed by memory address.
That way, an Atomic<T> can be identical in size to a T, and use a Mutex from this global hash map when necessary.
This is exactly what the popular
atomic crate does.
A proposal for adding such a universal Atomic<T> type to the Rust standard library would need to discuss
whether it should be usable in no_std programs.
A HashMap requires allocation, which isn’t possible in no_std programs.
A fixed-size table could work for no_std programs, but might be undesirable for various reasons.
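As a rough illustration of the fixed-size table variant, the sketch below keeps a small global array of mutexes and picks one based on the address of the atomic. Everything here is an assumption made for illustration: the `Atomic<T>` type, the table size of 16, and the address hash are hypothetical, and this is not how the atomic crate is actually implemented.

```rust
use std::cell::UnsafeCell;
use std::sync::{Mutex, MutexGuard};

// Fixed-size global lock table (the size 16 is an arbitrary choice).
const LOCK: Mutex<()> = Mutex::new(());
static LOCKS: [Mutex<()>; 16] = [LOCK; 16];

/// Hypothetical lock-based `Atomic<T>`: same size as `T`, no embedded mutex.
#[repr(transparent)]
pub struct Atomic<T> {
    value: UnsafeCell<T>,
}

// SAFETY: every access to `value` happens while holding the lock for its address.
unsafe impl<T: Copy + Send> Sync for Atomic<T> {}

impl<T: Copy> Atomic<T> {
    pub fn new(value: T) -> Self {
        Self { value: UnsafeCell::new(value) }
    }

    // Pick a lock from the global table based on this object's address.
    fn lock(&self) -> MutexGuard<'static, ()> {
        let index = (self as *const Self as usize / 8) % LOCKS.len();
        LOCKS[index].lock().unwrap_or_else(|e| e.into_inner())
    }

    pub fn load(&self) -> T {
        let _guard = self.lock();
        unsafe { *self.value.get() }
    }

    pub fn store(&self, new: T) {
        let _guard = self.lock();
        unsafe { *self.value.get() = new }
    }
}

fn demo_atomic() -> [u64; 4] {
    let a = Atomic::new([0u64; 4]);
    std::thread::scope(|s| {
        s.spawn(|| a.store([1, 2, 3, 4]));
    });
    a.load()
}
```

Unrelated atomics that hash to the same slot contend on the same lock, which is one of the reasons such a table might be undesirable.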
Compare-exchange with padding
P0528R3 changes how compare_exchange deals with padding.
A compare-exchange operation on an atomic<TypeWithPadding> used to compare the padding bits as well,
but that turned out to be a bad idea.
Nowadays, padding bits are no longer included in the comparison.
Since Rust currently only provides atomic types for integers, without any padding, this change is irrelevant for Rust.
However, a proposal for an Atomic<T> with a compare_exchange method would need to discuss how padding is handled,
and should probably take input from this proposal.
Compare-exchange memory ordering
Originally, C++’s compare_exchange functions required the success memory ordering to be at least as strong as the failure ordering.
For example, compare_exchange(…, …, memory_order_release, memory_order_acquire) was not accepted.
This requirement was copied verbatim to Rust’s compare_exchange functions.
P0418R2 argued that this restriction should be lifted, which happened as part of C++17.
The same restriction was lifted in Rust 1.64, through rust-lang/rust#98383.
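For illustration, this is what the previously rejected combination looks like in Rust. The function name `release_acquire_cas` is made up for this sketch, and it assumes Rust 1.64 or later.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

fn release_acquire_cas() -> (Result<u32, u32>, Result<u32, u32>) {
    let a = AtomicU32::new(0);
    // Success ordering `Release`, failure ordering `Acquire`:
    // not accepted before Rust 1.64, fine since.
    let first = a.compare_exchange(0, 1, Ordering::Release, Ordering::Acquire);
    // The value is now 1, so this attempt fails and reports the current value.
    let second = a.compare_exchange(0, 2, Ordering::Release, Ordering::Acquire);
    (first, second)
}
```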
constexpr Mutex constructor
std::mutex has a constexpr constructor,
which means it can be constructed as part of constant evaluation at compile time.
However, not all implementations actually provide this.
For example, Microsoft’s implementation of std::mutex doesn’t include a constexpr constructor.
So, relying on this is a bad idea for portable code.
Also, interestingly, C++’s
constexpr constructor at all.
The Mutex in Rust 1.0 did not include a const fn new.
Combined with Rust’s strict requirements for static initialization,
this made the Mutex quite annoying to use in a static.
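Since Rust 1.63, Mutex::new is a const fn, so a mutex can be placed in a static directly. The sketch below shows this; the names `COUNTER` and `bump_from_threads` are only illustrative.

```rust
use std::sync::Mutex;
use std::thread;

// `Mutex::new` is usable in a static initializer since Rust 1.63,
// so no lazy initialization helper is needed.
static COUNTER: Mutex<u32> = Mutex::new(0);

fn bump_from_threads(threads: u32) -> u32 {
    let before = *COUNTER.lock().unwrap();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                *COUNTER.lock().unwrap() += 1;
            });
        }
    });
    // Report how much the global counter grew during this call.
    *COUNTER.lock().unwrap() - before
}
```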
Latches and barriers
P1135R6 introduced, among other things, std::latch and std::barrier to C++20.
Both are types that allow waiting for several threads to reach a certain point.
A latch is basically just a counter that gets decremented by each thread and allows you to wait for it to reach zero.
It can only be used once.
A barrier is a more advanced version of this idea that can be reused, and accepts a “completion function”
to be automatically executed when the counter reaches zero.
Rust has had a similar Barrier type since 1.0.
It was inspired by pthread (pthread_barrier_t) rather than C++.
Rust’s (and pthread’s) barrier is less flexible than what’s now included in C++.
It only has a “decrement and wait” operation (called wait),
and lacks the “only wait”, “only decrement”, and “decrement and drop” functions that C++’s std::barrier comes with.
On the other hand, unlike C++, Rust’s (and pthread’s) “decrement and wait” operation assigns one thread to be the group leader. This is a (perhaps more flexible) alternative to a completion function.
The missing operations on the Rust version could easily be added at any point. All we need is a good proposal for the names of these new methods. :)
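The group leader mechanism can be seen in a small sketch: every thread calls wait(), and exactly one of them gets a result for which is_leader() returns true. The function name `count_leaders` is made up for this example.

```rust
use std::sync::Barrier;
use std::thread;

fn count_leaders(threads: usize) -> usize {
    let barrier = Barrier::new(threads);
    thread::scope(|s| {
        // Spawn all threads first; each one blocks until all have arrived.
        let handles: Vec<_> = (0..threads)
            .map(|_| s.spawn(|| barrier.wait().is_leader()))
            .collect();
        // Count how many threads were told they are the leader.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .filter(|&is_leader| is_leader)
            .count()
    })
}
```

The leader thread can then run whatever a C++ completion function would have run, without the barrier needing to know about it.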
Rust does not have a general semaphore type,
although it does equip every single thread with what’s effectively a binary semaphore,
through thread::park() and unpark().
A semaphore can be easily constructed manually using a Mutex<u32> and a Condvar,
but most operating systems allow for a more efficient and smaller implementation using a single atomic integer.
For example, through futex() on Linux and WaitOnAddress() on Windows.
It depends on the operating system and its version which sizes of atomics can be used for these operations.
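A minimal sketch of the manual construction, built from a Mutex<u32> and a Condvar, could look as follows. The `Semaphore` type and its method names are hypothetical, and this is the simple lock-based version rather than the more efficient futex-based one.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore sketch (hypothetical type, not part of `std`).
pub struct Semaphore {
    permits: Mutex<u32>,
    available: Condvar,
}

impl Semaphore {
    pub const fn new(permits: u32) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is available, then takes it.
    pub fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Returns a permit and wakes one waiting thread, if any.
    pub fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Runs `threads` workers and reports the highest number that ran at once.
fn max_concurrency(threads: usize, permits: u32) -> usize {
    let semaphore = Semaphore::new(permits);
    let running = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                semaphore.acquire();
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                running.fetch_sub(1, Ordering::SeqCst);
                semaphore.release();
            });
        }
    });
    peak.load(Ordering::SeqCst)
}
```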
C++’s counting_semaphore is a template that takes an integer as argument to indicate how far we want to be able to count.
For example, a counting_semaphore<1000> can count up to at least 1000, and will therefore be 16 bit or larger.
The binary_semaphore type is just an alias for counting_semaphore<1>, and can be a single byte on some platforms.
In Rust, we’re probably not quite ready for this kind of generic type any time soon. Rust’s generics force a certain kind of consistency that puts some limitations on what we can do with constants as generic arguments.
We could have separate types per size (Semaphore64, and so on), but that seems a bit overkill.
Something like Semaphore<u64> and perhaps even Semaphore<bool> could be possible,
but that is something we haven’t done before in the standard library.
Our atomic types are simply named by size: AtomicU64, and so on.
As mentioned above, for our atomic types, we only provide the ones that are natively supported by the platform you’re compiling for.
If we were to apply the same philosophy to Semaphore, it wouldn’t exist on platforms that don’t have a futex()-like function, such as macOS.
And if we had separate semaphore types for different sizes, some sizes wouldn’t exist on (some versions of) Linux and various BSDs.
If we want a standard semaphore type in Rust, we’d first need some input on
whether we actually need semaphores of different sizes, and what form of flexibility and portability would be necessary to make them useful.
Perhaps we should go with just a single 32-bit
Semaphore type that’s always available (using a lock-based fallback),
but any such proposal would have to include a detailed explanation of use cases and limitations.
Atomic wait and notify
In C++20, the atomic wait and notify operations are available on atomics of all sizes, on all platforms, regardless of what the operating system supports.
Linux futexes (before FUTEX2) are always 32 bit, but C++ allows for
atomic<uint64_t>::wait just fine.
One way of doing this is to use something resembling a “parking lot”:
effectively a global HashMap that maps memory addresses to locks and queues.
That means that a 32 bit wait operation on Linux could use the very fast futex based implementation,
while the other sizes would use a very different implementation.
If we were to follow the philosophy of only providing the types and functions that are natively supported
(like we do for the atomic types), we wouldn’t provide such a fallback implementation.
That’d mean we only have AtomicU32::wait (and AtomicI32::wait) on Linux,
while all atomic types would include this wait method on Windows.
A proposal for Atomic*::wait and Atomic*::notify in Rust would need to include
a discussion on whether a fallback to a global table is desirable in Rust or not.
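To give an idea of what such a fallback involves, here is a simplified sketch of wait and notify for a 64-bit atomic, backed by a small global table of lock and queue pairs. The free functions `wait` and `notify_all`, the table size, and the address hash are all assumptions for illustration. A real parking lot is considerably more refined.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Condvar, Mutex};
use std::thread;

// Global table of lock + queue pairs, indexed by the address of the atomic.
const BUCKET: (Mutex<()>, Condvar) = (Mutex::new(()), Condvar::new());
static TABLE: [(Mutex<()>, Condvar); 8] = [BUCKET; 8];

fn bucket(a: &AtomicU64) -> &'static (Mutex<()>, Condvar) {
    &TABLE[(a as *const AtomicU64 as usize / 8) % TABLE.len()]
}

/// Blocks for as long as `a` still holds `current` (hypothetical function).
fn wait(a: &AtomicU64, current: u64) {
    let (lock, queue) = bucket(a);
    let mut guard = lock.lock().unwrap();
    while a.load(Ordering::SeqCst) == current {
        guard = queue.wait(guard).unwrap();
    }
}

/// Wakes every thread waiting in the bucket of `a`; waiters re-check the value.
fn notify_all(a: &AtomicU64) {
    let (lock, queue) = bucket(a);
    // Taking the lock ensures no waiter is between its check and its sleep.
    let _guard = lock.lock().unwrap();
    queue.notify_all();
}

fn wait_for_update() -> u64 {
    let value = AtomicU64::new(0);
    thread::scope(|s| {
        s.spawn(|| {
            value.store(7, Ordering::SeqCst);
            notify_all(&value);
        });
        // Returns once the other thread has changed the value.
        wait(&value, 0);
    });
    value.load(Ordering::SeqCst)
}
```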
jthread and stop_token
If we ignore the
stop_token for a second,
jthread is basically just a regular std::thread that automatically gets join()’ed on destruction.
This avoids accidentally detaching a thread and letting it run for longer than expected, which might happen with a regular std::thread.
However, it also introduces a potential new pitfall: immediately destructing a
jthread object will immediately join the thread, effectively removing any potential parallelism.
As of Rust 1.63.0, we have scoped threads.
Just like a jthread, a scoped thread is automatically joined.
However, the point before which they are joined is made explicit, and is a guarantee that can be relied upon for safety.
The borrow checker even understands this guarantee, allowing you to safely borrow local variables in the scoped thread(s), as long as those variables
outlive the scope.
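A short sketch of this: two scoped threads borrow parts of a slice owned by the caller, and both are guaranteed to be joined before thread::scope returns. The function name `sum_halves` is made up for this example.

```rust
use std::thread;

fn sum_halves(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both threads borrow from `data`, which outlives the scope.
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}
```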
In addition to automatically joining, a main feature of jthreads is their stop_token and corresponding stop_source.
One can call request_stop() on a stop_source to make the corresponding stop_requested() method on the stop_token return true.
This can be used to nicely ask the thread to please stop, and is automatically done in the destructor of
jthread before joining.
It’s up to the code of the thread to actually check the token and stop if it was set.
So far, it almost looks like a plain AtomicBool.
Where things get very different is the stop_callback.
This type allows registering a callback, a “stop function”, with a stop token.
Requesting a stop using the corresponding stop source will execute this function.
Effectively, a thread can use this to let others know how to stop or cancel its work.
In Rust, we could easily add the AtomicBool-like functionality to the Scope object of thread::scope.
Something like an is_finished(&self) -> bool or stop_requested(&self) -> bool that indicates whether the main scope function is finished might suffice.
Maybe combined with a request_stop(&self) method to request it from anywhere.
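Until something like that exists, the same effect can be approximated by hand with an AtomicBool next to a scoped thread. This is only a sketch of the pattern: the flag named `stop_requested` is a local variable here, not an existing method on Scope.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::thread;

fn run_until_stopped() -> u64 {
    let stop_requested = AtomicBool::new(false);
    let iterations = AtomicU64::new(0);
    thread::scope(|s| {
        s.spawn(|| {
            // The worker has to check the flag itself, as with a stop_token.
            while !stop_requested.load(Ordering::Relaxed) {
                iterations.fetch_add(1, Ordering::Relaxed);
                thread::yield_now();
            }
        });
        // Let the worker make some progress, then ask it to stop.
        while iterations.load(Ordering::Relaxed) < 100 {
            thread::yield_now();
        }
        stop_requested.store(true, Ordering::Relaxed);
    });
    iterations.load(Ordering::Relaxed)
}
```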
The stop_callback feature is more complicated, and any Rust equivalent would probably need a detailed proposal discussing its interface, use cases and limitations.
P0020R6 adds support for atomic floating point addition and subtraction to C++20.
It’d be easy to add an AtomicF64 to Rust as well,
but it seems that the only platforms that natively support atomic floating point operations
are some GPUs that are not supported by Rust (yet?).
A proposal to add these types to Rust would have to present some compelling use cases.
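Without native hardware support, such a type would most likely be built on an integer atomic and a compare-exchange loop. The sketch below shows one possible shape. The `AtomicF64` type here is hypothetical and stores the bits of the f64 in an AtomicU64.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical `AtomicF64`, storing the bits of an `f64` in an `AtomicU64`.
pub struct AtomicF64(AtomicU64);

impl AtomicF64 {
    pub fn new(value: f64) -> Self {
        Self(AtomicU64::new(value.to_bits()))
    }

    pub fn load(&self, order: Ordering) -> f64 {
        f64::from_bits(self.0.load(order))
    }

    /// Adds `delta` using a compare-exchange loop, returning the previous value.
    pub fn fetch_add(&self, delta: f64, order: Ordering) -> f64 {
        let previous = self
            .0
            .fetch_update(order, Ordering::Relaxed, |bits| {
                Some((f64::from_bits(bits) + delta).to_bits())
            })
            .unwrap(); // The closure always returns `Some`, so this cannot fail.
        f64::from_bits(previous)
    }
}

fn parallel_sum() -> f64 {
    let total = AtomicF64::new(0.0);
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                for _ in 0..100 {
                    total.fetch_add(0.5, Ordering::Relaxed);
                }
            });
        }
    });
    total.load(Ordering::Relaxed)
}
```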
Atomic per byte memcpy
Currently, it’s not possible to efficiently implement sequence locks in Rust or C++ in a way that abides by all the rules of the memory model.
P1478R7 proposes to add an atomic per-byte memcpy to a future version of C++ to solve this issue.
For Rust, I wrote a proposal to expose the functionality through an AtomicPerByte<T> type: RFC 3301.
P0718R2 added specializations for atomic<shared_ptr> and atomic<weak_ptr> to C++20.
Reference counted pointers (shared_ptr in C++, Arc in Rust) are quite commonly used for concurrent lock-free data structures.
The atomic<shared_ptr> specialization makes it easier to do this correctly, by handling the reference count properly.
In Rust, we could add an equivalent AtomicArc type.
(AtomicArc sounds a bit weird maybe, considering the A in Arc already stands for “atomic”. :) )
A C++ shared_ptr<T> is nullable, while in Rust that requires an Option<Arc<T>>.
It’s not immediately clear whether AtomicArc<T> should be nullable, or whether we should also have a separate nullable variant.
The arc-swap crate already provides all these variants in Rust,
but, as far as I know, there hasn’t been any proposal yet to add anything similar to the standard library.
P0290R2 was not accepted, but proposed a type called synchronized_value<T>, which combines a mutex with a T.
Even though it wasn’t accepted at that time into C++, it’s an interesting proposal, because
synchronized_value<T> is pretty much exactly what a
Mutex<T> is in Rust.
In C++, a
std::mutex does not contain the data it protects, nor does it even know what it is protecting at all.
This means that it is the responsibility of the user to remember which data is protected and by which mutex,
and ensure the right mutex is locked every time “protected” data is accessed.
Rust’s Mutex<T> design with a MutexGuard behaving like a (mutable) reference to T allows for much more safety,
while still allowing for a Mutex<()> in cases where you need only a mutex, without any data directly attached to it.
The proposal for
synchronized_value<T> was an attempt at adding this pattern to C++, but used closures instead of a mutex guard, since C++ doesn’t track lifetimes.
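For comparison, this is what the guard-based pattern looks like in Rust. The data lives inside the mutex and can only be reached through the guard returned by lock(). The function name `collect_ids` is made up for this sketch.

```rust
use std::sync::Mutex;
use std::thread;

fn collect_ids(threads: u32) -> Vec<u32> {
    // The data lives inside the mutex; it can only be reached through a guard.
    let ids = Mutex::new(Vec::<u32>::new());
    thread::scope(|s| {
        for id in 0..threads {
            let ids = &ids;
            s.spawn(move || {
                let mut guard = ids.lock().unwrap();
                guard.push(id); // The guard behaves like a `&mut Vec<u32>`.
            }); // The guard is dropped at the end of the closure, unlocking the mutex.
        }
    });
    // All threads are done, so the mutex can be consumed to get the data back.
    let mut ids = ids.into_inner().unwrap();
    ids.sort();
    ids
}
```

Forgetting to lock is not possible here: without a guard, there is simply no way to name the Vec.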
It seems to me that C++ can continue to be a source of inspiration for Rust, although we should take care not to copy-paste ideas directly.
As we’ve seen with
Mutex<T>, scoped threads,
Atomic*::from_mut and others, things can often take a very different (often more ergonomic) shape in Rust
while providing the same functionality.
Providing the exact same functionality as C++ shouldn’t be a primary goal. The goal should be to provide exactly what the Rust ecosystem needs from the language and standard library, which might be different than what C++ users need from their language.
If you have concurrency needs from the Rust standard library that we currently don’t fulfill, I’d love to hear from you, regardless of whether it’s something that’s already solved in another language or not.