Rust Temporary Lifetimes and "Super Let"
The lifetime of temporaries in Rust is a complicated but often ignored topic. In simple cases, Rust keeps temporaries around for exactly long enough, such that we don’t have to think about them. However, there are plenty of cases where we might not get exactly what we want, right away.
In this post, we (re)discover the rules for the lifetime of temporaries, go over a few use cases for temporary lifetime extension, and explore a new language idea, super let, to give us more control.
Temporaries
Here’s a Rust statement, with no context, which uses a temporary String:
f(&String::from('🦀'));
How long does this temporary String live?
If we were designing Rust today, we could pick between basically two options:
- The string gets dropped right away, before f is called. Or,
- The string will only be dropped after f is called.
If we went with option 1, the statement above would always result in a borrow checking error, since we can’t let f borrow something that’s already gone.
So, Rust went with option 2: the String is first allocated, then a reference to it is passed to f, and only after f returns will we drop the temporary String.
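To watch this ordering happen, here is a minimal sketch. The names Noisy, Log, f and statement_order are made up for this illustration (they are not from any library); Noisy is simply a value that records the moment it is dropped.

```rust
use std::cell::RefCell;

/// Shared event log; each entry records one step of the demo.
type Log = RefCell<Vec<String>>;

/// Hypothetical helper type that records when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Stand-in for `f`: borrows the temporary and records that it ran.
fn f(n: &Noisy<'_>) -> usize {
    n.log.borrow_mut().push("f runs".to_string());
    n.name.len()
}

fn statement_order() -> Vec<String> {
    let log = Log::default();
    // The temporary `Noisy` lives until the `;` of this statement.
    f(&Noisy { name: "temp", log: &log });
    log.borrow_mut().push("next statement".to_string());
    log.into_inner()
}
```

Calling statement_order() should give the entries "f runs", "drop temp" and "next statement", in that order: the temporary outlives the call to f, but not the statement.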
In a let statement
Now a slightly harder one:
let a = f(&String::from('🦀'));
…
g(&a);
Again: how long does the temporary String live?
- The string gets dropped at the end of the let statement: after f returns, but before g is called. Or,
- The string will be dropped at the same time as a, after g is called.
This time, option 1 might work, depending on the signature of f.
If f was defined as fn f(s: &str) -> usize (like str::len), then it’s perfectly fine to immediately drop the String after the let statement.
However, if f was defined as fn f(s: &str) -> &[u8] (like str::as_bytes), then a would borrow from the temporary String, so we’d get a borrow checking error if we kept the variable a around for longer.
With option 2, it’d compile fine in both cases, but we might keep a temporary around for much longer than necessary, which can waste resources or cause subtle bugs (e.g. a deadlock when a MutexGuard is dropped later than expected).
This sounds like we might want to go for a third option: make it depend on the signature of f.
However, Rust’s borrow checker only performs a check; it does not influence the behavior of the code.
This is a very important and useful property for a whole variety of reasons.
As one example, a change from fn f(s: &str) -> &[u8] (where the return value borrows the argument) to fn f(s: &str) -> &'static [u8] (where the return value does not borrow the argument) does not change anything at the call site, such as the point at which temporaries are dropped.
So, as a choice between only option 1 and 2, Rust went for option 1: drop the temporary at the end of the let statement.
It’s easy enough to manually move the String to a separate let statement to keep it around longer.
let s = String::from('🦀'); // Moved to its own `let` to give it a longer lifetime.
let a = f(&s);
…
g(&a);
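The difference between the inline temporary and the hoisted variable can be observed with a small sketch. Noisy, Log, f, g and the two demo functions are hypothetical names used only for this illustration; f here returns a usize (like str::len), so both versions compile.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<String>>;

/// Hypothetical helper type that records when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Like `str::len`: the result does not borrow from the argument.
fn f(n: &Noisy<'_>) -> usize {
    n.log.borrow_mut().push("f runs".to_string());
    n.name.len()
}

fn g(len: &usize, log: &Log) {
    log.borrow_mut().push(format!("g runs with {}", len));
}

fn inline_temporary() -> Vec<String> {
    let log = Log::default();
    let a = f(&Noisy { name: "temp", log: &log }); // temporary dropped at this `;`
    g(&a, &log);
    log.into_inner()
}

fn hoisted_into_let() -> Vec<String> {
    let log = Log::default();
    {
        let s = Noisy { name: "temp", log: &log }; // a named variable, not a temporary
        let a = f(&s);
        g(&a, &log);
    } // `s` is dropped here, at the end of its scope
    log.into_inner()
}
```

In inline_temporary the "drop temp" entry should appear before "g runs with 4"; in hoisted_into_let it should appear after it.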
In a nested call
Okay, one more:
g(f(&String::from('🦀')));
Again, two options:
- The string gets dropped after f is called, but before g is called. Or,
- The string will be dropped at the end of the statement, so after g is called.
The snippet is nearly identical to the previous one: a reference to the temporary String is passed to f, and its return value is passed to g. This time, though, it’s all expressed as a single statement, using a nested call expression.
It is still true that option 1 might or might not work depending on the signature of f, and that option 2 might keep the temporary alive for longer than necessary.
However, this time, option 1 would result in far more surprises to the programmer. For example, even something as simple as String::from('🦀').as_bytes().contains(&0x80) would not compile, because the string would be dropped after as_bytes (f), before contains (g).
It’s also arguable that there is far less harm in keeping the temporaries around a bit longer, since they’ll still be dropped at the end of the statement.
So, Rust went with option 2: regardless of the signature of f, the String is kept alive until the end of the statement, until after g is called.
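A small sketch of the nested case, again with made-up helper names (Noisy, Log, f, g, nested_order, crab_contains_0x80). Here f returns something that borrows from its argument, like str::as_bytes does, so the temporary really has to survive until g is done.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<String>>;

/// Hypothetical helper type that records when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Like `str::as_bytes`: the return value borrows from the argument.
fn f<'n>(n: &'n Noisy<'_>) -> &'n str {
    n.log.borrow_mut().push("f runs".to_string());
    n.name
}

fn g(s: &str, log: &Log) -> usize {
    log.borrow_mut().push("g runs".to_string());
    s.len()
}

fn nested_order() -> Vec<String> {
    let log = Log::default();
    // The temporary stays alive until the end of the whole statement.
    g(f(&Noisy { name: "temp", log: &log }), &log);
    log.borrow_mut().push("next statement".to_string());
    log.into_inner()
}

/// The example from the text: the crab's UTF-8 encoding is F0 9F A6 80.
fn crab_contains_0x80() -> bool {
    String::from('🦀').as_bytes().contains(&0x80)
}
```

nested_order() should report "f runs", "g runs", "drop temp", "next statement", and crab_contains_0x80() compiles and returns true.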
In an if statement
Now let’s move on to a simple if statement:
if f(&String::from('🦀')) {
    …
}
Same question: when is the String dropped?
- After the condition of the if is evaluated but before the body of the if is executed (that is, at the {). Or,
- After the body of the if (that is, at the }).
In this case, there is no reason to keep the temporary alive during the body of the if. The condition results in a boolean value (just true or false), which by definition doesn’t borrow anything.
So, Rust went for option 1.
An example where this is useful is when using Mutex::lock, which returns a temporary MutexGuard that will unlock the Mutex when it is dropped:
fn example(m: &Mutex<String>) {
    if m.lock().unwrap().is_empty() {
        println!("the string is empty!");
    }
}
Here, the temporary MutexGuard from m.lock().unwrap() is dropped right after .is_empty(), such that the Mutex will not unnecessarily stay locked during the println statement.
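One way to check this for ourselves is to try to take the lock again inside the body. The function name lock_is_free_in_if_body is made up for this sketch; Mutex::try_lock is the standard library method that returns an error instead of blocking when the lock is already held.

```rust
use std::sync::Mutex;

/// Returns true if the mutex could be locked again inside the `if` body.
fn lock_is_free_in_if_body(m: &Mutex<String>) -> bool {
    if m.lock().unwrap().is_empty() {
        // The guard from the condition is already gone here,
        // so a second (non-blocking) lock attempt succeeds.
        let free = m.try_lock().is_ok();
        free
    } else {
        false
    }
}
```

With an empty string inside the Mutex, this should return true, which shows that the body runs with the Mutex unlocked.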
In an if let statement
The situation is different for if let (and match) though, because then our expression doesn’t necessarily evaluate to a boolean:
if let … = f(&String::from('🦀')) {
    …
}
Once more, two options:
- The string is dropped after pattern matching, before the body of the if let (that is, at the {). Or,
- The string is dropped after the body of the if let (that is, at the }).
This time, there are reasons to go for option 2 rather than 1. It is quite common for a pattern in an if let statement or match arm to borrow something.
So, in this case, Rust went for option 2.
For example, if we have a vec of type Mutex<Vec<T>>, this compiles fine:
if let Some(x) = vec.lock().unwrap().first() {
    // The mutex is still locked here. :)
    // This is necessary, because we're borrowing `x` from the `Vec`. (`x` is a `&T`)
    println!("first item in vec: {x}");
}
We get a temporary MutexGuard from vec.lock().unwrap(), and use the .first() method to borrow the first element. This borrow lasts for the entire body of the if let, because the MutexGuard is only dropped at the final }.
However, there are also situations where this is not what we want. For example, if instead of first we use pop, which returns a value rather than a reference:
if let Some(x) = vec.lock().unwrap().pop() {
    // The mutex is still locked here. :(
    // This is unnecessary, because we don't borrow anything from the `Vec`. (`x` is a `T`)
    println!("popped item from the vec: {x}");
}
This can be surprising and result in subtle bugs or reduced performance.
Perhaps this is an argument that Rust picked the wrong option here, and maybe an argument for changing this in a future edition of Rust. See Niko’s blog post on this topic for thoughts on how these rules could be changed.
For now, the workaround is to use a separate let statement, to limit the temporary lifetime to that statement:
let x = vec.lock().unwrap().pop(); // The MutexGuard is dropped after this statement.
if let Some(x) = x {
    …
}
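Both behaviours can be observed with Mutex::try_lock, which fails instead of blocking while the lock is held. This is only a sketch; the two function names are made up for the illustration.

```rust
use std::sync::Mutex;

/// Pops inside the `if let` scrutinee: the guard lives through the body.
fn lock_free_inside_if_let(vec: &Mutex<Vec<i32>>) -> Option<bool> {
    let mut free = None;
    if let Some(_x) = vec.lock().unwrap().pop() {
        // The temporary MutexGuard is still alive, so this attempt fails.
        free = Some(vec.try_lock().is_ok());
    }
    free
}

/// Pops in a separate `let`: the guard is dropped at the end of that statement.
fn lock_free_with_separate_let(vec: &Mutex<Vec<i32>>) -> Option<bool> {
    let popped = vec.lock().unwrap().pop();
    let mut free = None;
    if let Some(_x) = popped {
        // No guard is alive anymore, so this attempt succeeds.
        free = Some(vec.try_lock().is_ok());
    }
    free
}
```

For a non-empty Vec, the first function should return Some(false) and the second Some(true); both return None once the Vec is empty.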
Temporary lifetime extension
How about this situation?
let a = &String::from('🦀');
…
f(&a);
Two options:
- The string gets dropped at the end of the let statement. Or,
- The string will be dropped at the same time as a, after f is called.
Option 1 will always result in a borrow checking error. So, option 2 probably makes more sense. And that’s indeed how Rust works today: the lifetime of the temporary is extended such that the snippet above compiles fine.
This phenomenon, of a temporary living longer than the statement it appears in, is called temporary lifetime extension.
Temporary lifetime extension does not apply to all temporaries that appear in a let statement, as we’ve already seen: the temporary string in let a = f(&String::from('🦀')); does not outlast the let statement.
In let a = &f(&String::from('🦀')); (note the extra &), temporary lifetime extension does apply to the outermost &, which borrows the temporary that is the return value of f, but not to the inner &, which borrows the temporary String.
For example, substituting str::len for f:
let a: &usize = &String::from('a').len();
Here, the String is dropped at the end of the let statement, but the usize returned by .len() lives as long as a.
This is not limited to just let _ = &…; syntax. For example:
let a = Person {
    name: &String::from('🦀'), // Extended!
    address: &String::from('🦀'), // Extended!
};
In the snippet above, the temporary strings will have their lifetime extended, because even without knowing anything about the Person type, we know for sure that lifetime extension is necessary for the resulting object to be usable afterwards.
The rules for which temporaries in a let statement get their lifetimes extended are documented in the Rust Reference, but effectively come down to those expressions where you can tell from just the syntax that extending the lifetime is necessary, independent of any types, function signatures, or trait implementations:
let a = &temporary().field; // Extended!
let a = MyStruct { field: &temporary() }; // Extended!
let a = &MyStruct { field: &temporary() }; // Both extended!
let a = [&temporary()]; // Extended!
let a = { …; &temporary() }; // Extended!
let a = f(&temporary()); // Not extended, because it might not be necessary.
let a = temporary().f(); // Not extended, because it might not be necessary.
let a = temporary() + temporary(); // Not extended, because it might not be necessary.
While this seems reasonable, it does result in surprises when we consider that the syntax for constructing a tuple struct or tuple variant is just a function call: Some(123) is, syntactically, a function call to the function Some.
For example:
let a = Some(&temporary()); // Not extended! (Because `Some` could have any signature...)
let a = Some { 0: &temporary() }; // Extended! (I bet you have never used this syntax.)
And that can be quite confusing. :(
This is one of the reasons why it might be worth revisiting the rules.
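The extended forms from this section can be observed with a drop-logging sketch. Noisy, Log, Holder, temporary and extension_demo are hypothetical names for this illustration only. The sketch sticks to forms that are extended under the rules described above, and makes no claim about the relative order in which the extended temporaries are dropped at the end of the block.

```rust
use std::cell::RefCell;

type Log = RefCell<Vec<String>>;

/// Hypothetical helper type that records when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a Log,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

struct Holder<'h, 'l> {
    field: &'h Noisy<'l>,
}

fn temporary<'l>(name: &'static str, log: &'l Log) -> Noisy<'l> {
    Noisy { name, log }
}

fn extension_demo() -> Vec<String> {
    let log = Log::default();
    {
        let a = &temporary("a", &log); // extended
        let b = Holder { field: &temporary("b", &log) }; // extended
        let c = { &temporary("c", &log) }; // extended (final expression of a block)
        // The method receiver is not extended; only the resulting usize is.
        let d: &usize = &temporary("d", &log).name.len();
        log.borrow_mut()
            .push(format!("using {} {} {} {}", a.name, b.field.name, c.name, d));
    } // the extended temporaries are dropped here
    log.into_inner()
}
```

The log should start with "drop d" (dropped at the end of its own let statement), followed by "using a b c 1", and only then the drops of a, b and c.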
Temporary Lifetime Extension in Blocks
Imagine we have some kind of Writer type that holds a reference to a File to write to:
pub struct Writer<'a> {
    pub file: &'a File
}
And some code that creates a Writer that writes to a newly created file:
println!("opening file...");
let filename = "hello.txt";
let file = File::create(filename).unwrap();
let writer = Writer { file: &file };
The scope now contains filename, file and writer. However, the code that follows should only write through the Writer. Ideally, filename and especially file would not be visible in the scope.
Because temporary lifetime extension also applies to the final expression of a block, we can achieve that as follows:
let writer = {
    println!("opening file...");
    let filename = "hello.txt";
    Writer { file: &File::create(filename).unwrap() }
};
Now, the creation of the Writer is neatly wrapped in its own scope, such that only writer is visible to the outer scope, and nothing else. Thanks to temporary lifetime extension, the File that was created as a temporary in the inner scope stays alive as long as writer.
Limitations of Temporary Lifetime Extension
Now imagine if we made the file field of the Writer struct private:
pub struct Writer<'a> {
    file: &'a File
}
impl<'a> Writer<'a> {
    pub fn new(file: &'a File) -> Self {
        Self { file }
    }
}
Then we wouldn’t need to change much in the original usage snippet:
println!("opening file...");
let filename = "hello.txt";
let file = File::create(filename).unwrap();
let writer = Writer::new(&file); // Only this line changed.
We’d just need to call Writer::new() instead of using Writer {} syntax for construction.
However, that wouldn’t work for the scoped version:
let writer = {
    println!("opening file...");
    let filename = "hello.txt";
    Writer::new(&File::create(filename).unwrap()) // Error: Does not live long enough!
};
writer.something(); // Error: File no longer alive here!
As we’ve seen before, while temporary lifetime extension propagates through Writer {} construction syntax, it doesn’t go through Writer::new() function call syntax. (Because the signature could be fn new(&File) -> Writer<'static> or fn new(&File) -> i32, for example, which wouldn’t need the temporary lifetime to be extended.)
Unfortunately, there is no way to explicitly opt in to temporary lifetime extension. We’ll have to put a let file in the outermost scope.
The best we can do, today, is to use delayed initialization:
let file;
let writer = {
    println!("opening file...");
    let filename = "hello.txt";
    file = File::create(filename).unwrap();
    Writer::new(&file)
};
But that brings the file back in the scope, which was what we were trying to prevent. :(
While it’s arguably not a big deal to have to put the let file on the outside of the scope, this workaround is not obvious to most Rust programmers. Delayed initialization is not a commonly used feature, and the compiler currently doesn’t suggest this workaround when giving a temporary lifetime error. And even if the compiler could, it’s not a trivial change to suggest.
It’d be nice to fix this, somehow.
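For anyone who wants to try the delayed initialization workaround without touching the file system, here is a self-contained sketch. FakeFile is a hypothetical in-memory stand-in for File, and the write and summary methods are made up for the illustration; only the shape of the let file; … pattern is the point.

```rust
use std::cell::RefCell;

/// Hypothetical in-memory stand-in for `std::fs::File`.
struct FakeFile {
    name: String,
    contents: RefCell<String>,
}

impl FakeFile {
    fn create(name: &str) -> FakeFile {
        FakeFile { name: name.to_string(), contents: RefCell::new(String::new()) }
    }
}

struct Writer<'a> {
    file: &'a FakeFile, // treated as private: construction goes through `Writer::new`
}

impl<'a> Writer<'a> {
    fn new(file: &'a FakeFile) -> Self {
        Writer { file }
    }

    fn write(&self, text: &str) {
        self.file.contents.borrow_mut().push_str(text);
    }

    fn summary(&self) -> String {
        format!("{}: {}", self.file.name, self.file.contents.borrow())
    }
}

fn delayed_initialization() -> String {
    let file; // declared in the outer scope, initialized inside the block
    let writer = {
        let filename = "hello.txt";
        file = FakeFile::create(filename);
        Writer::new(&file)
    };
    writer.write("hi");
    writer.summary()
}
```

This compiles because file lives in the outer scope, which is exactly the downside described above: the name file is visible to all the code that follows.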
Macros
It might be useful to have a function that both creates a file and returns a Writer to it. Something like:
let writer = Writer::new_file("hello.txt");
But, because Writer only borrows the File, that would require new_file to store that File somewhere. It could leak the File or store it in a static somehow, but there is (currently) no way it could make the File live as long as the returned Writer.
So, instead, let’s use a macro to define both the file and the writer wherever it is invoked:
macro_rules! let_writer_to_file {
    ($writer:ident, $filename:expr) => {
        let file = std::fs::File::create($filename).unwrap();
        let $writer = Writer::new(&file);
    };
}
Usage would look like this:
let_writer_to_file!(writer, "hello.txt");
writer.something();
Thanks to macro hygiene, file isn’t accessible in this scope.
This works, but wouldn’t it be much nicer if it looked more like a regular function call, as follows?
let writer = writer_to_file!("hello.txt");
writer.something();
As we’ve seen before, the way to create a temporary File that lives long enough inside a let writer = …; statement, is by using temporary lifetime extension:
macro_rules! writer_to_file {
    ($filename:expr) => {
        Writer { file: &File::create($filename).unwrap() }
    };
}
let writer = writer_to_file!("hello.txt");
This would expand to:
let writer = Writer { file: &File::create("hello.txt").unwrap() };
Which would extend the lifetime of the File temporary as necessary.
But we simply can’t do that if file isn’t public and we need to use Writer::new() instead. The macro would somehow need to be able to insert let file; before the let writer = …; statement in which it was invoked. That’s not possible.
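The statement-style let_writer_to_file macro from earlier in this section can be tried out with an in-memory stand-in. FakeFile, write and summary are hypothetical, made up for this sketch so that it doesn’t touch the file system; the macro body mirrors the one above. The caller deliberately has its own variable named file, to show that macro hygiene keeps the two apart.

```rust
use std::cell::RefCell;

/// Hypothetical in-memory stand-in for `std::fs::File`.
struct FakeFile {
    name: String,
    contents: RefCell<String>,
}

impl FakeFile {
    fn create(name: &str) -> FakeFile {
        FakeFile { name: name.to_string(), contents: RefCell::new(String::new()) }
    }
}

struct Writer<'a> {
    file: &'a FakeFile,
}

impl<'a> Writer<'a> {
    fn new(file: &'a FakeFile) -> Self {
        Writer { file }
    }

    fn write(&self, text: &str) {
        self.file.contents.borrow_mut().push_str(text);
    }

    fn summary(&self) -> String {
        format!("{}: {}", self.file.name, self.file.contents.borrow())
    }
}

macro_rules! let_writer_to_file {
    ($writer:ident, $filename:expr) => {
        let file = FakeFile::create($filename); // hygienic: invisible to the caller
        let $writer = Writer::new(&file);
    };
}

fn hygiene_demo() -> String {
    let file = "the caller's own variable";
    let_writer_to_file!(writer, "hello.txt");
    writer.write("hi");
    // `file` still names the caller's binding, not the one inside the macro.
    let summary = format!("{} / {}", file, writer.summary());
    summary
}
```

The macro has to be invoked as a statement of its own, which is the limitation discussed above: it cannot appear on the right-hand side of a let.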
format_args!()
This issue is also the reason why (today) the result of format_args!() can not be stored in a let statement:
let f = format_args!("{}", 1); // Error!
something.write_fmt(f);
The reason is that format_args!() expands to something like fmt::Arguments::new(&Argument::display(&arg), …), where some of the arguments are references to temporaries.
Temporary lifetime extension doesn’t apply to arguments to a function call, so the fmt::Arguments object can only be used within the same statement.
It’d be nice to fix this.
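Until then, the pattern that works is to consume the fmt::Arguments inside the statement that creates it, either directly or by passing it on to a function. A minimal sketch, where emit and render are made-up names and String is used as the output because it implements std::fmt::Write:

```rust
use std::fmt::{self, Write};

/// Hypothetical helper that accepts pre-built format arguments.
fn emit(out: &mut String, args: fmt::Arguments<'_>) {
    out.write_fmt(args).unwrap();
}

fn render(value: i32) -> String {
    let mut out = String::new();
    // The `fmt::Arguments` is created and consumed within one statement,
    // so its temporaries live long enough.
    emit(&mut out, format_args!("value = {}", value));
    out.write_fmt(format_args!(", doubled = {}", value * 2)).unwrap();
    out
}
```

Each format_args!() result never outlives its own statement here, so no lifetime extension is needed.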
pin!()
Another type that is often created through a macro is Pin. Roughly speaking, it represents a reference to something that will never be moved. (The exact details are complicated, but not very relevant right now.)
It is created through an unsafe function called Pin::new_unchecked, because you need to promise the value it references is never moved, even after the Pin itself is gone.
The best way to use this function is by making use of shadowing:
let mut thing = Thing { … };
let thing = unsafe { Pin::new_unchecked(&mut thing) };
Because the second thing shadows the first one, the first thing (which still exists) can no longer be named. Because it cannot be named, we can be sure that it cannot be moved (even after dropping the second thing), which is what we promised with our unsafe block.
Because this is a common pattern, it is often captured in a macro. For example, one might define a let_pin macro as follows:
macro_rules! let_pin {
    ($name:ident, $init:expr) => {
        let mut $name = $init;
        let $name = unsafe { Pin::new_unchecked(&mut $name) };
    };
}
Usage looks similar to the let_writer_to_file macro we had before:
let_pin!(thing, Thing { … });
thing.something();
This works, and nicely encapsulates and hides the unsafe code.
But, just like with our Writer example, wouldn’t it be much nicer if it worked as follows?
let thing = pin!(Thing { … });
As we know by now, we can only do this if we can make use of temporary lifetime extension to make the Thing live long enough. And that is only possible if we can construct the Pin with Pin {} syntax: Pin { pinned: &mut Thing { … } } invokes temporary lifetime extension, but Pin::new_unchecked(&mut Thing { … }) does not.
That would mean making the field of Pin public, which defeats the purpose of Pin. It can only provide any meaningful guarantees if the field is private.
This means that, unfortunately, it is impossible (today) to write such a pin!() macro yourself.
The standard library does it anyway, by committing a terrible crime: the “private” field of Pin is actually defined as pub, but also marked as “unstable”, which causes the compiler to complain if you try to use it.
It’d be nice to not need this hack.
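Both styles from this section can be compared side by side in a small sketch. Thing, its value field and the bump method are hypothetical, made up for the illustration; let_pin mirrors the macro shown above, and std::pin::pin! is the standard library macro.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical type that must not be moved once pinned.
struct Thing {
    value: u32,
    _pinned: PhantomPinned,
}

impl Thing {
    /// A method that can only be called through a pinned reference.
    fn bump(self: Pin<&mut Self>) -> u32 {
        // SAFETY: we only mutate a field in place; the value is never moved.
        let this = unsafe { self.get_unchecked_mut() };
        this.value += 1;
        this.value
    }
}

macro_rules! let_pin {
    ($name:ident, $init:expr) => {
        let mut $name = $init;
        // SAFETY: the shadowed original can no longer be named, so it cannot be moved.
        let $name = unsafe { Pin::new_unchecked(&mut $name) };
    };
}

fn with_let_pin() -> u32 {
    let_pin!(thing, Thing { value: 41, _pinned: PhantomPinned });
    thing.bump()
}

fn with_std_pin() -> u32 {
    let thing = std::pin::pin!(Thing { value: 41, _pinned: PhantomPinned });
    thing.bump()
}
```

Both functions should return 42. The second one reads like an ordinary let with a function call, which is the ergonomic difference this section is about.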
super let
We’ve now seen several cases where we are limited by the restrictive rules of temporary lifetime extension:
- Our failed attempt at keeping let writer = { … }; nicely scoped,
- Our failed attempt at making let writer = writer_to_file!(…); work,
- The inability to do let f = format_args!(…);, and
- The terrible hack required to make pin!() work nicely.
Each of these would have a nice solution if we had a way to opt in to lifetime extension.
What if we could write a special kind of let statement that makes things live a bit longer than a regular let statement? Kind of like a super power (or a let that defines something in the ‘super’ scope)? How about: super let?
I imagine it would work like this:
let writer = {
    println!("opening file...");
    let filename = "hello.txt";
    super let file = File::create(filename).unwrap();
    Writer::new(&file)
};
The super keyword would make the file live as long as the writer, that is, as long as the Writer that the surrounding block evaluates to.
The exact rules of how super let works still need to be worked out, but the main goal is that it allows a ‘desugaring’ of temporary lifetime extension:
let a = &temporary();
and
let a = { super let t = temporary(); &t };
should be equivalent.
This feature makes the pin!() macro possible to define without any hacks:
macro_rules! pin {
    ($init:expr) => {
        {
            super let mut pinned = $init;
            unsafe { Pin::new_unchecked(&mut pinned) }
        }
    };
}
let thing = pin!(Thing { … });
Similarly, it’d allow the format_args!() macro to use super let for its temporaries, such that the result can be stored as part of a let a = format_args!() statement.
UX and Diagnostics
Having both let and super let with slightly different semantics might not sound like a great idea. It solves some problems, especially with macros, but is it really worth the potential confusion when trying to understand the difference between let and super let?
I think yes, as long as we make sure the compiler emits suggestions for both adding and removing super from let wherever possible.
Imagine that someone writes:
let output: Option<&mut dyn Write> = if verbose {
    let mut file = std::fs::File::create("log")?;
    Some(&mut file)
} else {
    None
};
Today, this results in the following error:
error[E0597]: `file` does not live long enough
  --> src/main.rs:16:14
   |
14 |     let output: Option<&mut dyn Write> = if verbose {
   |         ------ borrow later stored here
15 |         let mut file = std::fs::File::create("log")?;
   |             -------- binding `file` declared here
16 |         Some(&mut file)
   |              ^^^^^^^^^ borrowed value does not live long enough
17 |     } else {
   |     - `file` dropped here while still borrowed
While it’s relatively clear what the problem is, this doesn’t really provide them with a solution. I’ve often had Rust programmers come to me with a similar example, resulting in me explaining the delayed initialization pattern, resulting in this solution:
let mut file;
let output: Option<&mut dyn Write> = if verbose {
    file = std::fs::File::create("log")?;
    Some(&mut file)
} else {
    None
};
This solution is not obvious to many Rust programmers, perhaps because it feels so weird to leave file uninitialized in one branch.
Wouldn’t it have been much nicer if the error would have been the following instead?
error[E0597]: `file` does not live long enough
  --> src/main.rs:16:14
   |
15 |         let mut file = std::fs::File::create("log")?;
   |             --------
   |
help: try using `super let`
   |
15 |         super let mut file = std::fs::File::create("log")?;
   |         +++++
Even without knowing much about “super let” or its exact semantics, this provides the programmer with a clear and simple solution to their problem, teaching them that super makes things live longer.
Similarly, when unnecessarily using super let, the compiler should suggest removing it:
warning: unnecessary use of `super let`
  --> src/main.rs:16:14
   |
15 |         super let mut file = std::fs::File::create("log")?;
   |         ^^^^^ help: remove this
   |
   = note: `file` would live long enough with a regular `let`
I believe that these diagnostics would make the super let feature usable for all Rust programmers, even when they never encountered it before. Together with the increased ergonomics of macros like pin and format_args, I think super let results in a net win in user (programmer) experience.
Temporary Lifetimes 2024 RFC
A few months ago, I shared my idea for super let with Niko Matsakis and Ding Xiang Fei, who were excited about a way to ‘desugar’ temporary lifetime extension. They have been working hard on the exact definition and detailed rules of super let, together with a few new rules for temporary lifetimes for the next Rust edition.
This combined “temporary lifetimes 2024” effort is leading up to an RFC, which basically proposes reducing the temporary lifetimes where possible, to prevent a deadlock caused by a temporary MutexGuard in an if let or match, and adding super let as a way to opt in to longer lifetimes.
Feedback
Have you ever consciously used temporary lifetime extension? Or have you been bitten by it?
What do you think of super let? Would you use it? Or do you have a better idea?
Let me know in the comments below or on GitHub, or join the discussion on Reddit, Twitter, or Mastodon.