Path Generics in Rust: A Sketch Proposal for Simplicity and Generality
The Rust programming language is best-known for its memory-related type system features that encode ownership and borrowing: these ensure memory safety (no dangling pointers), and also enforce a kind of "mutual exclusion" discipline that allows for provably safe parallelism. It's fantastic stuff; but it can also be utterly maddening when one attempts to twist the borrow checker in a direction it doesn't want to go.
In a recent blog post, Niko Matsakis described an ambitious (but in my opinion very achievable) vision for perfecting the "borrow checker within": a series of well-considered generalizations and features that make Rust's lifetime and borrowing system more ergonomic and a better fit for a wider variety of usage patterns.
In this post, my goal is to tie an analogy from the "place-based lifetimes" in that post and an idea I proposed in 2020 called "deferred borrows" (paper1). I'll describe how I see path generics (one might call them "place generics" in analogy with place-based lifetimes) as a new idea in Rust that could allow place-based lifetimes, but also when generalized sufficiently, allow significant new expressivity in terms of branded types -- that is, one value in one place (say, a handle) irrevocably tied to another (say, a container). Hopefully all of this will become clear soon.
Note also that this is very much not my day job: I work in Rust, not on Rust -- so take these as the semi-amateur scribblings that they are, for inspiration or discussion material at best; but I have no plans at the moment to try to push this further, beyond writing up the ideas as they exist in my head, with the hope they might be interesting.
Recap: Places as Lifetimes
In Rust today, memory is managed primarily by an "ownership tree" idiom built around RAII and linear types; some types -- structs and standard-library containers -- can own other types in turn. Unique ownership that is tied to program scopes with RAII is extremely useful because, by itself, it guarantees memory safety: we always know when to free memory, and there is no way to access it later because it is no longer in scope. This scheme also allows for race-free parallelism because subpieces of the program heap have completely distinct names and no aliasing.
However, unique tree ownership is quite cumbersome on its own, hence
borrows: temporary loaning of a subtree (by reference) as its own
first-class value. To ensure this is safe, the compiler checks that we
don't hold the pointer for too long -- for example, longer than the
lifetime of the original owning path. Rust reifies this concept at the
syntax level with a lifetime and allows naming explicit lifetimes at
the type level when describing borrows. Because ownership ultimately
traces back to RAII and local bindings (perhaps in main()
, but
somewhere up the stack2), these lifetimes correspond to static
spans of code in some active stack frame.
This is all very abstract; the language has precise definitions, of course, but the intuition can be difficult to internalize, especially when lifetimes arise as lifetime parameters. For example, the function signature
defines a function that works in an abstract context where two
lifetimes -- two sets of code spans -- exist either directly in the
caller or further up the stack, and for all times when 'a
is valid,
'b
is too (due to the "outlives" constraint introduced with
:
). This is a useful context to establish -- for example, it may let
one store a reference to something in 'b
in a field something in
'a
-- but it's very hard for one to grasp if one hasn't seen this
concept before.
At the core of the intuitive difficulty is the fact that this is another kind of program entity for the programmer to mentally track. In scope-based resource-management paradigms (such as C++ or Rust RAII) one is already somewhat accustomed to thinking of an object's lifetime, in some scope, conveying ownership. Lifetimes can feel like some other semantic layer that is either redundantly describing that structure, or perhaps coarsening/summarizing it into "categories" that are enough to prove safety to the borrow checker somehow.3
The idea conveyed in Niko Matsakis' post -- noted as already existing under-the-hood in a new borrow checker, but surfaced as syntax and type-system semantics in a new proposal -- is a wonderful simplification: name storage places, which are already roots of the ownership tree, as lifetimes as well. In essence, a borrow temporarily transfers ownership; so why not name the location the ownership came from, to allow checking compatibility of the lifetimes directly?
The blog post gives a simple example where a lifetime parameter is needed today, but use of a place-name instead is more intuitive: functions that return borrows to a piece of an argument. Rather than writing
where the 'a
exists only to tie self
to the return value, we can
instead write
with the same result. The post then expands further into self-referential types, where one element of a struct may borrow from another:
which is an excellent improvement not just in ergonomics, as above,
but actual expressivity as well. A variant of this pattern can occur
when text
is a local binding and we build a local index into it:
then given a local let text: String = ...;
we may later have a type
such as Vec<&'text str>
.
There is one remaining expressivity gap with the new kind of lifetime: it must have a binding in-scope to use it as a lifetime. In other words, one can say
but if whole
is not a parameter in that signature, or equivalently
not a sibling field in a struct with borrow fields, one still needs a
traditional lifetime parameter. As far as I can tell, the proposal
does not propose removing lifetime parameters entirely (nor would one
want to in a backwards-compatible evolution of the language); but
could we close the gap by introducing a little more syntax (thus in
turn creating a simpler and more uniform semantic landscape)?
Replacing Lifetimes Completely: Path Parameters?
Here's the core of my proposal: allow path parameters alongside type parameters and lifetime parameters for any generic type. This generalizes the places-as-lifetimes proposal by allowing introductions of abstract place names, and would look something like:
... which looks almost exactly like a lifetime parameter. So what has changed exactly?
The key bit is that when we use this type elsewhere, it is tied to a specific path (that is, a variable binding) rather than an abstract lifetime. This is an important difference: it binds two values, the borrow and the borrow-ee, together more tightly than a lifetime does.4
For example, where we use S
, we might write (with full type
ascriptions here for clarity):
when the generic struct type is instantiated with the path parameter
i
, we get a borrow of type &'i u32
, just as we saw above. So far
so good.
We gain some nice clarity-of-intent when we use these path parameters in context structs and the like:
This is fairly nice for self-documentation purposes, and is perhaps
more intuitive than a separate concept of lifetimes, but in idiomatic
Rust today we sometimes already see descriptive lifetime names --
'ctx
, 'data
, 'input
, and the like. What actual new expressive
powers -- powers to describe invariants to the compiler and get better
safety checks -- does this grant us?
The main "new power" we obtain is a means of branding, as we alluded
to above: Ctx<data>
is truly tied to data
, not anything with a
compatible lifetime. For memory safety this may not matter as such,
but the ability to tie a handle to a "parent" object is something that
has been discussed
(1,
2)
and seems generally useful as a matter of expressing API
invariants. Certainly whenever implementing an index-based (ECS-like)
system in Rust -- such as for general graphs in a compiler IR -- it
would be nice to express Id<graph>
(where graph
is a specific
Graph
object) rather than just Id
. Especially so when multiple
index spaces exist (instruction indices in two different function
bodies or the like).
So to recap: we've seen how taking paths rather than lifetimes in general parameter lists can give us the same "borrow to an external thing" capability inside an aggregate type, and also lets us bind which external thing a little more tightly. It (kind of) reduces the inventory of cognitive concepts by one, as well -- we no longer have to think about lifetimes, just about local bindings. The genericity over a path makes it clear and explicit what Rust paths have always been -- the lifetime of some object held by some stack frame somewhere else (up the call stack).
Good so far! But some readers might now wonder: how does this actually work with respect to the borrow checker's tracking? The way that traditional lifetime parameters "hold a borrow open" on an external source of data is a little subtle, and depends on the code that constructs a type: for example,
causes data
to be borrowed for as long as S
exists because of the
combination of two constraints: well-formedness of S
means that any
lifetime mentioned in S
(here 'a
) outlives S
, and then that
lifetime 'a
is tied to a borrow of data
that is initially created
by the caller. So, in the caller's context, whatever local path we
needed to borrow to get data
will be blocked (cannot be borrowed
mutably, dropped, written to, or moved out of) while this S<'a>
exists.
How do we get equivalent behavior with path parameters? With the equivalent signature
in order to enforce the same "data
borrowed as long as S
is alive"
property that we had from the signature and well-formedness above, we
need some analogous well-formedness rules for path parameters.
The simple way out would be to say that a path parameter always
borrows the path it mentions. Then we can use it for lifetimes, and we
have the same semantics as above. But wait: what kind of borrow,
immutable or mutable? In the lifetime-based signature above, we knew
this because the caller created the borrow then explicitly saw that
the borrow's lifetime was captured by S
; here S
captures the
path but isn't necessarily tied to the borrow on data
. So we need
something more explicit, somewhere, to denote the long-term borrow.
Here's an even more interesting question: could it ever be meaningful and useful to mention, and bind to, a path, without borrowing it?
Path Parameter Modes
To resolve these questions, we add borrow modes to path parameters:
and then add a rule about well-formed types: the borrow-mode of a path
in the parameter list must subsume its mode in any use. For example,
if used to denote the lifetime of an immutable borrow as above, we
must have path &Data
; it would be a compile-time error to have only
path Data
.5
The meaning of this path parameter mode is exactly as if the S
held
the corresponding borrow for its lifetime. Thus we recover the full
semantics of the struct with the traditional lifetime parameter above,
and likewise we can encode the semantics of a mutable borrow.
This also neatly handles the question of whether paths can overlap or
must be disjoint: e.g., for struct S<path P, path Q>
, are P
and
Q
separate bindings at the instantiation site, so we can have two
fields &'P mut u32
and &'Q mut u32
? If we require path parameter
modes to subsume their uses, then we would need struct S<path &mut P, path &mut Q>
, and then at the instantiation site the disjointness
would be enforced. Immutably-borrowed paths need not be disjoint (and
consequently we cannot assume that when typechecking within their
scope).
Now a new idea arises: if we can have immutably-borrowed and mutably-borrowed modes on path parameters, and these are explicit, what would it mean to have no borrow on a path?
For example, one might declare
and then create
and handle_a
and handle_b
are tied irrevocably to a
and b
. If
a
goes out of scope, then handle_a
's lifetime must have ended by
that point: that works the same as any lifetime constraint. However,
extremely importantly, we do not borrow 'a
while this handle
exists. a
could be passed as a &mut self
to various method calls,
we could do arbitrary things to it, and the handle type allows that;
it is only tied to the continued existence of a
, the binding.6
What good is this, then? For one, we can define methods on the handle
that must take the original container or parent object. So we don't
hold a borrow open for the duration of the handle, but we do take the
borrow just for a particular access or mutation. This is analogous to
the ubiquitous "index into Vec
as reference" pattern in Rust (in
fact, keep reading!) but at the type level:
(Some might call this a "singleton type". Please feel free to bikeshed that syntax!). The idea is that we take a mutable borrow to the path that we irrevocably tied ourselves to -- but we only take that during the method call.
But if the typechecker will only let us pass that path to the method, why require it to be written at all? Hence the next idea, implicit path-constrained parameters:
(Please really bikeshed that syntax: is a separate argument list
really the best?) This means that we call the method with a signature
that takes only self
-- in some sense, its type really is (a subtype
of) Fn(Self)
-- but it has captured the path and will borrow it
when called.
Then we can do something like
Pretty convenient, no? (If you're worried about the dangling-iterator problem -- what if a mutation invalidates a handle? -- read on.)
There are a few bikeshedding points noted above, and possible
extensions such as type constraints on paths (say that I want path Parent
to be a T
specifically) but overall I like this design and
believe it has some potential beyond just ergonomics. Before I get to
that though -- under "deferred borrows" below -- one more detour, to
talk about mutable borrows and invalidation.
Precise Invalidation and Encoding Data Structure Properties
Above, we created "handles" that are tied to a parent object; they
hold no borrow or other form of reservation until use. We irrevocably
tied the handles to the binding a
, but what stops an intervening
call
let old = replace;
from replacing whatever internal state the handles referenced?
Conventional, idiomatic Rust with borrow-based APIs avoids this
scenario by holding the borrow on the container/parent object for the
entire lifetime of the handle. This essentially freezes the container
in place so that whatever piece we're logically referencing with the
handle continues to exist. But by doing so, we give up all the
convenience (and expressive power!) of handles-without-borrows: it is
often useful to collect "fingers" into a data structure, then perform
mutations through them later. For example, we may have several
handle_a
s above (an array of them, or a hashmap, perhaps), and wish
to write to a referenced element through some handle, and may not know
which until we've collected several handles.
But why is that desired usage pattern safe? The only reason is that we
know that our later mutations will mutate elements, but not the
shape of the container itself. In essence, idiomatic Rust with
handles-that-hold-borrows blurs this distinction because that is the
best the type system can do. This is where the indices-as-references
pattern can save us again: if we use indices into a Vec
, we don't
actually borrow until we directly access an element. This works too,
but leads to other sorts of awkwardness. For example, in
regalloc2, which
makes pervasive use of the index-as-reference pattern, I often have to
"re-derive" a true borrow from an index because I perform some other
mutation (to some other disjoint state that I know shouldn't matter)
in the meantime.
In principle we should be able to encode in the type system that some handle relies on the "shape" of the container or parent object remaining the same, but nothing else. For this, we propose an idiom that we call "virtual fields" -- encoded as zero-sized unit fields, perhaps with better syntax later -- that we can hold borrows on, combined with the use of view types (also recapped in Niko's latest post) in path parameter specs and/or method bodies.
(To be absolutely clear, the proposal in this subsection is an idiom, and depends only on the "view types" extension proposed in those posts.)
For example, to sketch what this might look like:
Note that this is quickly encroaching on territory well-trodden by
Pin
and pin-projection and self-referential types; this post is
certainly not the first to ever meditate on the difference between
"immutable shape" and deep immutability in Rust. Interesting
hypothesis: perhaps someday Pin
could be encoded with careful use of
view-types and virtual fields denoting properties. Even more
interesting hypothesis: containers with invariants not currently
described in the type system today -- such as the fact that moving a
String
does not move the pointed-to content, likewise for a Vec
--
could loosen their signatures and encode this as well (with careful
thought toward backward-compatibility). This might be a way to encode
the "self-referential struct" example
where the parts
hold data
's contents borrowed, but not data
itself, so the S
is still movable.
I'll readily admit I've gotten increasingly handwavy as this section continues; but I believe there is something here and with enough careful specification it could form a cohesive, backward-compatible language overlay that adds expressive power to Rust APIs.
New Pattern: Deferred Borrows
Finally we've reached (one) end application of all this language machinery: implementing container types with deferred borrows, as I had proposed in my 2020 paper.
The key idea is to define a kind of auto-deref trait, like
Deref
, but
with a captured path and implicit parameter such that it only
(necessarily) borrows that implicit "context" when converting to a
true borrow. Then we could implement this trait for handles and have
them work as any other smart-pointer type today (such as Rc
or
Box
) except that they would not unnecessarily hold open a borrow on
the parent container; only its "shape" or whatever invariants are
needed to keep elements alive.
In the paper I proposed several variants of each container and a kind
of API-level typestate to encode what handles do hold borrows on,
hence how efficient they can be. For example, a Vec
might become a
FrozenVec
, where the length is fixed; then handles can be true
pointers, and the deferred borrows are as efficient as real
borrows. Or it could be an AppendOnlyVec
, with indices, incurring an
indirection through the storage base pointer (which may change as
growth causes reallocations) on each conversion to a true borrow but
not a dynamic bounds-check. With the above "virtual fields" idea and
view types, I think we could get away without the typestate-like API,
instead handing out different kind of handles (PtrHandle
vs
IndexHandle
vs ...) directly. Perhaps other variants could exist as
well; read the paper for more. (My intent here is not to summarize the
entire paper, but rather to show that its proposals become possible
once one has handles tied to paths and able to implicitly borrow
them.)
Related Approaches
One might reasonably ask: is there a way to "trick" the lifetime system into giving us branded types today, where one value is irrevocably tied to another?
Inherently what one needs is generativity, a kind of type-system feature where each execution or instance or an operation yields a different type.7
I am aware of at least one instance of a "generativity trick" with
Rust lifetimes, described by Gankra's
thesis
in section 6.3, where closures and forced-invariance of lifetimes are
used to create separate lifetimes for separate arrays (in the example)
such that the handle type (indices in the example) can be uniquely
associated with only one array. This is undoubtedly an extremely
impressive hack; nevertheless, better would be for the language to
provide a first-class, readable way for the user to write their intent
(Handle<myvec>
) without workaround.
Conclusion
This blog post has outlined an idea for "path generics", where path parameters live alongside (and perhaps someday mostly replace?) lifetime parameters as a more intuitive means of describing the lifetime origin of some borrowed data. Along with that, we've seen how they add expressive power in "branding" one type to be tied irrevocably to some parent value (path), allowing for typesafe "handle patterns", and how a bit of configurability in their interaction with borrow checking can lead to interesting and useful new APIs that were strictly inexpressible before.
What do I hope to come of all this? As I noted in a footnote1 above, in hindsight I've tried to push ideas in the "deferred borrows family" in sort of halfhearted ways before; mainly, I have a day-job that is "in Rust"8 but is not "the Rust language" and I don't feel I have the energy to push forward a full language proposal on the side or really go much beyond this post. (See above also re: me not being a properly trained PL person.) But I do think there may be something here, to the extent I want to braindump a bit and see what folks think, especially now that there is some explicit thinking and discussion about "places as lifetimes", view types, and other borrow-system extensions.
So: maybe someone else will also see some value here (and see past the undoubtedly rough bikesheddable surface design details) and explore further. Maybe not. Either way the ideas are written down and out there. Feedback welcome!
In hindsight, this paper was very much not the right way to get the idea out. I published this paper just after entering industry from academia, still not quite sure what I wanted to do; I didn't feel I had the energy to engage with Rust and actually drive an idea forward, but I did want to write it up somewhere, so I attempted that rare beast, a single-author paper written in one's free time (with all the corresponding limits on completeness). This blog post is, in one way of seeing it, another attempt at "putting it out there": I almost certainly don't have the bandwidth to spec or implement this but I'm curious what folks think and it feels more relevant now that it is an "incremental generalization" on top of another idea just proposed.
For pedagogical reasons let's ignore global variables and
'static
as well for now. The former is unsafe for a reason,
and the latter is the "boring pathological case" where data is
valid because it literally lives forever. In principle any
interesting lifetime in Rust is tied to a span of code executing
in some stack frame at some level.
In day-to-day use of Rust, explicit lifetimes are fairly rare outside of writing custom containers and perhaps "context structs" that bundle a set of borrows; but they do occur, and sometimes efficiency (non-copying) concerns and layering force nested lifetimes as well. The worst I've personally had to write was a three-level nesting necessary to encode a long-lived data structure, an analysis context over it, and an iteration context within that analysis context. It would have been great to have some better abstraction here! Even I was a bit frustrated, and I am solidly on Team "Borrow Checker Saves Me Time and Improves My Code In The Long Run"!
Path variables used as lifetimes are semantically different,
and in the way described more powerful, but they also do not
fully subsume separate lifetime variables, as far as I can tell:
one cannot have a path that is the "meet" of two different
paths, e.g. a borrow in a struct that comes from either P
or
Q
(both of which live long enough). For this reason one may
still want to use conventional lifetimes, or perhaps path
parameters could be extended with a notion of "union paths"
(TBD!).
There are likely some more details in exactly how path
parameters interact with both well-formedness and subtyping that
I haven't worked out, mostly because I am not a type theorist (I
only play one in blog posts, and then only for the fun
parts). For example, I suspect one would want a kind of "plus
constraint" analogous to T + 'a
for paths too, to say
something like "T
and it's allowed to borrow P
".
One might worry that this is insufficient for a really robust
handle type: what if I take a mutable borrow of a
and do a
std::mem::replace
on it? This proposal's answer is that it is
the responsibility of the API author to define signatures such
that this is not possible; but with "virtual fields" (keep
reading) there is a way to enfoce this.
Note this is a bit subtle: we could mean each operation at a different static program point -- this is the kind of generativity that, say, OCaml generative functor application provides -- or we could mean that each dynamic instance of, say, an object allocation produces a conceptually new type. We want something like the latter for branding: it shouldn't be possible to allocate a list of objects in a loop, put them all in a vector of same-typed values and mix up their handles.)
Cranelift, Wasmtime, and other compilers and runtimes stuff, so very intensively "in Rust", but definitely the language itself is out of scope!