mirror of https://github.com/rust-lang/rfcs.git
542 lines
21 KiB
Markdown
542 lines
21 KiB
Markdown
- Feature Name: dropck_parametricity
|
|
- Start Date: 2015-08-05
|
|
- RFC PR: [rust-lang/rfcs#1238](https://github.com/rust-lang/rfcs/pull/1238)/
|
|
- Rust Issue: [rust-lang/rust#28498](https://github.com/rust-lang/rust/issues/28498)
|
|
|
|
# Summary
|
|
|
|
Revise the Drop Check (`dropck`) part of Rust's static analyses in two
|
|
ways. In the context of this RFC, these revisions are respectively
|
|
named `cannot-assume-parametricity` and `unguarded-escape-hatch`.
|
|
|
|
1. `cannot-assume-parametricity` (CAP): Make `dropck` analysis stop
|
|
relying on parametricity of type-parameters.
|
|
|
|
2. `unguarded-escape-hatch` (UGEH): Add an attribute (with some name
|
|
starting with "unsafe") that a library designer can attach to a
|
|
`drop` implementation that will allow a destructor to side-step
|
|
the `dropck`'s constraints (unsafely).
|
|
|
|
# Motivation
|
|
|
|
## Background: Parametricity in `dropck`
|
|
|
|
The Drop Check rule (`dropck`) for [Sound Generic Drop][] relies on a
|
|
reasoning process that needs to infer that the behavior of a
|
|
polymorphic function (e.g. `fn foo<T>`) does not depend on the
|
|
concrete type instantiations of any of its *unbounded* type parameters
|
|
(e.g. `T` in `fn foo<T>`), at least beyond the behavior of the
|
|
destructor (if any) for those type parameters.
|
|
|
|
[Sound Generic Drop]: https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md
|
|
|
|
This property is a (weakened) form of a property known in academic
|
|
circles as *Parametricity*.
|
|
(See e.g. [Reynolds, IFIP 1983][Rey83], [Wadler, FPCA 1989][Wad89].)
|
|
|
|
* Parametricity, in this context, essentially says that the compiler
|
|
can reason about the body of `foo` (and the subroutines that `foo`
|
|
invokes) without having to think about the particular concrete
|
|
types that the type parameter `T` is instantiated with.
|
|
`foo` cannot do anything with a `t: T` except:
|
|
|
|
1. move `t` to some other owner expecting a `T` or,
|
|
|
|
2. drop `t`, running its destructor and freeing associated resources.
|
|
|
|
* For example, this allows the compiler to deduce that even if `T` is
|
|
instantiated with a concrete type like `&Vec<u32>`, the body of
|
|
`foo` cannot actually read any `u32` data out of the vector. More
|
|
details about this are available on the [Sound Generic Drop][] RFC.
|
|
|
|
## "Mistakes were made"
|
|
|
|
The parametricity-based reasoning in the
|
|
[Drop Check analysis][Sound Generic Drop] (`dropck`) was clever, but
|
|
fragile and unproven.
|
|
|
|
* Regarding its fragility, it has been shown to have
|
|
[bugs][parametricity-insufficient]; in particular, parametricity is
|
|
a necessary but *not* sufficient condition to justify the
|
|
inferences that `dropck` makes.
|
|
|
|
* Regarding its unproven nature, `dropck` violated the heuristic in
|
|
Rust's design to not incorporate ideas unless those ideas had
|
|
already been proven effective elsewhere.
|
|
|
|
[parametricity-insufficient]: https://github.com/rust-lang/rust/issues/26656
|
|
|
|
These issues might alone provide motivation for ratcheting back on
|
|
`dropck`'s rules in the short term, putting in a more conservative
|
|
rule in the stable release channel while allowing experimentation with
|
|
more-aggressive feature-gated rules in the development nightly release
|
|
channel.
|
|
|
|
However, there is also a specific reason why we want to ratchet back
|
|
on the `dropck` analysis as soon as possible.
|
|
|
|
## Impl specialization is inherently non-parametric
|
|
|
|
The parametricity requirement in the Drop Check rule over-restricts
|
|
the design space for future language changes.
|
|
|
|
In particular, the [impl specialization] RFC describes a language
|
|
change that will allow the invocation of a polymorphic function `f` to
|
|
end up in different sequences of code based solely on the concrete
|
|
type of `T`, *even* when `T` has no trait bounds within its
|
|
declaration in `f`.
|
|
|
|
[impl specialization]: https://github.com/rust-lang/rfcs/pull/1210
|
|
|
|
# Detailed design
|
|
|
|
Revise the Drop Check (`dropck`) part of Rust's static analyses in two
|
|
ways. In the context of this RFC, these revisions are respectively
|
|
named `cannot-assume-parametricity` (CAP) and `unguarded-escape-hatch` (UGEH).
|
|
|
|
Though the revisions are given distinct names, they both fall under
|
|
the feature gate `dropck_parametricity`. (Note however that this
|
|
might be irrelevant to CAP; see [CAP stabilization details][]).
|
|
|
|
## cannot-assume-parametricity
|
|
|
|
The heart of CAP is this: make `dropck` analysis stop relying on
|
|
parametricity of type-parameters.
|
|
|
|
### Changes to the Drop-Check Rule
|
|
|
|
The Drop-Check Rule (both in its original form and as revised here)
|
|
dictates when a lifetime `'a` must strictly outlive some value `v`,
|
|
where `v` owns data of type `D`; the rule gave two circumstances where
|
|
`'a` must strictly outlive the scope of `v`.
|
|
|
|
* The first circumstance (`D` is directly instantiated at `'a`)
|
|
remains unchanged by this RFC.
|
|
|
|
* The second circumstance (`D` has some type parameter with
|
|
trait-provided methods, i.e. that could be invoked within `Drop`)
|
|
is broadened by this RFC to simply say "`D` has some type
|
|
parameter."
|
|
|
|
That is, under the changes of this RFC, whether the type parameter has
|
|
a trait-bound is irrelevant to the Drop-Check Rule. The reason is that
|
|
any type parameter, regardless of whether it has a trait bound or not,
|
|
may end up participating in [impl specialization], and thus could
|
|
expose an otherwise invisible reference `&'a AlreadyDroppedData`.
|
|
|
|
`cannot-assume-parametricity` is a breaking change, since the language
|
|
will start assuming that a destructor for a data-type definition such
|
|
as `struct Parametri<C>` may read from data held in its `C` parameter,
|
|
even though the `fn drop` formerly appeared to be parametric with
|
|
respect to `C`. This will cause `rustc` to reject code that it had
|
|
previously accepted (below are some examples that
|
|
[continue to work][examples-continue-to-work] and
|
|
some that [start being rejected][examples-start-reject]).
|
|
|
|
### CAP stabilization details
|
|
[CAP stabilization details]: #cap-stabilization-details
|
|
|
|
`cannot-assume-parametricity` will be incorporated into the beta
|
|
and stable Rust channels, to ensure that destructor code atop
|
|
stable channels in the wild stop relying on parametricity as soon
|
|
as possible. This will enable new language features such as
|
|
[impl specialization].
|
|
|
|
* It is not yet clear whether it is feasible to include a warning
|
|
cycle for CAP.
|
|
|
|
* For now, this RFC is proposing to remove the parts of Drop-Check
|
|
that attempted to prove that the `impl<T> Drop` was parametric with
|
|
respect to `T`. This would mean that there would be more warning
|
|
cycle; `dropck` would simply start rejecting more code.
|
|
There would be no way to opt back into the old `dropck` rules.
|
|
|
|
* (However, during implementation of this change, we should
|
|
double-check whether a warning-cycle is in fact feasible.)
|
|
|
|
## unguarded-escape-hatch
|
|
|
|
The heart of `unguarded-escape-hatch` (UGEH) is this: Provide a new,
|
|
unsafe (and unstable) attribute-based escape hatch for use in the
|
|
standard library for cases where Drop Check is too strict.
|
|
|
|
### Why we need an escape hatch
|
|
|
|
The original motivation for the parametricity special-case in the
|
|
original Drop-Check rule was due to an observation that collection
|
|
types such as `TypedArena<T>` or `Vec<T>` were often used to
|
|
contain values that wanted to refer to each other.
|
|
|
|
An example would be an element type like
|
|
`struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);`, and then
|
|
instantiations of `TypedArena<Concrete>` or `Vec<Concrete>`.
|
|
This pattern has been used within `rustc`, for example,
|
|
to store elements of a linked structure within an arena.
|
|
|
|
Without the parametricity special-case, the existence of a destructor
|
|
on `TypedArena<T>` or `Vec<T>` led the Drop-Check analysis to conclude
|
|
that those destructors might hypothetically read from the references
|
|
held within `T` -- forcing `dropck` to reject those destructors.
|
|
|
|
(Note that `Concrete` itself has no destructor; if it did, then
|
|
`dropck`, both as originally stated and under the changes of this RFC,
|
|
*would* force the `'a` parameter of any instance to strictly outlive
|
|
the instance value, thus ruling out cross-references in the same
|
|
`TypedArena` or `Vec`.)
|
|
|
|
Of course, the whole point of this RFC is that using parametricity as
|
|
the escape hatch seems like it does not suffice. But we still need
|
|
*some* escape hatch.
|
|
|
|
### The new escape hatch: an unsafe attribute
|
|
|
|
This leads us to the second component of the RFC, `unguarded-escape-hatch` (UGEH):
|
|
Add an attribute (with a name starting with "unsafe") that a library
|
|
designer can attach to a `drop` implementation that will allow a
|
|
destructor to side-step the `dropck`'s constraints (unsafely).
|
|
|
|
This RFC proposes the attribute name `unsafe_destructor_blind_to_params`.
|
|
This name was specifically chosen to be long and ugly; see
|
|
[UGEH stabilization details] for further discussion.
|
|
|
|
Much like the `unsafe_destructor` attribute that we had in the past,
|
|
this attribute relies on the programmer to ensure that the destructor
|
|
cannot actually be used unsoundly. It states an (unproven) assumption
|
|
that the given implementation of `drop` (and all functions that this
|
|
`drop` may transitively call) will never read or modify a value of
|
|
any type parameter, apart from the trivial operations of either
|
|
dropping the value or moving the value from one location to another.
|
|
|
|
* (In particular, it certainly must not dereference any `&`-reference
|
|
within such a value, though this RFC is adopts a somewhat stronger
|
|
requirement to encourage the attribute to only be used for the
|
|
limited case of parametric collection types, where one need not do
|
|
anything more than move or drop values.)
|
|
|
|
The above assumption must hold regardless of what impact
|
|
[impl specialization][] has on the resolution of all function calls.
|
|
|
|
### UGEH stabilization details
|
|
[UGEH stabilization details]: #ugeh-stabilization-details
|
|
|
|
The proposed attribute is only a *short-term* patch to work-around a
|
|
bug exposed by the combination of two desirable features (namely
|
|
[impl specialization] and [`dropck`][Sound Generic Drop]).
|
|
|
|
In particular, using the attribute in cases where control-flow in the
|
|
destructor can reach functions that may be specialized on a
|
|
type-parameter `T` may expose the system to use-after-free scenarios
|
|
or other unsound conditions. This may a non-trivial thing for the
|
|
programmer to prove.
|
|
|
|
* Short term strategy: The working assumption of this RFC is that the
|
|
standard library developers will use the proposed attribute in
|
|
cases where the destructor *is* parametric with respect to all type
|
|
parameters, even though the compiler cannot currently prove this to
|
|
be the case.
|
|
|
|
The new attribute will be restricted to non-stable channels, like
|
|
any other new feature under a feature-gate.
|
|
|
|
* Long term strategy: This RFC does not make any formal guarantees
|
|
about the long-term strategy for including an escape hatch. In
|
|
particular, this RFC does *not* propose that we stabilize the
|
|
proposed attribute
|
|
|
|
It may be possible for future language changes to allow us to
|
|
directly express the necessary parametricity properties.
|
|
See further discussion in the [continue supporting parametricity][] alternative.
|
|
|
|
The suggested attribute name (`unsafe_destructor_blind_to_params`
|
|
above) was deliberately selected to be long and ugly, in order to
|
|
discourage it from being stabilized in the future without at least
|
|
some significant discussion. (Likewise, the acronym "UGEH" was
|
|
chosen for its likely pronunciation "ugh", again a reminder that
|
|
we do not *want* to adopt this approach for the long term.)
|
|
|
|
|
|
## Examples of code changes under the RFC
|
|
|
|
This section shows some code examples, starting with code that works
|
|
today and must continue to work tomorrow, then showing an example of
|
|
code that will start being rejected, and ending with an example of the
|
|
UGEH attribute.
|
|
|
|
### Examples of code that must continue to work
|
|
[examples-continue-to-work]: #examples-of-code-that-must-continue-to-work
|
|
|
|
Here is some code that works today and must continue to work in the future:
|
|
|
|
```rust
|
|
use std::cell::Cell;
|
|
|
|
struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);
|
|
|
|
fn main() {
|
|
let mut data = Vec::new();
|
|
data.push(Concrete(0, Cell::new(None)));
|
|
data.push(Concrete(0, Cell::new(None)));
|
|
|
|
data[0].1.set(Some(&data[1]));
|
|
data[1].1.set(Some(&data[0]));
|
|
}
|
|
```
|
|
|
|
In the above, we are building up a vector, pushing `Concrete` elements
|
|
onto it, and then later linking those concrete elements together via
|
|
optional references held in a cell in each concrete element.
|
|
|
|
We can even wrap the vector in a struct that holds it. This also must
|
|
continue to work (and will do so under this RFC); such structural
|
|
composition is a common idiom in Rust code.
|
|
|
|
```rust
|
|
use std::cell::Cell;
|
|
|
|
struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);
|
|
|
|
struct Foo<T> { data: Vec<T> }
|
|
|
|
fn main() {
|
|
let mut foo = Foo { data: Vec::new() };
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
|
|
foo.data[0].1.set(Some(&foo.data[1]));
|
|
foo.data[1].1.set(Some(&foo.data[0]));
|
|
}
|
|
```
|
|
|
|
### Examples of code that will start to be rejected
|
|
[examples-start-reject]: #examples-of-code-that-will-start-to-be-rejected
|
|
|
|
The main change injected by this RFC is this: due to `cannot-assume-parametricity`,
|
|
an attempt to add a destructor to the `struct Foo` above will cause the
|
|
code above to be rejected, because we will assume that the destructor for `Foo`
|
|
may invoke methods on the concrete elements that dereferences their links.
|
|
|
|
Thus, this code will be rejected:
|
|
|
|
```rust
|
|
use std::cell::Cell;
|
|
|
|
struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);
|
|
|
|
struct Foo<T> { data: Vec<T> }
|
|
|
|
// This is the new `impl Drop`
|
|
impl<T> Drop for Foo<T> {
|
|
fn drop(&mut self) { }
|
|
}
|
|
|
|
fn main() {
|
|
let mut foo = Foo { data: Vec::new() };
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
|
|
foo.data[0].1.set(Some(&foo.data[1]));
|
|
foo.data[1].1.set(Some(&foo.data[0]));
|
|
}
|
|
```
|
|
|
|
NOTE: Based on a preliminary crater run, it seems that mixing together
|
|
destructors with this sort of cyclic structure is sufficiently rare
|
|
that *no* crates on `crates.io` actually regressed under the new rule:
|
|
everything that compiled before the change continued to compile after
|
|
it.
|
|
|
|
### Example of the unguarded-escape-hatch
|
|
[examples-escape-hatch]: #example-of-the-unguarded-escape-hatch
|
|
|
|
If the developer of `Foo` has access to the feature-gated
|
|
escape-hatch, and is willing to assert that the destructor for `Foo`
|
|
does nothing with the links in the data, then the developer can work
|
|
around the above rejection of the code by adding the corresponding
|
|
attribute.
|
|
|
|
```rust
|
|
#![feature(dropck_parametricity)]
|
|
use std::cell::Cell;
|
|
|
|
struct Concrete<'a>(u32, Cell<Option<&'a Concrete<'a>>>);
|
|
|
|
struct Foo<T> { data: Vec<T> }
|
|
|
|
impl<T> Drop for Foo<T> {
|
|
#[unsafe_destructor_blind_to_params] // This is the UGEH attribute
|
|
fn drop(&mut self) { }
|
|
}
|
|
|
|
fn main() {
|
|
let mut foo = Foo { data: Vec::new() };
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
foo.data.push(Concrete(0, Cell::new(None)));
|
|
|
|
foo.data[0].1.set(Some(&foo.data[1]));
|
|
foo.data[1].1.set(Some(&foo.data[0]));
|
|
}
|
|
```
|
|
|
|
# Drawbacks
|
|
|
|
As should be clear by the tone of this RFC, the
|
|
`unguarded-escape-hatch` is clearly a hack. It is subtle and unsafe,
|
|
just as `unsafe_destructor` was (and for the most part, the whole
|
|
point of [Sound Generic Drop][] was to remove `unsafe_destructor` from
|
|
the language).
|
|
|
|
* However, the expectation is that most clients will have no need to
|
|
ever use the `unguarded-escape-hatch`.
|
|
|
|
* It may suffice to use the escape hatch solely within the collection
|
|
types of `libstd`.
|
|
|
|
* Otherwise, if clients outside of `libstd` determine that they *do*
|
|
need to be able to write destructors that need to bypass `dropck`
|
|
safely, then we can (and *should*) investigate one of the
|
|
[sound alternatives][continue supporting parametricity], rather
|
|
than stabilize the unsafe hackish escape hatch..
|
|
|
|
# Alternatives
|
|
[alternatives]: #alternatives
|
|
|
|
## CAP without UGEH
|
|
|
|
One might consider adopting `cannot-assume-parametricity` without
|
|
`unguarded-escape-hatch`. However, unless some other sort of escape
|
|
hatch were added, this path would break much more code.
|
|
|
|
## UGEH for lifetime parameters
|
|
|
|
Since we're already being unsafe here, one might consider having
|
|
the `unsafe_destructor_blind_to_params` apply to lifetime parameters
|
|
as well as type parameters.
|
|
|
|
However, given that the `unsafe_destructor_blind_to_params` attribute
|
|
is only intended as a short-term band-aid (see
|
|
[UGEH stabilization details][]) it seems better to just make it only as
|
|
broad as it needs to be (and no broader).
|
|
|
|
## "Sort-of Guarded" Escape Hatch
|
|
|
|
We could add the escape hatch but continue employing the current
|
|
dropck analysis to it. This would essentially mean that code would have
|
|
to apply the unsafe attribute to be considered for parametricity, but
|
|
if there were obvious problems (namely, if the type parameter had a trait bound)
|
|
then the attempt to opt into parametricity would be ignored and the
|
|
strict ordering restrictions on the lifetimes would be imposed.
|
|
|
|
I only mention this because it occurred to me in passing; I do not
|
|
really think it has much of a benefit. It would potentially lead
|
|
someone to think that their code has been proven sound (since the
|
|
`dropck` would catch some mistakes in programmer reasoning) but the
|
|
pitfalls with respect to specialization would remain.
|
|
|
|
## Continue Supporting Parametricity
|
|
[continue supporting parametricity]: #continue-supporting-parametricity
|
|
There may be ways to revise the language so that functions can declare
|
|
that they must be parametric with respect to their type parameters.
|
|
Here we sketch two potential ideas for how one might do this, mostly to
|
|
give a hint of why this is not a trivial change to the language.
|
|
|
|
Neither design is likely to be adopted, at least as described here,
|
|
because both of them impose significant burdens on implementors of
|
|
parametric destructors, as we will see.
|
|
|
|
(Also, if we go down this path, we will need to fix other bugs in the
|
|
Drop Check rule, where, as previously noted, parametricity is a
|
|
[necessary but *insufficient* condition][parametricity-insufficient] for soundness.)
|
|
|
|
### Parametricity via effect-system attributes
|
|
|
|
One feature of the [impl specialization] RFC is that all functions that
|
|
can be specialized must be declared as such, via the `default` keyword.
|
|
|
|
This leads us to one way that a function could declare that its body
|
|
must not be allows to call into specialized methods: an attribute like
|
|
`#[unspecialized]`. The `#[unspecialized]` attribute, when applied to
|
|
a function `fn foo()`, would mean two things:
|
|
|
|
* `foo` is not allowed to call any functions that have the `default` keyword.
|
|
|
|
* `foo` is only allowed to call functions that are also marked `#[unspecialized]`
|
|
|
|
All `fn drop` methods would be required to be `#[unspecialized]`.
|
|
|
|
It is the second bullet that makes this an ad-hoc effect system: it provides
|
|
a recursive property ensuring that during the extent of the call to `foo`,
|
|
we will never invoke a function marked as `default` (and therefore, I *think*,
|
|
will never even potentially invoke a method that has been specialized).
|
|
|
|
It is also this second bullet that represents a significant burden on
|
|
the destructor implementor. In particular, it immediately rules out
|
|
using any library routine unless that routine has been marked as
|
|
`#[unspecialized]`. The attribute is unlikely to be included on any
|
|
function unless the its developer is making a destructor that calls it
|
|
in tandem.
|
|
|
|
### Parametricity via some `?`-bound
|
|
|
|
Another approach starts from another angle: As described earlier,
|
|
parametricity in `dropck` is the requirement that `fn drop` cannot do
|
|
anything with a `t: T` (where `T` is some relevant type parameter)
|
|
except:
|
|
|
|
1. move `t` to some other owner expecting a `T` or,
|
|
|
|
2. drop `t`, running its destructor and freeing associated resources.
|
|
|
|
So, perhaps it would be more natural to express this requirement
|
|
via a bound.
|
|
|
|
We would start with the assumption that functions may be
|
|
non-parametric (and thus their implementations may be specialized to
|
|
specific types).
|
|
|
|
But then if you want to declare a function as having a stronger
|
|
constraint on its behavior (and thus expanding its potential callers
|
|
to ones that require parametricity), you could add a bound `T: ?Special`.
|
|
|
|
The Drop-check rule would treat `T: ?Special` type-parameters as parametric,
|
|
and other type-parameters as non-parametric.
|
|
|
|
The marker trait `Special` would be an OIBIT that all sized types would get.
|
|
|
|
Any expression in the context of a type-parameter binding of the form
|
|
`<T: ?Special>` would not be allowed to call any `default` method
|
|
where `T` could affect the specialization process.
|
|
|
|
(The careful reader will probably notice the potential sleight-of-hand
|
|
here: is this really any different from the effect-system attributes
|
|
proposed earlier? Perhaps not, though it seems likely that the finer
|
|
grain parameter-specific treatment proposed here is more expressive,
|
|
at least in theory.)
|
|
|
|
Like the previous proposal, this design represents a significant
|
|
burden on the destructor implementor: Again, the `T: ?Special`
|
|
attribute is unlikely to be included on any function unless the its
|
|
developer is making a destructor that calls it in tandem.
|
|
|
|
# Unresolved questions
|
|
|
|
* What name to use for the attribute?
|
|
Is `unsafe_destructor_blind_to_params` sufficiently long and ugly? ;)
|
|
|
|
* What is the real long-term plan?
|
|
|
|
* Should we consider merging the discussion of alternatives
|
|
into the [impl specialization] RFC?
|
|
|
|
# Bibliography
|
|
|
|
### Reynolds
|
|
[Rey83]: #reynolds
|
|
John C. Reynolds. "Types, abstraction and parametric polymorphism". IFIP 1983
|
|
http://www.cse.chalmers.se/edu/year/2010/course/DAT140_Types/Reynolds_typesabpara.pdf
|
|
|
|
### Wadler
|
|
[Wad89]: #wadler
|
|
Philip Wadler. "Theorems for free!". FPCA 1989
|
|
http://ttic.uchicago.edu/~dreyer/course/papers/wadler.pdf
|
|
|