8.9 KiB
- Start Date: 2014-09-11
- RFC PR #: rust-lang/rfcs#198
- Rust Issue #: rust-lang/rust#17177
Summary
This RFC adds overloaded slice notation:
foo[]
forfoo.as_slice()
foo[n..m]
forfoo.slice(n, m)
foo[n..]
forfoo.slice_from(n)
foo[..m]
forfoo.slice_to(m)
mut
variants of all the above
via two new traits, Slice
and SliceMut
.
It also changes the notation for range match
patterns to ...
, to
signify that they are inclusive whereas ..
in slices are exclusive.
Motivation
There are two primary motivations for introducing this feature.
Ergonomics
Slicing operations, especially as_slice
, are a very common and basic thing to
do with vectors, and potentially many other kinds of containers. We already
have notation for indexing via the Index
trait, and this RFC is essentially a
continuation of that effort.
The as_slice
operator is particularly important. Since we've moved away from
auto-slicing in coercions, explicit as_slice
calls have become extremely
common, and are one of the
leading ergonomic/first impression
problems with the language. There are a few other approaches to address this
particular problem, but these alternatives have downsides that are discussed
below (see "Alternatives").
Error handling conventions
We are gradually moving toward a Python-like world where notation like foo[n]
calls fail!
when n
is out of bounds, while corresponding methods like get
return Option
values rather than failing. By providing similar notation for
slicing, we open the door to following the same convention throughout
vector-like APIs.
Detailed design
The design is a straightforward continuation of the Index
trait design. We
introduce two new traits, for immutable and mutable slicing:
trait Slice<Idx, S> {
fn as_slice<'a>(&'a self) -> &'a S;
fn slice_from(&'a self, from: Idx) -> &'a S;
fn slice_to(&'a self, to: Idx) -> &'a S;
fn slice(&'a self, from: Idx, to: Idx) -> &'a S;
}
trait SliceMut<Idx, S> {
fn as_mut_slice<'a>(&'a mut self) -> &'a mut S;
fn slice_from_mut(&'a mut self, from: Idx) -> &'a mut S;
fn slice_to_mut(&'a mut self, to: Idx) -> &'a mut S;
fn slice_mut(&'a mut self, from: Idx, to: Idx) -> &'a mut S;
}
(Note, the mutable names here are part of likely changes to naming conventions that will be described in a separate RFC).
These traits will be used when interpreting the following notation:
Immutable slicing
foo[]
forfoo.as_slice()
foo[n..m]
forfoo.slice(n, m)
foo[n..]
forfoo.slice_from(n)
foo[..m]
forfoo.slice_to(m)
Mutable slicing
foo[mut]
forfoo.as_mut_slice()
foo[mut n..m]
forfoo.slice_mut(n, m)
foo[mut n..]
forfoo.slice_from_mut(n)
foo[mut ..m]
forfoo.slice_to_mut(m)
Like Index
, uses of this notation will auto-deref just as if they were method
invocations. So if T
implements Slice<uint, [U]>
, and s: Smaht<T>
, then
s[]
compiles and has type &[U]
.
Note that slicing is "exclusive" (so [n..m]
is the interval n <= x < m
), while ..
in match
patterns is "inclusive". To avoid
confusion, we propose to change the match
notation to ...
to
reflect the distinction. The reason to change the notation, rather
than the interpretation, is that the exclusive (respectively
inclusive) interpretation is the right default for slicing
(respectively matching).
Rationale for the notation
The choice of square brackets for slicing is straightforward: it matches our indexing notation, and slicing and indexing are closely related.
Some other languages (like Python and Go -- and Fortran) use :
rather than
..
in slice notation. The choice of ..
here is influenced by its use
elsewhere in Rust, for example for fixed-length array types [T, ..n]
. The ..
for slicing has precedent in Perl and D.
See Wikipedia for more on the history of slice notation in programming languages.
The mut
qualifier
It may be surprising that mut
is used as a qualifier in the proposed
slice notation, but not for the indexing notation. The reason is that
indexing includes an implicit dereference. If v: Vec<Foo>
then
v[n]
has type Foo
, not &Foo
or &mut Foo
. So if you want to get
a mutable reference via indexing, you write &mut v[n]
. More
generally, this allows us to do resolution/typechecking prior to
resolving the mutability.
This treatment of Index
matches the C tradition, and allows us to
write things like v[0] = foo
instead of *v[0] = foo
.
On the other hand, this approach is problematic for slicing, since in general it would yield an unsized type (under DST) -- and of course, slicing is meant to give you a fat pointer indicating the size of the slice, which we don't want to immediately deref. But the consequence is that we need to know the mutability of the slice up front, when we take it, since it determines the type of the expression.
Drawbacks
The main drawback is the increase in complexity of the language syntax. This
seems minor, especially since the notation here is essentially "finishing" what
was started with the Index
trait.
Limitations in the design
Like the Index
trait, this forces the result to be a reference via
&
, which may rule out some generalizations of slicing.
One way of solving this problem is for the slice methods to take
self
(by value) rather than &self
, and in turn to implement the
trait on &T
rather than T
. Whether this approach is viable in the
long run will depend on the final rules for method resolution and
auto-ref.
In general, the trait system works best when traits can be applied to
types T
rather than borrowed types &T
. Ultimately, if Rust gains
higher-kinded types (HKT), we could change the slice type S
in the
trait to be higher-kinded, so that it is a family of types indexed
by lifetime. Then we could replace the &'a S
in the return value
with S<'a>
. It should be possible to transition from the current
Index
and Slice
trait designs to an HKT version in the future
without breaking backwards compatibility by using blanket
implementations of the new traits (say, IndexHKT
) for types that
implement the old ones.
Alternatives
For improving the ergonomics of as_slice
, there are two main alternatives.
Coercions: auto-slicing
One possibility would be re-introducing some kind of coercion that automatically
slices.
We used to have a coercion from (in today's terms) Vec<T>
to
&[T]
. Since we no longer coerce owned to borrowed values, we'd probably want a
coercion &Vec<T>
to &[T]
now:
fn use_slice(t: &[u8]) { ... }
let v = vec!(0u8, 1, 2);
use_slice(&v) // automatically coerce here
use_slice(v.as_slice()) // equivalent
Unfortunately, adding such a coercion requires choosing between the following:
-
Tie the coercion to
Vec
andString
. This would reintroduce special treatment of these otherwise purely library types, and would mean that other library types that support slicing would not benefit (defeating some of the purpose of DST). -
Make the coercion extensible, via a trait. This is opening pandora's box, however: the mechanism could likely be (ab)used to run arbitrary code during coercion, so that any invocation
foo(a, b, c)
might involve running code to pre-process each of the arguments. While we may eventually want such user-extensible coercions, it is a big step to take with a lot of potential downside when reasoning about code, so we should pursue more conservative solutions first.
Deref
Another possibility would be to make String
implement Deref<str>
and
Vec<T>
implement Deref<[T]>
, once DST lands. Doing so would allow explicit
coercions like:
fn use_slice(t: &[u8]) { ... }
let v = vec!(0u8, 1, 2);
use_slice(&*v) // take advantage of deref
use_slice(v.as_slice()) // equivalent
There are at least two downsides to doing so, however:
-
It is not clear how the method resolution rules will ultimately interact with
Deref
. In particular, a leading proposal is that for a smart pointers: Smaht<T>
when you invokes.m(...)
only inherent methodsm
are considered forSmaht<T>
; trait methods are only considered for the maximally-derefed value*s
.With such a resolution strategy, implementing
Deref
forVec
would make it impossible to use trait methods on theVec
type except through UFCS, severely limiting the ability of programmers to usefully implement new traits forVec
. -
The idea of
Vec
as a smart pointer around a slice, and the use of&*v
as above, is somewhat counterintuitive, especially for such a basic type.
Ultimately, notation for slicing seems desirable on its own merits anyway, and
if it can eliminate the need to implement Deref
for Vec
and String
, all
the better.