rfcs/text/3107-derive-default-enum.md

Summary

An attribute #[default], usable on enum unit variants, is introduced, thereby allowing some enums to work with #[derive(Default)].

#[derive(Default)]
enum Padding {
    Space,
    Zero,
    #[default]
    None,
}

// `matches!` is used because `Padding` derives neither `Debug` nor `PartialEq`.
assert!(matches!(Padding::default(), Padding::None));

The #[default] and #[non_exhaustive] attributes may not be used on the same variant.

Motivation

#[derive(Default)] in more cases

Currently, #[derive(Default)] is not usable on enums. To partially rectify this situation, a #[default] attribute is introduced that can be attached to unit variants. This allows you to use #[derive(Default)] on enums, so you can now write:

#[derive(Default)]
enum Padding {
    Space,
    Zero,
    #[default]
    None,
}

Guide-level explanation

The ability to add default values to fields of enum variants does not mean that you can suddenly #[derive(Default)] on the enum. A Rust compiler will still have no idea which variant you intended as the default. This RFC adds the ability to mark one unit variant with #[default]:

#[derive(Default)]
enum Ingredient {
    Tomato,
    Onion,
    #[default]
    Lettuce,
}

Now the compiler knows that Ingredient::Lettuce should be considered the default and will accordingly generate an appropriate implementation:

impl Default for Ingredient {
    fn default() -> Self {
        Ingredient::Lettuce
    }
}
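
As a minimal sketch of what the generated implementation enables, the enum's default composes with other derived defaults and with standard library helpers. The Sandwich struct and its field names are hypothetical and introduced only for illustration.

```rust
#[allow(dead_code)]
#[derive(Debug, Default, PartialEq)]
enum Ingredient {
    Tomato,
    Onion,
    #[default]
    Lettuce,
}

// Hypothetical struct (not part of the RFC): its derived `Default`
// calls `Ingredient::default()` for the `filling` field.
#[derive(Debug, Default, PartialEq)]
struct Sandwich {
    filling: Ingredient,
    slices: u8,
}

// Falls back to the variant marked `#[default]` when no choice was made.
fn chosen_or_default(choice: Option<Ingredient>) -> Ingredient {
    choice.unwrap_or_default()
}
```

Here Sandwich::default() yields a sandwich with Ingredient::Lettuce and zero slices, without any hand-written Default implementation.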

Note that after any cfg-stripping has occurred, it is an error to have #[default] specified on zero or multiple variants.
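
The cfg-stripping rule can be sketched as follows. The LogLevel enum is hypothetical. Two variants carry #[default] in the source, but they are gated on mutually exclusive cfg predicates, so exactly one remains once stripping has occurred.

```rust
#[allow(dead_code)]
#[derive(Debug, Default)]
enum LogLevel {
    // Present only in builds with debug assertions enabled.
    #[cfg(debug_assertions)]
    #[default]
    Verbose,
    // Present only in builds with debug assertions disabled.
    #[cfg(not(debug_assertions))]
    #[default]
    Quiet,
    Silent,
}

// Returns the name of whichever variant survived cfg-stripping
// and is therefore the default.
fn default_level_name() -> String {
    format!("{:?}", LogLevel::default())
}
```

Had both predicates been able to hold at once, or neither, the derive would be rejected under the rule above.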

As fields may be added to #[non_exhaustive] variants that necessitate additional bounds, it is not permitted to place #[default] and #[non_exhaustive] on the same variant.

Reference-level explanation

#[default] on enums

An attribute #[default] is provided by the compiler and may legally be placed only on a single exhaustive enum unit variant. The attribute has no semantics on its own. Placing the attribute on anything else will result in a compilation error. Furthermore, if the attribute occurs on zero or multiple variants of the same enum data-type after cfg-stripping and macro expansion are done, this will also result in a compilation error.

#[derive(Default)]

Placing #[derive(Default)] on an enum named $e is permissible if and only if that enum has some variant $v with #[default] on it. In that event, the compiler shall generate an implementation of Default where the function default is defined as:

impl ::core::default::Default for $e {
    fn default() -> Self {
        $e::$v
    }
}

Generated bounds

As exhaustive unit variants have no inner types, no bounds shall be generated on the derived implementation. For example,

#[derive(Default)]
enum Option<T> {
    #[default]
    None,
    Some(T),
}

would generate:

impl<T> Default for Option<T> {
    fn default() -> Self {
        Option::None
    }
}
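
One observable consequence, sketched here under the assumption that the toolchain implements the no-bounds rule as specified: the derived implementation is usable even when the type parameter does not implement Default. The names Slot, NoDefault and empty_slot are hypothetical.

```rust
#[allow(dead_code)]
#[derive(Default)]
enum Slot<T> {
    #[default]
    Empty,
    Filled(T),
}

// Deliberately does NOT implement `Default`.
struct NoDefault;

// Compiles only because the derived impl carries no `T: Default` bound.
fn empty_slot() -> Slot<NoDefault> {
    Slot::default()
}
```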

Interaction with #[non_exhaustive]

The Rust compiler shall not permit #[default] and #[non_exhaustive] to be present on the same variant. Non-default variants may be #[non_exhaustive], as can the enum itself.
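
A sketch of the permitted combinations, using a hypothetical Mode enum: the enum itself and a non-default variant are marked #[non_exhaustive], while the default variant stays exhaustive.

```rust
#[allow(dead_code)]
#[derive(Debug, Default, PartialEq)]
#[non_exhaustive] // allowed: the enum itself may be non-exhaustive
enum Mode {
    #[default] // the default variant must remain exhaustive
    Auto,
    #[non_exhaustive] // allowed: this variant is not the default
    Manual { level: u8 },
}
```

Moving #[non_exhaustive] onto Auto would be rejected under the rule above.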

Drawbacks

The usual drawback of increasing the complexity of the language applies. However, the degree to which complexity is increased is not substantial. One notable change is the addition of an attribute for a built-in #[derive], which has no precedent.

Rationale

The inability to derive Default on enums has been noted on a number of occasions, with a common suggestion being to add a #[default] attribute (or similar) as this RFC proposes.

In the interest of forwards compatibility, this RFC is limited to only exhaustive unit variants. Were this not the case, adding a field to a #[non_exhaustive] variant could lead to more stringent bounds being generated, which is a breaking change. For example, a definition of

#[derive(Default)]
enum Foo<T> {
    #[default]
    #[non_exhaustive]
    Alpha,
    Beta(T),
}

would not have any required bounds on the generated code. If this were changed to

#[derive(Default)]
enum Foo<T> {
    #[default]
    #[non_exhaustive]
    Alpha(T),
    Beta(T),
}

then any code where T: !Default would now fail to compile, on the assumption that the generated code for the latter has the T: Default bound (nb: not part of this RFC).
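
To make the breakage concrete, here is a hand-written stand-in for what such a derive might generate. This is an assumption for illustration only, since bounds for non-unit variants are not part of this RFC.

```rust
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Foo<T> {
    Alpha(T),
    Beta(T),
}

// Hand-written, hypothetical expansion: constructing `Alpha` needs a `T`,
// so the impl must require `T: Default`.
impl<T: Default> Default for Foo<T> {
    fn default() -> Self {
        Foo::Alpha(T::default())
    }
}

// `Foo::<u8>::default()` works because `u8: Default`.
// For a type without a `Default` impl, `Foo::<ThatType>::default()`
// would no longer compile, although it did while `Alpha` was a unit variant.
fn default_foo() -> Foo<u8> {
    Foo::default()
}
```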

Alternatives

One alternative is to permit the user to declare the default variant in the derive itself, such as #[derive(Default(VariantName))]. This has the disadvantage that the variant name is present in multiple locations in the declaration, increasing the likelihood of a typo (and thus an error).

Another alternative is assigning the first variant to be default when #[derive(Default)] is present. This may prevent a #[derive(PartialOrd)] on some enums where order is important (unless the user were to explicitly assign the discriminant).
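
A short sketch of why explicit marking matters for ordered enums: with #[default], the default variant need not come first, so the declaration order (which #[derive(PartialOrd)] uses for comparisons) stays free. The Priority enum is hypothetical.

```rust
#[derive(Debug, Default, PartialEq, PartialOrd)]
enum Priority {
    Low,
    #[default]
    Normal, // default, yet not the first variant
    High,
}

// The derived ordering follows declaration order: Low < Normal < High.
fn is_at_least_default(p: &Priority) -> bool {
    *p >= Priority::default()
}
```

Under the "first variant is the default" alternative, Normal would have to be declared first, which would also make it compare as the smallest value.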

Prior art

Procedural macros

There are a number of crates which to varying degrees afford macros for default field values and associated facilities.

#[derive(Derivative)]

The crate derivative provides the #[derivative(Default)] attribute. With it, you may write:

#[derive(Derivative)]
#[derivative(Default)]
enum Foo {
    #[derivative(Default)]
    Bar,
    Baz,
}

Contrast this with the equivalent in the style of this RFC:

#[derive(Default)]
enum Foo {
    #[default]
    Bar,
    Baz,
}

Like this RFC, derivative allows you to derive Default for enums. The syntax used in the macro is #[derivative(Default)], whereas this RFC provides the more ergonomic and direct notation #[default].

#[derive(SmartDefault)]

The smart-default crate provides the #[derive(SmartDefault)] custom derive macro. It functions similarly to derivative but is specialized for the Default trait. With it, you can write:

#[derive(SmartDefault)]
enum Foo {
    #[default]
    Bar,
    Baz,
}

  • The same syntax #[default] is used both by smart-default and by this RFC. While it may seem that this RFC was inspired by smart-default, this is not the case. Rather, this notation has been independently thought of on multiple occasions. That suggests that the notation is intuitive and a solid design choice.

  • There is no trait SmartDefault even though it is being derived. This works because #[proc_macro_derive(SmartDefault)] is in fact not tied to any trait. That #[derive(Serialize)] refers to a trait with the same name as the macro is, from the perspective of the language's static semantics, entirely coincidental.

    However, for users who aren't aware of this, it may seem strange that SmartDefault should derive an implementation of the Default trait.

Unresolved questions

  • None so far.

Future possibilities

Non-unit variants

One significant future possibility is to have #[default] permitted on non-unit variants. This was originally proposed as part of this RFC but has been postponed due to disagreement over what the generated bounds should be. This is largely due to the fact that #[derive(Default)] on structs may generate incorrect bounds.

Overriding default fields

The #[default] attribute could be extended to override otherwise derived default values, such as

#[derive(Default)]
struct Foo {
    alpha: u8,
    #[default = 1]
    beta: u8,
}

which would result in

impl Default for Foo {
    fn default() -> Self {
        Foo {
            alpha: Default::default(),
            beta: 1,
        }
    }
}

being generated.

Alternatively, dedicated syntax could be provided as proposed by @Centril:

#[derive(Default)]
struct Foo {
    alpha: u8,
    beta: u8 = 1,
}

If consensus can be reached on desired bounds, there should be no technical restrictions on permitting the #[default] attribute on a #[non_exhaustive] variant.

Clearer documentation and more local reasoning

Providing good defaults when such exist is part of any good design that makes a physical tool, UI design, or even data-type more ergonomic and easily usable. However, that does not mean that the defaults provided can just be ignored and that they need not be understood. This is especially the case when you are moving away from said defaults and need to understand what they were. Furthermore, it is not too uncommon to see authors writing in the documentation of a data-type that a certain value is the default.

All in all, the defaults of a data-type are therefore important properties. By encoding the defaults right where the data-type is defined, gains can be made in terms of readability, particularly with regard to the ease of skimming through code. In particular, it is easier to see what the default variant is if you can look directly at the rustdoc page and read the definition with its #[default] marker, rather than having to open up the code of the Default implementation.

Error trait and more

As this is the first built-in derive macro that includes an attribute, this may open the floodgates with regard to permitting additional macros with attributes. Crates such as thiserror could be, in some form or another, upstreamed to the standard library as #[derive(Error)], #[derive(Display)] or more.