Summary
Introduce a new, extensible interface to pick multiple enum representations for the same type. In a nutshell, we would introduce a new --enum-style CLI argument, with an equivalent Builder method, which would supersede the existing --constified-enum-module, --bitfield-enum, --newtype-enum, --newtype-global-enum, --rustified-enum and --rustified-non-exhaustive-enum CLI arguments and their respective Builder method counterparts.
This interface would allow users to pick multiple enum representations for a single C enum and to enable different features for each representation.
Motivation
The main motivation is the lack of a "silver bullet" representation for C enums in Rust.
Currently we have two upcoming enum representations (#2908 and #2980) that interact with the existing representations in non-trivial ways. In particular, #2908 introduces extensions to the existing --rustified-enum representation. These extensions generate safe and unsafe conversions between the C enum values and the "rustified" enum values to avoid unsoundness issues.
At the same time, the existing interface has become increasingly bloated, as each new representation requires a new CLI flag and Builder method, even when it is essentially an old representation with one extra feature. An example of this is the pair --newtype-enum and --newtype-global-enum, where the only difference is the namespacing of the constants for each variant.
Guide-level explanation
Bindgen can map C/C++ enums into Rust in different ways. The mapping can be customized using the Builder::enum_style method, which receives a sequence of EnumRepresentation values (the existing EnumVariation under a new name) and a regex pattern:
    impl Builder {
        /// Apply the provided representations to the C enums whose name matches
        /// the provided regex pattern.
        pub fn enum_style<I, P>(
            mut self,
            representations: I,
            pattern: P,
        ) -> Self
        where
            I: IntoIterator<Item = EnumRepresentation>,
            P: AsRef<str>;
    }

    /// This is just `EnumVariation` with a new name for clarity.
    pub enum EnumRepresentation {
        /// Represent a C enum using a Rust enum.
        Rust {
            /// Indicates whether the Rust enum should be `#[non_exhaustive]`.
            non_exhaustive: bool,
        },
        /// Represent a C enum using a newtype over the enum's ctype.
        NewType {
            /// Indicates whether the newtype will have bitwise operators.
            bitfield: bool,
            /// Indicates if the variants will be represented as global
            /// constants instead of being inside an `impl` block of the newtype.
            global: bool,
        },
        /// Represent a C enum using a ctype constant for each variant.
        Const {
            /// The generated constants will be inside a module with the same name
            /// as the enum.
            module: bool,
        },
    }
When this method is used, bindgen will generate the provided representation for each C enum whose name matches the provided regex pattern.
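As a rough illustration of how calls to the proposed method might look, here is a minimal sketch. The `Builder` below is a hypothetical stand-in that only records the requested styles; it is not bindgen's real `Builder`, and the `enum_styles` field and the `example` function are assumptions made for this example.

```rust
/// Mirrors the enum proposed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnumRepresentation {
    Rust { non_exhaustive: bool },
    NewType { bitfield: bool, global: bool },
    Const { module: bool },
}

/// Hypothetical stand-in for bindgen's `Builder`: it only records
/// which representation was requested for which pattern.
#[derive(Debug, Default)]
pub struct Builder {
    enum_styles: Vec<(EnumRepresentation, String)>,
}

impl Builder {
    /// Same signature as in the proposal; the body is an assumption.
    pub fn enum_style<I, P>(mut self, representations: I, pattern: P) -> Self
    where
        I: IntoIterator<Item = EnumRepresentation>,
        P: AsRef<str>,
    {
        for representation in representations {
            self.enum_styles
                .push((representation, pattern.as_ref().to_owned()));
        }
        self
    }
}

/// Requests a non-exhaustive Rust enum for `Foo`, and both a bitfield
/// newtype and module-scoped constants for every enum matching `Bar.*`.
pub fn example() -> Builder {
    Builder::default()
        .enum_style([EnumRepresentation::Rust { non_exhaustive: true }], "Foo")
        .enum_style(
            [
                EnumRepresentation::NewType { bitfield: true, global: false },
                EnumRepresentation::Const { module: true },
            ],
            "Bar.*",
        )
}
```

Each call adds to the recorded styles rather than replacing them, which is one possible reading of "pick multiple representations"; the proposal itself does not spell this out.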
This interface has a CLI equivalent, the --enum-style argument, which takes values of the form <REPRS>=<REGEX>, where <REGEX> is a regex pattern and <REPRS> is a comma-separated sequence of enum representations. Each enum representation consists of a name, optionally followed by a parenthesized, comma-separated list of features:
rust(non_exhaustive?): See EnumRepresentation::Rust.
newtype(bitfield?, global?): See EnumRepresentation::NewType.
const(module?): See EnumRepresentation::Const.
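The argument format above could be parsed along the following lines. This is only a sketch under assumptions: the proposal does not define a grammar, so the choices here (features go in parentheses, the value is split on the first `=`, whitespace around names is ignored) are guesses, and `parse_enum_style`, `parse_representation` and `split_top_level` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnumRepresentation {
    Rust { non_exhaustive: bool },
    NewType { bitfield: bool, global: bool },
    Const { module: bool },
}

/// Splits `input` on commas that are not nested inside parentheses,
/// so `newtype(bitfield, global),const` yields two items.
fn split_top_level(input: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0;
    for (i, c) in input.char_indices() {
        match c {
            '(' => depth += 1,
            ')' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&input[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&input[start..]);
    parts
}

/// Parses one item such as `rust`, `rust(non_exhaustive)` or `newtype(bitfield, global)`.
fn parse_representation(text: &str) -> Result<EnumRepresentation, String> {
    let text = text.trim();
    let (name, inner) = match text.split_once('(') {
        Some((name, rest)) => {
            let inner = rest
                .strip_suffix(')')
                .ok_or_else(|| format!("missing `)` in `{}`", text))?;
            (name.trim(), inner)
        }
        None => (text, ""),
    };
    // Start with every feature disabled, then enable the listed ones.
    let mut repr = match name {
        "rust" => EnumRepresentation::Rust { non_exhaustive: false },
        "newtype" => EnumRepresentation::NewType { bitfield: false, global: false },
        "const" => EnumRepresentation::Const { module: false },
        other => return Err(format!("unknown representation `{}`", other)),
    };
    for feature in inner.split(',').map(str::trim).filter(|f| !f.is_empty()) {
        match (&mut repr, feature) {
            (EnumRepresentation::Rust { non_exhaustive }, "non_exhaustive") => *non_exhaustive = true,
            (EnumRepresentation::NewType { bitfield, .. }, "bitfield") => *bitfield = true,
            (EnumRepresentation::NewType { global, .. }, "global") => *global = true,
            (EnumRepresentation::Const { module }, "module") => *module = true,
            (_, other) => return Err(format!("unknown feature `{}` for `{}`", other, name)),
        }
    }
    Ok(repr)
}

/// Parses a full `<REPRS>=<REGEX>` value into its representations and pattern.
pub fn parse_enum_style(arg: &str) -> Result<(Vec<EnumRepresentation>, String), String> {
    // Split on the first `=`: representation names never contain one, the regex might.
    let (reprs, pattern) = arg
        .split_once('=')
        .ok_or_else(|| format!("expected <REPRS>=<REGEX>, got `{}`", arg))?;
    let representations = split_top_level(reprs)
        .into_iter()
        .map(parse_representation)
        .collect::<Result<Vec<_>, _>>()?;
    Ok((representations, pattern.to_owned()))
}
```

With this sketch, `rust,const=Foo` yields a plain Rust enum and plain constants for `Foo`, while a feature that does not belong to the named representation, as in `rust(bitfield)=Foo`, is rejected.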
Reference-level explanation
This feature would be fairly self-contained, and its only interaction would be with the existing enum representation features.
Internally, the RegexSets for each enum representation would still be stored separately. However, a declarative macro would be used to generate both EnumRepresentation and a constant slice with all the possible values that EnumRepresentation could have, that is, one entry per combination of features for each of the current representations.
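The original listing of that constant is not available here, so the following is only a sketch of what such a macro could look like. It assumes every feature is a `bool` field, and for brevity it generates an `all()` function returning a `Vec` instead of a true `const` slice. The macro name `enum_representations` and the ordering of the generated values are assumptions.

```rust
/// Generates `EnumRepresentation` plus a function enumerating every
/// combination of its boolean features (hypothetical macro).
macro_rules! enum_representations {
    ($( $variant:ident { $( $field:ident ),* $(,)? } ),* $(,)?) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum EnumRepresentation {
            $( $variant { $( $field: bool ),* } ),*
        }

        impl EnumRepresentation {
            /// Every value the enum can take, variant by variant.
            #[allow(unused_assignments)]
            pub fn all() -> Vec<EnumRepresentation> {
                let mut out = Vec::new();
                $(
                    {
                        // One bit per feature: 2^n combinations for n features.
                        let names: &[&str] = &[$( stringify!($field) ),*];
                        for bits in 0..(1usize << names.len()) {
                            let mut bit = 0u32;
                            $(
                                let $field = ((bits >> bit) & 1) == 1;
                                bit += 1;
                            )*
                            out.push(EnumRepresentation::$variant { $( $field ),* });
                        }
                    }
                )*
                out
            }
        }
    };
}

enum_representations! {
    Rust { non_exhaustive },
    NewType { bitfield, global },
    Const { module },
}
```

With the three current representations this yields eight values: two for `Rust`, four for `NewType` and two for `Const`. Adding a feature then means adding one field name to the macro invocation.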
This constant would allow us to generate another slice of type &[(EnumRepresentation, RegexSet)], which would replace all the existing fields of BindgenOptions related to enum representation. Iterating over it would allow us to choose the right representations for each enum.
With this approach, adding a new feature to an existing representation or adding a new representation would require fewer changes and should be easier to maintain.
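The lookup over that slice might be sketched as follows. To keep the example free of external crates, `NameSet` is a hypothetical stand-in for bindgen's `RegexSet` that matches exact names only, and `representations_for` is an assumed helper name; neither exists in bindgen.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnumRepresentation {
    Rust { non_exhaustive: bool },
    NewType { bitfield: bool, global: bool },
    Const { module: bool },
}

/// Hypothetical stand-in for `RegexSet`: matches exact enum names only.
pub struct NameSet(Vec<String>);

impl NameSet {
    pub fn new(names: &[&str]) -> Self {
        NameSet(names.iter().map(|n| n.to_string()).collect())
    }

    pub fn matches(&self, name: &str) -> bool {
        self.0.iter().any(|n| n.as_str() == name)
    }
}

/// Walks the table once and collects every representation whose set
/// matches `enum_name`, preserving table order.
pub fn representations_for(
    table: &[(EnumRepresentation, NameSet)],
    enum_name: &str,
) -> Vec<EnumRepresentation> {
    table
        .iter()
        .filter(|(_, set)| set.matches(enum_name))
        .map(|(repr, _)| *repr)
        .collect()
}
```

An empty result means no style was requested for that enum; how a default style would be applied in that case is not covered by this sketch.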
Drawbacks
The main drawback is the fact that this is a breaking change, as it would deprecate the existing interface for enum representation.
Another drawback is related to allowing multiple representations for a single C enum, as the current behavior of bindgen is to choose one documented option by default. That is, if the user calls bindgen with --rustified-enum Foo and --constified-enum Foo, only the Rust representation for Foo will be chosen by bindgen. With the new interface, --enum-style rust,const=Foo would generate both representations. This is a breaking change and might cause unexpected behavior for users who rely on bindgen choosing one of the two.
Additionally, allowing multiple representations for a single C enum would make bindgen more likely to generate invalid Rust code. For example, calling bindgen with --enum-style rust,newtype=Foo would produce both a Rust enum and a newtype for Foo, which would cause a name collision.
Finally, the heavy reliance on macros makes this code less intuitive for new contributors.
Rationale and alternatives
An alternative would be simply to not implement this RFC, which would keep the enum representation interface prone to bloat and increasingly difficult to maintain.
Unresolved questions
Currently it is not clear if we should prevent the generation of invalid Rust code by adding extra checks which guarantee that incompatible representations won't be used for the same C enum.
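If such checks were added, one of them could look roughly like the sketch below. It only encodes the single collision named in the drawbacks (a Rust enum and a newtype both defining a type named after the C enum); whether other pairs are incompatible is left open. `defines_named_type` and `check_compatible` are hypothetical helpers, not part of bindgen.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnumRepresentation {
    Rust { non_exhaustive: bool },
    NewType { bitfield: bool, global: bool },
    Const { module: bool },
}

/// Assumption: only the `Rust` and `NewType` representations emit a
/// type item carrying the C enum's own name.
pub fn defines_named_type(repr: EnumRepresentation) -> bool {
    matches!(
        repr,
        EnumRepresentation::Rust { .. } | EnumRepresentation::NewType { .. }
    )
}

/// Rejects a set of representations that would define the same type twice.
pub fn check_compatible(
    enum_name: &str,
    reprs: &[EnumRepresentation],
) -> Result<(), String> {
    let named: Vec<EnumRepresentation> = reprs
        .iter()
        .copied()
        .filter(|r| defines_named_type(*r))
        .collect();
    if named.len() > 1 {
        Err(format!(
            "`{}` would be defined more than once: {:?}",
            enum_name, named
        ))
    } else {
        Ok(())
    }
}
```

Under this rule `rust,newtype=Foo` is rejected while `rust,const=Foo` is accepted, matching the two examples discussed in the drawbacks.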
Future possibilities
The main advantage of this design is how easily it can be extended with new representations or new features. Examples of this are #2908 and #2980, which could be integrated by adding new fields to EnumRepresentation::Rust and by adding a new variant to EnumRepresentation, respectively.
This interface would be easily representable if bindgen were to adopt a configuration format like TOML, as enum styles could be represented by arrays under an enum_style key.
cc @emilio @jbaublitz