-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(enum): u16/u32 discriminants #315
Conversation
@frol @dzmitry-lahoda - i have added u16 enum discriminant. As for Alternative is delete EnumExt, so it would not make assumption about discriminant size (compatible with u32 in future). |
Seems all tests work except snaps, do not change snaps until have some decision about EnumExt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see little point in generalizing just up to u16.
this has to be generalized all the way up to https://docs.rs/borsh/1.5.1/borsh/schema/type.DiscriminantValue.html .
so i8, u16/i16, u32/i32 and i64 have to be covered (all of these are Into<i64>
)
Also taking into account tag_width in https://docs.rs/borsh/1.5.1/borsh/schema/enum.Definition.html#
Probably EnumExt can be left intact, but EnumExtI8, EnumExtU16, EnumExtI16,
EnumExtU32, EnumExtI32, EnumExtI64 can be added instead. Doing that with a macro won't be too space-consuming.
input | ||
.attrs | ||
.iter() | ||
.find(|x| x.path() == REPR)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, this behaviour should be controlled by borsh-specific item attribute, say #[borsh(enum_repr = "i16")]
,
not by repr
, which is used to control memory layout in program, which is, strictly speaking, independent of how it's serialized for later transfer or saving to disk (which is what serialization is usually used for).
Say, an enum may not have #[repr(u16)]
and have all discriminants withing u8
range, but still have #[borsh(enum_repr = "u16")]
and use 2 bytes for tag serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming we have borsh enum_repr, how should borsh handle this
fn main(){
#[borsh(enum_repr = "u16")]
#[repr(u32)]
enum A {
Q = 1u8,
}
}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may use u16
instead of "u16"
? may just do borsh(rerp=
instead of #[borsh(enum_repr` because it is on enum anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may it be like this?
#[borsh(repr = u16)]
enum A {
Q = 1u8,
}
and fail compile same way #repr
fails because u8 was set (so borsh rerp interpreted types same way/algorithm) as native repr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may it be like this?
looks good, repr
should then check it annotates an enum item.
fails because u8 was set
well, one would've to change 1u8 to 1 to compile with borsh attribute.
Specifying Q = -10
would allow annotating with #[repr(i8)]
and #[borsh(repr = i16)]
at the same time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
may it be like this?
Still, maybe naming it enum_tag
will be better, rust's repr
has other subattributes, not related to enums at all
@dzmitry-lahoda , actually , as https://docs.rs/borsh/1.5.1/borsh/schema/enum.Definition.html has |
What is reason have signed instead of unsigned? Can we have all UNsigned? We can convert all singed to "singed" byte by byte. I am concerned because I do not recall cases when needed negative variants. We have errors enum in protobuf(blockchain domain) - only unsigned, windows winapi/COM return codes - unsigned, http codes - unsigned. Do not recall places where need singed enums? |
Well, if signed discriminants had been useless, rust wouldn't have added support for them. Doing it better really calls for adding that |
Alternatively ,existing |
giving an example in runnable playground , in case above phrasing was too confusing fn main() {
for int_type in ["u8", "i8", "i16", "u16", "i32", "u32", "i64"] {
println!("int_type: {}", int_type);
let tag_width_encoded = forward(int_type);
let (width, signed) = reverse(tag_width_encoded);
println!("tag_width: {}, signed: {}", width, signed);
println!("########");
}
}
fn forward(int_type: &str) -> u8 {
let (width, signed): (u8, bool) = match int_type {
"u8" => (1, false),
"i8" => (1, true),
"u16" => (2, false),
"i16" => (2, true),
"u32" => (4, false),
"i32" => (4, true),
"i64" => (8, true),
"u64" | _ => panic!("unsupported"),
};
let signed_bit = if signed { 2u8.pow(7) } else { 0u8 };
let tag_width_encoded = width | signed_bit;
println!("width {}, tag_width_encoded {}", width, tag_width_encoded);
tag_width_encoded
}
fn reverse(tag_width_encoded: u8) -> (u8, bool) {
// ############### reverse direction for https://docs.rs/borsh/latest/borsh/schema/struct.BorshSchemaContainer.html#method.validate
let signed = tag_width_encoded.leading_ones() > 0;
let width = tag_width_encoded & !(2u8.pow(7));
(width, signed)
} |
Seems OK, supports unsigned and signed and handles backward compatibility. Will update PR to described pattern. Thank you |
After attempt to handle via proposed way, couple observations:
Said that, I think right code for now is next:
All of these changes are backward compatble and as I think forward too. I do not think proposed design with many traits per type nor suggested hack to support for negative are good solutions. |
negative discriminants are not supported at all on current master (with i didn't really get the meaning of points 2. and 3. of the first numbered bullets block, but i created a playground https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=60fcba754c164f666dc3fc2e678d9bd6 to illustrate that same literal can be interpreted as different types depending on containing struct.
i'll take a look at your implementation, and maybe scratch up an alternative one, and we'll choose a simpler one
why not?
i've created a playground https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=255d369b7ab66eda0dd0afb93ecb8b8a to showcase that
see above. there's no need to rush this, it'a complex feature
hack is to express sign without breaking the schema, a separate |
@dj8yfo thank you for response.
I do not feel this is fully reasonable, i think in some cases, specifically in my case, they are tightly coupled and cannot be decoupled because of how rust works with enums. i feel that better statement is In general, So I think that introducing Specifically, rule to be: So: #[borsh(use_discriminant = true)]
#[repr(u16)] will serde into 1 byte. But #[borsh(use_discriminant = true, tag_width = 2)]
#[repr(u16)] into 2 bytes. Next will be errors (new tag_width meta added, leads to new checks - fine) (some of them may be just for now), build fail test cases: #[borsh(use_discriminant = true, tag_width = 2)]
#[repr(i16)] #[borsh(use_discriminant = false, tag_width = 2)]
#[repr(u16)] #[borsh(use_discriminant = true, tag_width = 2)]
#[repr(u8)] #[borsh(use_discriminant = true, tag_width = 2)]
#[repr(u32)] #[repr(C)]
#[repr(transparent)]
#[repr(usize)]
#[repr(isize)] |
pleaze update pr from master branch
i don't won't to reuse rust's please, do this as a separate attribute if someone finds it impossible to specify literals for both doing a mandatory per-variant adding a once-per-item |
I find ConcernsLet have example of enum where user have set Why would reading variant of discriminant to read into From that, why at all So I think that enum_tag is useless. And there are 2 options. What if borsh has u16 and repr has u32? And what is repr has u32 and borsh has u16? What is expected behavior in all parts of borsh serde? Is ProposalThere is If If during ser, actual rust repr data is bigger than borsh placeholder, ser error. |
please take a look into my comments. I still think my current solution is best. As it avoids need to ANSWER all these questions now. It just forces repr to be in sync with width if new tag_width was set. You may consider relax it later as needed, including removal any connection to repr. But forcing tag_witdth to be same as repr, if you want to use new tag - is super safe backward and forward compatible way. |
that will be documented |
I closing this PR for now. I think docs should be first, than impl should go as it is unclear what is expected. |
Added support for
tag_width
to support u16/u32 enums.tag_width
requiresuse_discriminant = true
andrepr(u8/u16/u32)
to ensure backward and forward compatiblity.See #315 (comment) for full design