Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve pointer provenance in the Rust backend. #870

Merged

Conversation

sunfishcode
Copy link
Member

@sunfishcode sunfishcode commented Feb 27, 2024

Use the new Pointer and Length types in the Rust backend to emit code that uses *mut c_void and usize instead of i32 when working with pointers and array lengths.

To represent PointerOrI64, use a MaybeUninit<u64>, since that type can hold any u64 and is documented to also preserve provenance.

This change happens to get the generated Rust code close to supporting memory64, however it isn't complete; the abi code still emits hard-coded + 4 offsets for loading the length of a pointer+length pair in memory, and record layout isn't aware of 64-bit offsets, and possibly other things.

Opening as a draft PR, as it depends on bytecodealliance/wasm-tools#1423.

Use the new `Pointer` and `Length` types in the Rust backend to emit
code that uses `*mut c_void` and `usize` instead of `i32` when working
with pointers and array lengths.

To represent `PointerOrI64`, use a `MaybeUninit<u64>`, since that type
can hold any `u64` and is documented to also preserve provenance.

This change happens to get the generated Rust code close to supporting
memory64, however it isn't complete; the abi code still emits hard-coded
`+ 4` offsets for loading the length of a pointer+length pair in memory.
@sunfishcode sunfishcode force-pushed the sunfishcode/pointer-length branch from 2238944 to 407e94a Compare February 27, 2024 21:02
@sunfishcode sunfishcode marked this pull request as ready for review February 27, 2024 21:02
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I'm mostly curious if you think we can remove I32ToP and PToI32 entirely

crates/rust/src/lib.rs Outdated Show resolved Hide resolved
// Neither a `u64` nor a pointer type can portably do both, so we use
// `MaybeUninit<u64>`, since `MaybeUninit` is [documented] to preserve
// provenance.
// [documented]: https://github.com/rust-lang/rfcs/blob/master/text/3559-rust-has-provenance.md#reference-level-explanation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My read of the text in the RFC here is that MaybeUninit<T> has the same provenance of T where u64 has no provenance here, but do you know if the RFC text meant differently?

Either way though I don't know of anything better so I think it's also ok to leave as-is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere, the RFC discusses how the MaybeUninit rule was added to allow arrays of MaybeUninit<u8> to hold pointer values, in which case the provenance isn't related to the T. But to be sure, I've now asked for clarification on Zulip.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok then in that case it sounds like exactly what we want here 👍

Comment on lines 1310 to 1313
// Convert an `i32` into a pointer.
Bitcast::I32ToP => {
format!("{} as *mut ::core::ffi::c_void", operand)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be removed entirely? It seems like we should avoid generating this ever because this implies that provenance can be manifest out of nowhere right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and similarly below, could PToI32 be removed entirely?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we have a variant of an i32 and a pointer, we unify the two types to a pointer type, so that when the variant does hold a pointer, it preserves provenance. When the variant holds an i32, there is no provenance anyway, so it's ok that we're doing these conversions without preserving provenance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha I see, makes sense!

Could you add some docs with respect to provenance on the Bindgen enum variants? If I understand right it's intended that I32ToP adds arbitrary provenance and PToI32 removes that arbitrary provenance, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added some docs.

The way it's discussed in the Rust provenance docs, they say that what I32ToP does is creates a pointer with no provenance. Once the API is stablized in Rust, the functions that will replace these as casts have names like without_provenance.

And switch to `add` from `byte_add`.
@sunfishcode sunfishcode force-pushed the sunfishcode/pointer-length branch from 9d66ba9 to d295a51 Compare February 27, 2024 23:38
@sunfishcode sunfishcode force-pushed the sunfishcode/pointer-length branch from 8b38e28 to 4f46808 Compare February 28, 2024 00:08
@alexcrichton alexcrichton added this pull request to the merge queue Feb 28, 2024
Merged via the queue into bytecodealliance:main with commit 4a2bf63 Feb 28, 2024
12 checks passed
@sunfishcode sunfishcode deleted the sunfishcode/pointer-length branch February 28, 2024 01:30
sunfishcode added a commit to sunfishcode/wit-bindgen that referenced this pull request Mar 5, 2024
Teach the C backend to use C pointer types. This is similar to bytecodealliance#870
for Rust. C is less concerned about provenance, but it reduces the
amount of casting needed in the generated code.

While here, I also optimized the code for list deallocation slightly.
sunfishcode added a commit to sunfishcode/wit-bindgen that referenced this pull request Mar 5, 2024
Teach the C backend to use C pointer types. This is similar to bytecodealliance#870
for Rust. C is less concerned about provenance, but it makes the ABI
easier to understand and reduces the amount of casting needed in the
generated code.

While here, I also optimized the code for list deallocation slightly.
github-merge-queue bot pushed a commit that referenced this pull request Mar 6, 2024
* C: Use pointer types

Teach the C backend to use C pointer types. This is similar to #870
for Rust. C is less concerned about provenance, but it makes the ABI
easier to understand and reduces the amount of casting needed in the
generated code.

While here, I also optimized the code for list deallocation slightly.

* Fix casting for variants.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants