Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for alloc_layout_extra #55724

Open
Amanieu opened this issue Nov 6, 2018 · 34 comments
Open

Tracking issue for alloc_layout_extra #55724

Amanieu opened this issue Nov 6, 2018 · 34 comments
Labels
A-allocators Area: Custom and system allocators B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Amanieu
Copy link
Member

Amanieu commented Nov 6, 2018

This issue tracks additional methods on Layout which allow layouts to be composed to build complex layouts.

pub const fn padding_needed_for(&self, align: usize) -> usize;
pub const fn repeat(&self, n: usize) -> Result<(Layout, usize), LayoutErr>;
pub const fn repeat_packed(&self, n: usize) -> Result<Layout, LayoutErr>;
pub const fn extend_packed(&self, next: Layout) -> Result<Layout, LayoutErr>;
pub const fn dangling(&self) -> NonNull<u8>;

The main use case is to construct complex allocation layouts for use with the stable global allocator API. For example:

One concern is that not many of these methods have been extensively used in practice. In the examples given above, only extend, array and align_to are used, and I expect that these will be the most used in practice. padding_needed_for is used in the implementation of Rc::from_raw and Arc::from_raw, but in theory could be superseded by the offset returned by extend.

@alexcrichton alexcrichton added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Nov 6, 2018
@alexcrichton
Copy link
Member

After reading over these again, we may want to either include (Layout, usize) in the return value of array or remove the usize return value from repeat, they seem like similar use cases to warrant having similar return values.

@Amanieu
Copy link
Member Author

Amanieu commented Nov 6, 2018

In the case of array<T> the offset is already available as size_of::<T>(). So it doesn't make sense to add a usize to the return value. I would rather remove the usize from repeat.

@alexcrichton
Copy link
Member

Why isn't the same argument true for repeat? Isn't the size available via layout.size()?

@Amanieu
Copy link
Member Author

Amanieu commented Nov 7, 2018

Because in a Layout the size does not necessarily have to be a multiple of the alignment. But in a Rust type this is always the case.

@RalfJung
Copy link
Member

RalfJung commented Nov 7, 2018

See #55747 for an issue caused by misuse of the current API. Would be great if we could do something to avoid such mistakes.

@Amanieu
Copy link
Member Author

Amanieu commented Nov 7, 2018

The fundamental question here is: do we want to allow layouts where the size is not a multiple of the alignment? And if we do then we will want to add a convenience method to round up a Layout's size to its alignment.

@RalfJung
Copy link
Member

RalfJung commented Nov 7, 2018

Also if we do we should at least add a fat warning in the extend docs that likely one wants to finish this off by doing pad_to_align().

And maybe the default allocator should assert! that the size it is allocating is divisible by the alignment?

pietroalbini added a commit to pietroalbini/rust that referenced this issue Nov 10, 2018
Fix Rc/Arc allocation layout

* Rounds allocation layout up to a multiple of alignment
* Adds a convenience method `Layout::pad_to_align` to perform rounding

Closes rust-lang#55747

cc rust-lang#55724
@RalfJung
Copy link
Member

FWIW, we actually have explicit tests that allocate with size 8, alignment 16 -- i.e., with a size that is not a multiple of the alignment: heap::alloc_system_overaligned_request and heap::std_heap_overaligned_request.

@glandium
Copy link
Contributor

I stumbled upon padding_needed_for and pad_to_align, and it seems they are being too cautious.

That is, per https://doc.rust-lang.org/nightly/src/core/alloc.rs.html#95, layout.size() is always <= usize::MAX - (layout.align() - 1).

Which means:

  • The maximum value layout.size() can have is already aligned for layout.align() (layout.align() being a power of two, usize::MAX - (layout.align() - 1) is a multiple of layout.align())
  • Incidentally, any value smaller than that maximum value will align at most to that maximum value.

IOW, pad_to_align can not return Err(LayoutErr), except for the layout not respecting its invariants, but we shouldn't care about that.

So it seems to me we can drop the Result and make the method return a Layout directly.

@TimDiekmann
Copy link
Member

What exactly needs to be done to stabilize this? What questions has to be answered before?

@p-avital
Copy link

Hello, just wondering if there's any stable alternative to Layout::array<T>(n) to allocate arrays manually right now ? This feature is the only block for me to switch my build to stable :(

@SimonSapin
Copy link
Contributor

The alternative is Layout::from_size_align(size_of::<T>().checked_mul(n)?, align_of::<T>())?

@ghost
Copy link

ghost commented Jul 28, 2019

I think there might be a bug in the implementation of repeat() when it comes to overaligned Layouts. It currently computes the padded size and then multiples that by the repeat count. However, this arguably over-pads, since the last array element does not need to be padded.

@CAD97
Copy link
Contributor

CAD97 commented Nov 9, 2019

I'd like to propose stabilization for at least extend, array, align_to, and pad_to_align, after #66256 is merged (if it is going to be). cc @rust-lang/libs

The longer these go without being stabilized, the more ad-hoc re-implementations we're going to see. I know myself that I wrote incorrect layout calculation for #[repr(C)] structs in edge cases before switching to directly vendoring the standard library's Layout manipulation.

What is stabilized

The functions Layout::array<T>(n: usize) -> Result<Layout, LayoutErr>, Layout::extend(&self, next: Layout) -> Result<(Layout, usize), LayoutErr>, Layout::align_to(&self, align: usize) -> Result<Layout, LayoutErr>, and Layout::pad_to_align(&self) -> Layout.

Combined, these functions allow code to construct and manipulate #[repr(C)] layouts while letting the standard library handle the awkward edge cases. For example uses, consider the usage in hashbrown and this example for the reference:

/// Returns the layout of a repr(C) struct with the given field layouts in the given order,
/// and the individual field's offsets.
pub fn repr_c(fields: &[Layout]) -> (Layout, Vec<usize>) {
    let mut offsets = Vec::new();
    let mut layout = Layout::from_size_align(0, 0).unwrap();
    for field in fields {
        let (new_layout, offset) = layout.extend(*field).unwrap();
        offsets.push(offset);
        layout = new_layout;
    }
    // Don't forget trailing padding!
    (layout.pad_to_align().unwrap(), offsets)
}

@RalfJung
Copy link
Member

RalfJung commented Nov 11, 2019

I am still worried about having to do this:

    // Don't forget trailing padding!
    (layout.pad_to_align().unwrap(), offsets)

I saw experienced Rust compiler devs forget to call this. This is a huge footgun.

The heavy-weight approach to this would be a dedicated type for layouts where this padding has already been done -- letting the type-system track if pad_to_align still needs to be called. But that might be overkill. However, I feel the least we could do is to

  • Call this out very clearly in the docs. The current docs even say about extend that "The resulting layout will be the same as that of a C struct containing two fields with the layouts of self and next, in that order", which is factually wrong.
  • Add a method like your repr_c to Layout, so that people don't have to write their own loop to turn a list of fields into a layout. Basically, extend should be considered a low-level dangerous-to-use helper methods and we should nudge people towards using safer, higher-level APIs.

@CAD97
Copy link
Contributor

CAD97 commented Nov 11, 2019

Automatically padding to align or requiring padding to align is probably not desirable. Consider something like Swift's decoupled size/stride; it's not incorrect to have a layout without trailing padding to align, it's just that this is the common default for both repr(C) and repr(Rust) (any representation where size=stride).

But I definitely agree that we should fix the extend docs (I'll draft a PR today) and maybe we should add a repr_c fn, though ideally the signature would be approximately fn Layout:::repr_c<const N: usize>(fields: [Layout; N]) -> (Layout, [usize; N]) rather than requiring allocation for Vec.

@RalfJung
Copy link
Member

Automatically padding to align or requiring padding to align is probably not desirable. Consider something like Swift's decoupled size/stride; it's not incorrect to have a layout without trailing padding to align, it's just that this is the common default for both repr(C) and repr(Rust) (any representation where size=stride).

Ever decoupling this for real in Rust will be near impossible due to lots of unsafe code already assuming size == stride. But yes.

But I definitely agree that we should fix the extend docs (I'll draft a PR today) and maybe we should add a repr_c fn, though ideally the signature would be approximately fn Layout:::repr_c(fields: [Layout; N]) -> (Layout, [Layout; N]) rather than requiring allocation for Vec.

Just make it repr_c(fields: &[Layout]), no need to allocate. Or repr_c(fields: impl IntoIter<Item=&Layout>) if you want to be fancy.

@CAD97
Copy link
Contributor

CAD97 commented Nov 11, 2019

Just make it repr_c(fields: &[Layout]), no need to allocate. Or repr_c(fields: impl IntoIter<Item=&Layout>) if you want to be fancy.

Sorry, I had a typo there. fn repr_c wants to provide the offsets of the fields, which requires returning an array or an out parameter for the space. (If you don't need offsets, just use Layout::new::<S>() for some #[repr(C)] struct S.) Here's a potential const-generic impl:

pub fn repr_c<const N: usize>(fields: [Layout; N]) -> Result<(Layout, [usize; N]), LayoutErr> {
    unsafe {
        let mut offsets: MaybeUninit<[usize; N]> = MaybeUninit::uninit();
        let mut layout = Layout::from_size_align_unchecked(0, 0);
        let mut offset = offsets.as_mut_ptr().cast::<usize>();
        for field in &fields[..] {
            let (new_layout, this_offset) = layout.extend(*field)?;
            layout = new_layout;
            ptr::write(offset, this_offset);
            offset = offset.offset(1);
        }
        Ok((layout.pad_to_align()?, offsets.assume_init()))
    }
}

(NB: this is unsound as currently implemented as Layout cannot even temporarily have align 0. But it shows effectively what the impl would be.)

@RalfJung
Copy link
Member

Oh I see. Yeah that's kind-of annoying...

@pnkfelix
Copy link
Member

(NB: this is unsound as currently implemented as Layout cannot even temporarily have align 0. But it shows effectively what the impl would be.)

Its been ages since I looked at this stuff, but: Can't you use 1 instead of 0 as the initial align?

@CAD97
Copy link
Contributor

CAD97 commented Nov 20, 2019

This is what I'm using currently, as it avoids the need for MaybeUninit wrangling to avoid zero-initializing a small array. But yes, I think it should work fine if you start with align 1.

pub fn repr_c<const N: usize>(fields: [Layout; N]) -> Result<(Layout, [usize; N]), LayoutErr> {
    let mut offsets: [usize; N] = unsafe { mem::zeroed() }; // replacement for `[0; N]` which doesn't work yet
    let mut layout = fields[0];
    for i in 1..N {
        let (new_layout, this_offset) = layout.extend(fields[i])?;
        layout = new_layout;
        offsets[i] = this_offset;
    }
    Ok((layout.pad_to_align()?, offsets))
}

EDIT 2019-04-15

const generics and alloc_layout_extra have improved a bit from when I first wrote fn repr_c. Here's the updated version, with purely safe code:

pub fn repr_c<const N: usize>(fields: [Layout; N]) -> Result<(Layout, [usize; N]), LayoutErr> {
    let mut offsets: [usize; N] = [0; N];
    let mut layout = fields[0];
    for i in 1..N {
        let (new_layout, this_offset) = layout.extend(fields[i])?;
        layout = new_layout;
        offsets[i] = this_offset;
    }
    Ok((layout.pad_to_align(), offsets))
}

bors added a commit that referenced this issue Dec 3, 2019
Layout::pad_to_align is infallible

As per [this comment](#55724 (comment)) (cc @glandium).

> Per https://github.com/rust-lang/rust/blob/eb981a1/src/libcore/alloc.rs#L63-L65, `layout.size()` is always <= `usize::MAX - (layout.align() - 1)`.
>
> Which means:
>
> * The maximum value `layout.size()` can have is already aligned for `layout.align()` (`layout.align()` being a power of two, `usize::MAX - (layout.align() - 1)` is a multiple of `layout.align()`)
> * Incidentally, any value smaller than that maximum value will align at most to that maximum value.
>
> IOW, `pad_to_align` can not return `Err(LayoutErr)`, except for the layout not respecting its invariants, but we shouldn't care about that.

This PR makes `pad_to_align` return `Layout` directly, representing the fact that it cannot fail.
@JohnTitor JohnTitor added the C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC label Feb 28, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Apr 20, 2020
Stabilize most common subset of alloc_layout_extras

Tracking issue: rust-lang#55724

Specifically, this stabilizes:

```rust
pub fn Layout::align_to(&self, align: usize) -> Result<Layout, LayoutErr>;
pub fn Layout::pad_to_align(&self) -> Layout;
pub fn Layout::extend(&self, next: Layout) -> Result<(Layout, usize), LayoutErr>;
pub fn Layout::array<T>(n: usize) -> Result<Layout, LayoutErr>;
```

Methods that are tracked by rust-lang#55724 but are not stabilized here:

```rust
pub fn Layout::padding_needed_for(&self, align: usize) -> usize;
pub fn Layout::repeat(&self, n: usize) -> Result<(Layout, usize), LayoutErr>;
pub fn Layout::repeat_packed(&self, n: usize) -> Result<Layout, LayoutErr>;
pub fn Layout::extend_packed(&self, next: Layout) -> Result<Layout, LayoutErr>;
```

Combined, these stabilized functions allow code to construct and manipulate `repr(C)` layouts while letting the standard library handle correctness in the face of edge cases. For example use cases, consider the usage in [hashbrown](https://github.com/Amanieu/hashbrown/blob/2f2af1d/src/raw/mod.rs#L143), [crossbeam-skiplist](https://github.com/crossbeam-rs/crossbeam-skiplist/blob/master/src/base.rs#L99), [pointer-utils/slice-dst](https://github.com/CAD97/pointer-utils/blob/92aeefeed9399f28d1b1654b63f8dcbe1242d8d4/crates/slice-dst/src/layout_polyfill.rs), and of course the standard library itself.

Providing a higher-level API such as `Layout::repr_c<const N: usize>(fields: [Layout; N]) -> Result<(Layout, [usize; N]), LayoutErr>` is blocked on const generics, which are a ways off. Providing an API that doesn't provide offsets would be quite suboptimal, as the reason for calculating the layout like this rather than `Layout::new` is to get the field offsets.

The primary issue with the current API is having to call `.pad_to_align()` to match the layout of a `repr(C)` struct. However, I think this is not just a (failing? limitation?) of the API, but rather intrinsic complexity. While all Rust-defined types have size==stride, and probably will for the foreseeable future, there is no inherent reason why this is a limitation of all allocations. As such, the `Layout` manipulation APIs shouldn't impose this limitation, and instead the higher level api of `repr_c` (or just plain old using `Layout::new`) can make keeping it simple.

cc @matklad r? @rust-lang/libs
@CAD97
Copy link
Contributor

CAD97 commented May 9, 2020

Status update:

#69362 stabilized (in 1.44) Layout::align_to, pad_to_align, extend, and array, and #70362 added Layout::dangling to this feature gate.

@Lokathor
Copy link
Contributor

Lokathor commented Jun 5, 2020

I'll speak a moment about repeat as a potential user of it with a practical example. My use case is as follows: I would like for a bitmap allocation to have H scanlines. Each scanline is W pixels of size P, and each scanline is also optionally over-aligned to 16 or 32.

This can't be done using array because no proper type for this exists, so it would be done with something like repeat I suppose. In this case, I need the trailing pad bytes of a scanline to also be trailing on the end of the last scanline as well.

However, manually calling pad_to_align to pad out the scanline layout before "repeating" that value wouldn't be the end of the world.

@KodrAus KodrAus added A-allocators Area: Custom and system allocators Libs-Tracked Libs issues that are tracked on the team's project board. labels Jul 29, 2020
@sunjay
Copy link
Member

sunjay commented Feb 26, 2021

Could repeat be stabilized? I'm using it to know the layout of various sizes of arrays. With repeat I can just store the Layout of a single item and compute the layout of each array on the fly with the array's length. I'm also not storing any proper type, so this can't just be done with array.

@scottmcm
Copy link
Member

scottmcm commented Apr 8, 2022

For padding_needed_for, it could be nice to have it take a specific type (see https://github.com/rust-lang/rust/pull/95361/files#diff-02049689ae3bb7ac98af3418ad8fdca219bfa1227cb6cad31afc72bdbae30e27R28) that's always a valid alignment, rather than needing to document that

The return value of this function has no meaning if align is not a power-of-two.

@dfreese
Copy link
Contributor

dfreese commented May 26, 2022

Curious, are there any major blockers for stabilizing dangling? I've kept an internal implementation of dangling for some aligned buffer code, but it would be great if that could be removed if there are no major concerns that have come up.

@thomcc
Copy link
Member

thomcc commented May 26, 2022

Really, all of these are pretty good and useful, not just dangling. I'd be in favor of stabilizing them.

@RalfJung
Copy link
Member

There is no "dangling" in the summary issue at the top; is that one incomplete?

@CAD97
Copy link
Contributor

CAD97 commented May 26, 2022

dangling got added to this feature by #70362 but never added to the OP.

padding_needed_for doesn't have any outstanding questions (that I know of1) so can likely be stabilized. The same for repeat_packed and extend_packed.

I know repeat is the one that's really desirable to add, but it's the one with an open question: should it be

// modulo dealing with offsets
let mut this = self;
for _ in 1..n {
    this = this.extend(self)?;
}
this

and not have trailing padding, or if it should include trailing padding like array does. The current implementation does include trailing padding, but I'm personally2 in favor of repeat just being defined as a sequence of extend and not having trailing padding (and have attempted to write other code defensively not assuming trailing padding is added). I understand both options, though, so that'll need to be decided in FCP. This code is also notoriously perf-sensitive (see e.g. fn size_align being perf critical to exist despite only being used in one location, somehow).

I'll go ahead and prepare a stabilization PR with the cleanup I think is desirable and we can FCP that. (EDIT: #97438)

Footnotes

  1. actually, I just noticed that its documentation has a line that

    The return value of this function has no meaning if align is not a power-of-two.
    but align is required to be a power of two (it's even encoded in the validity requirement, now).

  2. disclaimer: I'm a member of wg-alloc but this is solely my own opinion and not based on wg-alloc discussion.

@dfreese
Copy link
Contributor

dfreese commented May 26, 2022

Thanks for all of the detail and putting that stabilization PR together, @CAD97!

@scottmcm
Copy link
Member

scottmcm commented May 27, 2022

padding_needed_for doesn't have any outstanding questions

Well, I put one in #55724 (comment) above

The return value of this function has no meaning if align is not a power-of-two.

but align is required to be a power of two (it's even encoded in the validity requirement, now)

I think you're looking at the wrong thing -- there's an align: usize parameter so #95361 didn't fix that.

I'd be happy to make a PR to expose that type so it could be used in padding_needed_for -- any thoughts where it would best fit? (It's current mem::ValidAlign, but it could be core::layout::Alignment or something...)

EDIT: I opened an ACP to add an Alignment newtype to core: rust-lang/libs-team#108

@Kixiron
Copy link
Member

Kixiron commented Jan 25, 2023

Does Layout::repeat() have any holdouts that'd keep it from stabilization?

@ChrisDenton
Copy link
Member

I think at this point Layout::repeat() just needs some improved docs, similar to Layout::extend(), to mention the importance of pad_to_align when e.g. creating repr(C) layouts.

cc @CAD97 who last tried to stabilize some of these methods.

@dtolnay
Copy link
Member

dtolnay commented Nov 23, 2023

I have updated the API summary at the top of this issue with the current const status of all of these functions. They were made const by #102207, and a subset of them subsequently reverted to non-const by #110393.

before #102207 before #110393 after #110393
align_to const
pad_to_align const const
padding_needed_for const const const
repeat const
extend const
repeat_packed const
extend_packed const
array const const
dangling const const const

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-allocators Area: Custom and system allocators B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests