Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the Buffer trait #908

Closed
wants to merge 5 commits into from
Closed

Conversation

notgull
Copy link
Contributor

@notgull notgull commented Oct 28, 2023

This is for discussion purposes before I go through the codebase and replace the
rest of the code.

Closes #81

@sunfishcode @SUPERCILEX @ids1024 You all seem to work a lot with rustix, any thoughts on this API before I commit to it? It doesn't really support vectored reading but neither does libstd.

This is for discussion purposes before I go through the codebase and replace the
rest of the code.

Signed-off-by: John Nunley <[email protected]>
@SUPERCILEX
Copy link
Contributor

Long term, the method signature should probably be fn read<Fd: AsFd>(fd: Fd, buf: BorrowedCursor<'_>) -> Result<usize>. Note that I haven't thought about the lifetime, so might be wrong. Also, returning a usize is technically unnecessary as that information is encoded in the length of the cursor, but having to store a prev_len variable can be annoying. Then again, maybe consistency with the stdlib (fn read_buf(&mut self, buf: BorrowedCursor<'_>) -> Result<()>) is more important.

Sample implementation:
https://github.com/rust-lang/rust/blob/7cc36de72d9a5fd6881946c673ff47586214ad1e/library/std/src/sys/unix/fd.rs#L145-L159


I kind of think rustix should just make a straight copy of the Borrowed{Buf,Cursor} structs and then swap over to the stdlib implementation once they're stable.

Not sure how I feel about a different (though reasonable) route to the stdlib.

}
}

/// Types made up of plain-old-data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is supposed to be equivalent to requirements of the AnyBitPattern type in bytemuck, but avoids adding that as a dependency?

Bytemuck distinguishes that from its Pod trait which has slightly different requirements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A secondary goal for this trait is to act as a buffer for epoll and kqueue events, which can have uninitialized bytes (IIRC). Which is why I'm denoting it like this.

@ids1024
Copy link
Contributor

ids1024 commented Oct 29, 2023

Matching the unstable Read::read_buf does seem like the logical thing to do.That was also my first thought here. It's annoying that still isn't stable.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance, I... want to think about this more 😄 .

On one hand, the Buffer trait is a neat idea.

On the other hand, it feels like the API is getting more complex and reaching in the direction of end-user ergonomics, when functions like read are still low-level primitives.


#[inline]
fn as_buffer(&mut self) -> (*mut T, usize) {
(self.as_mut_ptr(), self.len())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could get away with using self.capacity() here rather than just self.len(), which would make this approach more attractive because one could pass in a Vec::with_capacity(n) and read into it without initializing it first, or take an existing Vec, clear() it, and then reuse the existing memory directly.

Another option would be to use .spare_capacity_mut().as_buffer(), and say the behavior here is to append elements to the Vec rather than rewriting the whole Vec. That seems nice for read_to_end kinds of use cases. On the other hand, I wonder if that's too subtle. Given a v: Vec<T>, passing in v would append, while &mut v would start from the beginning. And passing &v to write would still start from the beginning.

@sunfishcode
Copy link
Member

After thinking about this more, I think I'd like to see an API here where we add new functions that take uninitialized buffers, rather than modifying the existing functions, for two reasons.

I recently learned that replacing an argument of type T with a trait argument where T is one of multiple types that implement a trait is not completely semver-compatible. With plain T, users can do .into() at the callsite and it infers the destination type. With a trait, there is no specific destination that can be inferred, so users get a compile error. I don't know how likely it is that any user is doing .into() to convert into a &mut [u8], but it's theoretically possible.

And, std has not yet decided how it wants to deal with uninitialized buffers, but when it does, I assume we'll want to change rustix to use the same approach. That means we may need to deprecate something in the future, and if we do, it'll be easier to do that with separate functions.

So, what would you think about adding new functions, named something like read_buf or read_raw?

@SUPERCILEX
Copy link
Contributor

My main concern with accepting a raw MaybeUninit slice (and why I suggested copying the std Borrowed structs) is that every caller will need to use unsafe to assume_init the bytes that were written. That unsafe should be part of the library and not be the caller's responsibility.

I also agree that separate functions are a good idea.

@notgull
Copy link
Contributor Author

notgull commented Nov 18, 2023

Closing this so I can rethink this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reading into uninitialized buffers
4 participants