Design: boxed preads #51

Open
m4b opened this issue Apr 6, 2019 · 6 comments

@m4b
Owner

m4b commented Apr 6, 2019

In anticipation of #47, the final remaining piece for working with arbitrary data safely is to figure out how to pread large objects without blowing the stack.
Most systems languages accomplish this by boxing the object and manipulating it through the heap pointer.
I'm wondering if we can accomplish this in a semi-automated way, without compromising scroll's interesting type-based reading/writing.

Doing this properly would allow us to round trip a large stack object within scroll.

Specifically, could we do something like:

#[derive(Pread, Pwrite, Copy, Clone)]
struct Big {
    arr: [u8; 100000000],
}
let mut bytes = vec![42; 100000001];
let big = bytes.pread_with::<Box<Big>>(0, LE)?; // this is boxed
// by the scroll Box impl, using unsafe uninitialized memory
// which is only returned if the pread succeeds, so it remains safe?
let _ = bytes.pwrite_with(big, 0, LE)?; // Note: there is no `&*`,
// ideally scroll can deal with this case, though I don't know if it's possible?

In the worst case, we should be able to expose a new pread_with_box method that does the above but pushes the box type information into the method name. This is fine, but I think it's a more beautiful API design to let the user choose which type to read out via turbofish.

cc @philipc @lzutao @willglynn @luser
What do you all think?

@philipc
Contributor

philipc commented Apr 7, 2019

Do you have a real world example that would use this?

@m4b
Owner Author

m4b commented Apr 7, 2019

Yes, similar to the example above, I was dealing with some C structs which had some header information, etc., then a large byte array containing image bytes, then some more stuff.
I wanted to pread the struct, match on some header values, validate, then do something with the bytes, then write them out to disk with slightly different header values.
Scroll would have been an extremely fast way to prototype and work with the data, but because I couldn't box the pread data, it exploded my stack. Testing with the pwrite version that takes references, I was able to write the data, which was one half of the problem.
Hope that's clear?
I think this will generally come up in FFI with large C structs containing byte arrays, where the API expects a malloc(sizeof(LargeStruct)) to construct the initial value.

@philipc
Contributor

philipc commented Apr 8, 2019

I was hoping for an actual real world example. It seems to me that large fixed size arrays are quite uncommon, and even then you can usually use Vec<u8> anyway. So the use case is:

  • you have a struct containing a large fixed size array
  • you need to pass this struct to C with FFI so you can't use Vec?
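For comparison, the Vec-based layout suggested above might look like the following minimal sketch. It uses only the standard library rather than scroll, and every name in it (`Record`, `magic`, `checksum`, `DATA_LEN`, `read_record`, `write_record`) is hypothetical and chosen for illustration. The payload lives on the heap, so the struct itself stays small no matter how large the payload is, and writing takes the record by reference.

```rust
// Hypothetical payload size, chosen only for illustration.
const DATA_LEN: usize = 1 << 20;

// Hypothetical record: small fixed fields around a heap-allocated payload.
#[derive(Debug)]
struct Record {
    magic: u32,
    data: Vec<u8>,
    checksum: u32,
}

// Read a little-endian u32 at `offset`, or None if `src` is too short.
fn read_u32_le(src: &[u8], offset: usize) -> Option<u32> {
    let bytes = src.get(offset..offset + 4)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

// Parse a record from the start of `src`, returning it with the bytes consumed.
// The payload is copied straight from the slice into a Vec, so no large
// temporary is ever placed on the stack.
fn read_record(src: &[u8]) -> Option<(Record, usize)> {
    let magic = read_u32_le(src, 0)?;
    let data = src.get(4..4 + DATA_LEN)?.to_vec();
    let checksum = read_u32_le(src, 4 + DATA_LEN)?;
    Some((Record { magic, data, checksum }, 4 + DATA_LEN + 4))
}

// Write a record by reference into `dst`, returning the bytes written,
// or None if `dst` is too short.
fn write_record(rec: &Record, dst: &mut [u8]) -> Option<usize> {
    let total = 4 + rec.data.len() + 4;
    let out = dst.get_mut(..total)?;
    out[..4].copy_from_slice(&rec.magic.to_le_bytes());
    out[4..4 + rec.data.len()].copy_from_slice(&rec.data);
    out[4 + rec.data.len()..].copy_from_slice(&rec.checksum.to_le_bytes());
    Some(total)
}
```

The trade-off is that `Record` no longer has the same memory layout as a C struct with an inline array, so it cannot be handed across FFI as-is.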

@m4b
Owner Author

m4b commented Apr 8, 2019

It was the other way around, basically:

struct Foo {
  struct Header header;
  uint8_t data[0x8000];
  struct Footer footer;
};

In rust land I then wanted to:

let x = bytes.pread::<Foo>(0)?; // this fails because we allocate x on the stack.
// check x.header, do stuff, manipulate x.data, pwrite

@m4b
Owner Author

m4b commented Apr 8, 2019

So I had something like this in mind 😈 , but it still overflows the stack (it also allocates twice which sucks):

impl<'a, Ctx, T> TryFromCtx<'a, Ctx> for Box<T>
where
    T: TryFromCtx<'a, Ctx>,
    Ctx: Copy,
    error::Error: From<<T as TryFromCtx<'a, Ctx>>::Error>
{
    type Error = error::Error;
    #[inline]
    fn try_from_ctx(src: &'a [u8], ctx: Ctx) -> result::Result<(Self, usize), Self::Error> {
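        // T is still constructed by value on the stack before Box::new moves it to the heap.
        let res = Box::new(TryFromCtx::try_from_ctx(src, ctx)?);
        // Second allocation: re-box only the value, keeping the size separately.
        Ok((Box::new(res.0), res.1))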
    }
}

#[test]
fn pwrite_big_struct() {
    use scroll::{LE, Pwrite};
    #[derive(Pread, Pwrite, Clone)]
    struct Big {
        arr: [u8; 10000000],
    }
    let mut bytes = vec![0; 10000001];
    let big = bytes.pread_with::<Box<Big>>(0, LE).unwrap();
    let _ = bytes.pwrite_with(big.as_ref(), 0, LE).unwrap();
    assert!(false);
}

I'm wondering if it's even possible to write this with scroll's TryFromCtx as is without blowing up the stack?

@philipc
Contributor

philipc commented Apr 8, 2019

If Big::try_from_ctx(src, ctx) overflows the stack, then wrapping it in Box::new won't change anything. You need placement new or an equivalent (pass in a preallocated pointer for the destination of the read, instead of returning a value), which is incompatible with returning by value.
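The "preallocated destination" idea can be sketched roughly as follows. This is not scroll's API: the `ReadInto` trait, `zeroed_box_big`, and `pread_boxed` are hypothetical names invented for this illustration, and the sketch assumes a type made only of `u8`s, for which the all-zero bit pattern is a valid value. The destination is allocated directly on the heap, and the read fills it through `&mut self`, so a `Big` never exists by value on the stack.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};

const N: usize = 10_000_000;

struct Big {
    arr: [u8; N],
}

// Allocate a zero-filled Big directly on the heap (hypothetical helper).
fn zeroed_box_big() -> Box<Big> {
    let layout = Layout::new::<Big>();
    // SAFETY: the layout has non-zero size, the pointer is checked for null,
    // the memory comes from the global allocator with Big's exact layout,
    // and all-zero bytes are a valid Big because it only contains u8s.
    unsafe {
        let ptr = alloc_zeroed(layout) as *mut Big;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        Box::from_raw(ptr)
    }
}

// Hypothetical out-parameter style trait: fill an existing value in place
// and return the number of bytes consumed, instead of returning Self.
trait ReadInto {
    fn read_into(&mut self, src: &[u8]) -> Option<usize>;
}

impl ReadInto for Big {
    fn read_into(&mut self, src: &[u8]) -> Option<usize> {
        let bytes = src.get(..N)?;
        // Copies straight from the source slice into the heap allocation.
        self.arr.copy_from_slice(bytes);
        Some(N)
    }
}

// Allocate first, then read into the allocation; only the Box pointer
// and the byte count are returned by value.
fn pread_boxed(src: &[u8]) -> Option<(Box<Big>, usize)> {
    let mut big = zeroed_box_big();
    let n = big.read_into(src)?;
    Some((big, n))
}
```

The cost of this shape is the one noted above: it needs a different trait signature from a by-value `try_from_ctx`, and a generic version would also need some way to obtain a valid initial value for arbitrary `T`.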

2 participants