Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tutorial: how to add paddings #48

Open
Pzixel opened this issue Dec 10, 2018 · 6 comments
Open

Tutorial: how to add paddings #48

Pzixel opened this issue Dec 10, 2018 · 6 comments

Comments

@Pzixel
Copy link

Pzixel commented Dec 10, 2018

Hey! It's me again.

I'm continuing implementing CLR runtime in rust, and I wonder if there is some ideomatic way to express paddings between fields. Let's take an example:

image

I'm writing following code:

#[repr(C)]
#[derive(Debug)]
struct MetadataRoot<'a> {
    pub signature: u32,
    pub major_version: u16,
    pub minor_version: u16,
    _reserved: u32,
    pub length: u32,
    pub version: &'a str,
    pub flags: u16,
    pub streams: u16,
    pub stream_headers: Vec<StreamHeader<'a>>
}

#[repr(C)]
#[derive(Debug)]
struct StreamHeader<'a> {
    pub offset: u32,
    pub size: u32,
    pub name: &'a str
}

impl<'a> TryFromCtx<'a, Endian> for MetadataRoot<'a> {
    type Error = scroll::Error;
    type Size = usize;
    // and the lifetime annotation on `&'a [u8]` here
    fn try_from_ctx(src: &'a [u8], endian: Endian) -> Result<(Self, Self::Size), Self::Error> {
        let offset = &mut 0;
        let signature = src.gread_with(offset, endian)?;
        let major_version = src.gread_with(offset, endian)?;
        let minor_version = src.gread_with(offset, endian)?;
        let reserved = src.gread_with(offset, endian)?;
        let length = src.gread_with(offset, endian)?;
        let version = src.gread(offset)?;
        let padding = 4 - *offset % 4;
        if padding < 4 {
            *offset += padding;
        }
        let flags = src.gread_with(offset, endian)?;
        let streams: u16 = src.gread_with(offset, endian)?;
        let mut stream_headers = Vec::with_capacity(streams as usize);
        for _ in 0..streams {
            stream_headers.push( src.gread(offset)?);
            let padding = 4 - *offset % 4;
            if padding < 4 {
                *offset += padding;
            }
        }
        Ok((
            Self {
                signature,
                major_version,
                minor_version,
                _reserved: reserved,
                length,
                version,
                flags,
                streams,
                stream_headers
            },
            *offset,
        ))
    }
}

impl<'a> TryFromCtx<'a, Endian> for StreamHeader<'a> {
    type Error = scroll::Error;
    type Size = usize;
    // and the lifetime annotation on `&'a [u8]` here
    fn try_from_ctx(src: &'a [u8], endian: Endian) -> Result<(Self, Self::Size), Self::Error> {
        let offset = &mut 0;
        let offset_field = src.gread_with(offset, endian)?;
        let size = src.gread_with(offset, endian)?;
        let name = src.gread(offset)?;
        Ok((
            Self {
                offset: offset_field,
                size,
                name
            },
            *offset,
        ))
    }
}

I'd like to ask if there is more ideomatic way to express such a structure, and if there isn't, maybe it may give you some thoughts about implementing it? It may be really useful.

P.S. Great lib, thank you for your work :)

@m4b
Copy link
Owner

m4b commented Dec 11, 2018

Hmm, why doesn’t this work ?

I’m also slightly confused by their spec; they say:

call x the amount allocated to hold the string.
call m the length of the string.
And then if I understood correctly, x is m rounded up to a multiple of 4.

But is the value in the struct field “Length” x or m?

Anyway, your code looks good; for padding I don’t think there’s a robust approach. Do be aware that scroll basically reads and writes packed structs (which I think you’re aware) though.

For your particular case I’d say you’re doing it just right except probably some off by one issue or other similar things when dealing with bits at that level, unless I’ve misunderstood your issue.

Only other thing, and not sure precisely how it would work with your second struct with a str, but if you don’t want to allocate and are ok with unsafe grabbing a zero copy slice of your stream headers should be possible with std::slice::from_raw_parts though be aware the backing StreamHeader would have to be precisely the in memory layout (and not the nicer api you have now with a string slice), but it would be zero copy and almost certainly much faster depending on how many headers there are :)

@m4b
Copy link
Owner

m4b commented Dec 11, 2018

Alternatively, you could play around with doing something like:

struct PaddedString(&’a str);

And then implement all the padding logic as a TryFromCtx impl for it, then you could gread it out, which would properly update the offset to just past the padding boundary.

If you had a similar wrapper for other non standard types, and as long as you update the offset appropriately in each, you should be then able to just derive the base Pread impl.

Just exploring though, maybe not a great idea, some other issue later on with lifetimes or whatever comes up, etc :)

@Pzixel
Copy link
Author

Pzixel commented Dec 11, 2018

No-no-no, I edited this issue several times. This code works seamless, I just wonder if there is some more "derivish" way to specify offset via attribute or something. I noticed that attribute works pretty well unless there are &str or paddings.

Only other thing, and not sure precisely how it would work with your second struct with a str, but if you don’t want to allocate and are ok with unsafe grabbing a zero copy slice of your stream headers should be possible with std::slice::from_raw_parts though be aware the backing StreamHeader would have to be precisely the in memory layout (and not the nicer api you have now with a string slice), but it would be zero copy and almost certainly much faster depending on how many headers there are :)

I'm almost sure that &str is already zero-copy since it should point somewhere, isn't it? And you can't refer locals so it's should be some slice from src.

Alternatively, you could play around with doing something like:

Hmm, interesting idea. However, in this case either I should write wrappers to extract underlying &str or make client work with newtype, just because serializer isn't that smart.

I just think that cases like StreamHeader should be derived automatically. It's more complicated in case of MetadataRoot since you have to populate a vec, but it may be automated too, I guess.

@m4b
Copy link
Owner

m4b commented Dec 11, 2018

Yes &str is zero copy, but I was referring to your Vec of stream headers.

For the padded string yes would force client to use newtype or you can just extract the contents and place it in your struct in your TryFrom impl. But probably only worthwhile if you have multiple padding cases for strs.

Dunno :)

@Pzixel
Copy link
Author

Pzixel commented Dec 11, 2018

Ideally, it should be something like:

#[repr(C)]
#[derive(Debug, Pread)]
struct MetadataRoot<'a> {
    pub signature: u32,
    pub major_version: u16,
    pub minor_version: u16,
    _reserved: u32,
    pub length: u32,
    #[padding(4)]
    pub version: &'a str,
    pub flags: u16,
    pub streams: u16,
    #[padding_each(4)]
    pub stream_headers: Vec<StreamHeader<'a>>
}

#[repr(C)]
#[derive(Debug, Pread)]
struct StreamHeader<'a> {
    pub offset: u32,
    pub size: u32,
    pub name: &'a str
}

Yes &str is zero copy, but I was referring to your Vec of stream headers.

Ah, ok. I don't seem much troubles in copying 2 x u32+reference :)

The main reason for Vec because there is an empty padding between elements since slice expects elements to be tied one by one.

@m4b
Copy link
Owner

m4b commented Dec 11, 2018

Ah yea, that’s very similar to #33. There is definitely some desire, and I’d be interested in a backwards compatible proposal, but I don’t think anyone’s found the time to play around with what it would look like, both at implementation level and api level.

I will note it’s starting to turn scroll into a more complicated deserialization and serialization library, which was never it’s explicit goal; on the other hand it’s probably very reasonable to have padding like attributes or other such details for advanced use cases like yours ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants