Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow for deep expressions in bincode serialization/deserialization #875

Closed
maximebuyse opened this issue Sep 3, 2024 · 8 comments · Fixed by #877 or #1029
Closed

Stack overflow for deep expressions in bincode serialization/deserialization #875

maximebuyse opened this issue Sep 3, 2024 · 8 comments · Fixed by #877 or #1029
Assignees
Labels
bug Something isn't working frontend Issue in the Rust to JSON translation

Comments

@maximebuyse
Copy link
Contributor

With rust code containing deep expressions, hax fails with a stack overflow. For example:

fn main() -> i32 {
        1 + 1 + 1 + 1 + 1 + 1 + 1 + 
        1 + 1 + 1 + 1 + 1 + 1 + 1 + 
        1 + 1 + 1 + 1 + 1 + 1 + 1 + 
        1 + 1 + 1 + 1 + 1 + 1 + 1
}

The stack trace suggests that this has to do with serialization/deserialization with bincode that was introduced in #743. However this problem cannot be reproduced with the merge commit of #743 itself (but it loops for a long time). The version of bincode hasn't changed since then so there might be an interaction with another later change that triggers the stack overflow.

@franziskuskiefer
Copy link
Member

cargo hax json works fine for me. So the frontend seems fine. Only when extracting F* it fails for me (actually it just doesn't finish before I kill it).

@maximebuyse
Copy link
Contributor Author

@franziskuskiefer You were right, the problem is in the engine. It happens in phase specialize where in some cases recursive calls are done twice over the same thing resulting in exponential complexity.
I don't know why this started making Rust stack overflows instead of just taking a long time.

@maximebuyse
Copy link
Contributor Author

It turns out this fix doesn't improve the original problem. The example of deep expressions was only an attempt to minimize that had a different problem.

My current understanding is that when we have a rather big (or deep) module hierarchy, the items in HaxMeta contain a lot of data, and the data for a submodule seems to be duplicated in every parent module. There is no problem for encoding this with bincode, but decoding uses recursion and gives a stack overflow.

I tested a few serialization libraries and most don't support recursive types. Among those that do it seemed that bincode was one of the most efficient (I found none that seemed to behave better with deep recursive data).

@W95Psp
Copy link
Collaborator

W95Psp commented Sep 23, 2024

I think we should bisect here, to understand exactly what went wrong and when

@maximebuyse
Copy link
Contributor Author

The result of bisecting is that the first "bad" commit is 748c488

@maximebuyse
Copy link
Contributor Author

The result of bisecting is that the first "bad" commit is 748c488

Removing is_local from the DefId struct (on the latest main) does fix the problem. It would be really surprising that just this boolean field would make bincode crash. We tried to compare what else is different a json version of the HaxMeta object (that we save with bincode). There are some differences but it is hard to analyze as the file is very big. Minimizing the example is also very hard.

@W95Psp W95Psp added bug Something isn't working frontend Issue in the Rust to JSON translation labels Oct 7, 2024
@W95Psp W95Psp mentioned this issue Oct 7, 2024
4 tasks
@Nadrieril
Copy link
Collaborator

Nadrieril commented Oct 16, 2024

Is the recursive call on the side of hax or inside bincode? If it's in hax you can use stacker to dynamically allocate more stack at each recursive call. This is what charon and rustc use:

// This is the amount of bytes that need to be left on the stack before increasing the size.
// It must be at least as large as the stack required by any code that does not call
// `ensure_sufficient_stack`.
const RED_ZONE: usize = 100 * 1024; // 100k

// Only the first stack that is pushed, grows exponentially (2^n * STACK_PER_RECURSION) from then
// on. This flag has performance relevant characteristics. Don't set it too high.
const STACK_PER_RECURSION: usize = 1024 * 1024; // 1MB

/// Grows the stack on demand to prevent stack overflow. Call this in strategic locations
/// to "break up" recursive calls. E.g. almost any call to `visit_expr` or equivalent can benefit
/// from this.
///
/// Should not be sprinkled around carelessly, as it causes a little bit of overhead.
#[inline]
pub fn ensure_sufficient_stack<R>(f: impl FnOnce() -> R) -> R {
    stacker::maybe_grow(RED_ZONE, STACK_PER_RECURSION, f)
}

If it's in bincode then you should report it upstream, that sounds like something they should be handling gracefully (and they can use stacker too).

@W95Psp
Copy link
Collaborator

W95Psp commented Oct 16, 2024

Yeah, we wanted to use stacker as well, thanks for the tip! :)
But there are a number of odd things around this bug, we need to understand exactly what's going on there!
And I'd like to address #1001 in the same time :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working frontend Issue in the Rust to JSON translation
Projects
No open projects
Status: Done
4 participants