Box anonymous function #299

Merged
merged 2 commits into from
Jun 9, 2024

JohnnyMorganz
Collaborator

We are continuing to see stack overflows in the new parser. This is mainly because the AST types are quite large.

Using the command cargo +nightly rustc -- -Zprint-type-sizes, we can find the size of each AST structure.

The most expensive struct in terms of size is NumericFor:

print-type-size type: `ast::NumericFor`: 3936 bytes, alignment: 8 bytes
print-type-size     field `.for_token`: 136 bytes
print-type-size     field `.index_variable`: 136 bytes
print-type-size     field `.equal_token`: 136 bytes
print-type-size     field `.start_end_comma`: 136 bytes
print-type-size     field `.do_token`: 136 bytes
print-type-size     field `.block`: 320 bytes
print-type-size     field `.end_token`: 136 bytes
print-type-size     field `.end_step_comma`: 136 bytes
print-type-size     field `.start`: 888 bytes
print-type-size     field `.end`: 888 bytes
print-type-size     field `.step`: 888 bytes

The biggest sizes come from start, end and step, which are all Expression nodes. So Expression has a size of 888 bytes.

Looking at the size of Expression, we can see that most of it is taken up by the Function variant, at 888 bytes, whilst the next largest variant is TableConstructor at 304 bytes, a 584-byte difference:

print-type-size type: `ast::Expression`: 888 bytes, alignment: 8 bytes
print-type-size     variant `Function`: 888 bytes
print-type-size         field `.0`: 888 bytes
print-type-size     variant `TableConstructor`: 304 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 296 bytes, alignment: 8 bytes

If we Box the anonymous function variant, we will be able to shave off at least 584 bytes on every single Expression node, which can add up to a significant saving. We do that in this PR.
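
As a rough illustration of why boxing helps, the sketch below compares an enum that stores a large variant inline with one that stores it behind a Box. The types `BigFunction`, `Table`, `Inline` and `Boxed` are hypothetical stand-ins with made-up payloads, not the real AST types; only the relative sizes matter.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large anonymous function node (880 bytes of payload).
#[allow(dead_code)]
struct BigFunction {
    payload: [u64; 110],
}

// Hypothetical stand-in for the next largest variant (296 bytes of payload).
#[allow(dead_code)]
struct Table {
    payload: [u64; 37],
}

// The large variant is stored inline, so every value of the enum pays for it.
#[allow(dead_code)]
enum Inline {
    Function(BigFunction),
    Table(Table),
}

// The large variant lives on the heap; the enum only stores a pointer to it.
#[allow(dead_code)]
enum Boxed {
    Function(Box<BigFunction>),
    Table(Table),
}

fn main() {
    println!("inline: {} bytes", size_of::<Inline>());
    println!("boxed:  {} bytes", size_of::<Boxed>());

    // An enum is at least as large as its largest variant.
    assert!(size_of::<Inline>() >= size_of::<BigFunction>());
    // A Box of a sized type is a single pointer.
    assert_eq!(size_of::<Box<BigFunction>>(), size_of::<usize>());
    // After boxing, the next largest variant determines the size.
    assert!(size_of::<Boxed>() < size_of::<Inline>());
}
```

The trade-off is one heap allocation per anonymous function, in exchange for every other Expression value becoming smaller on the stack.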

After doing so, the size of Expression is now 296 bytes, taken up by the largest variant TableConstructor. NumericFor decreases from 3936 bytes to 2160 bytes, a 1776-byte decrease.

The complete type size lists are attached for reference:
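
The NumericFor totals can be checked by summing the field sizes from the report above. This is a small arithmetic sketch; the helper `numeric_for_size` is hypothetical, and it assumes the listed fields are the complete layout with no extra padding and that only the Expression fields change size.

```rust
// Field sizes in bytes, copied from the print-type-sizes report for NumericFor.
const TOKEN: usize = 136; // each of the seven token-like fields
const BLOCK: usize = 320; // the `.block` field

/// Sums the NumericFor fields for a given Expression size
/// (hypothetical helper, assumes no additional padding).
const fn numeric_for_size(expression: usize) -> usize {
    7 * TOKEN + BLOCK + 3 * expression
}

fn main() {
    let before = numeric_for_size(888); // Function variant stored inline
    let after = numeric_for_size(296); // Function variant boxed

    assert_eq!(before, 3936);
    assert_eq!(after, 2160);
    // Three Expression fields, each 592 bytes smaller.
    assert_eq!(before - after, 3 * (888 - 296));

    println!("before: {before}, after: {after}, saved: {}", before - after);
}
```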

type-sizes-before-change.txt
type-sizes-after-change.txt

Contributor

@chriscerie chriscerie left a comment

Fixes stack overflow in #297 in my testing.

@JohnnyMorganz
Collaborator Author

JohnnyMorganz commented Jun 9, 2024

With all features enabled, Expression is 2080 bytes large:

print-type-size type: `ast::Expression`: 2080 bytes, alignment: 8 bytes
print-type-size     variant `Function`: 2080 bytes
print-type-size         field `.0`: 2080 bytes
print-type-size     variant `TypeAssertion`: 888 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.type_assertion`: 872 bytes, alignment: 8 bytes
print-type-size         field `.expression`: 8 bytes
print-type-size     variant `IfExpression`: 464 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 456 bytes, alignment: 8 bytes
print-type-size     variant `TableConstructor`: 304 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 296 bytes, alignment: 8 bytes

This means NumericFor is 8384 bytes. Boxing the anonymous function variant brings Expression down to 880 bytes:

print-type-size type: `ast::Expression`: 880 bytes, alignment: 8 bytes
print-type-size     variant `TypeAssertion`: 880 bytes
print-type-size         field `.type_assertion`: 872 bytes
print-type-size         field `.expression`: 8 bytes
print-type-size     variant `IfExpression`: 464 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 456 bytes, alignment: 8 bytes
print-type-size     variant `TableConstructor`: 304 bytes
print-type-size         padding: 8 bytes
print-type-size         field `.0`: 296 bytes, alignment: 8 bytes

The largest type right now is TypeInfo. It is difficult to reduce TypeInfo without improvements to TokenReference sizes. For now, we will leave it alone, but it may be worth boxing TypeAssertion in the future.
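
One possible way to keep sizes from creeping back up, separate from what this PR does, is a compile-time size assertion. The sketch below is an assumption, not something in the codebase: `Expression` is a hypothetical stand-in enum and `MAX_EXPRESSION_SIZE` is a made-up budget.

```rust
use std::mem::size_of;

// Hypothetical stand-in for an AST enum; not the real Expression type.
#[allow(dead_code)]
enum Expression {
    Function(Box<[u64; 260]>), // large payload kept behind a pointer
    Number(u64),
}

// Hypothetical size budget in bytes, chosen for this stand-in only.
const MAX_EXPRESSION_SIZE: usize = 16;

// Evaluated at compile time: the build fails if the enum outgrows the budget.
const _: () = assert!(size_of::<Expression>() <= MAX_EXPRESSION_SIZE);

fn main() {
    println!("Expression is {} bytes", size_of::<Expression>());
}
```

With a check like this, adding a large inline variant would fail the build instead of surfacing later as a stack overflow.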

All sizes for reference:
type-sizes-all-features-before.txt
type-sizes-all-features-after.txt

@JohnnyMorganz JohnnyMorganz merged commit 093fe84 into main Jun 9, 2024
2 checks passed
@JohnnyMorganz JohnnyMorganz deleted the box-anonymous-function branch June 9, 2024 09:42