Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement DecInt without itoa #1245

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Kijewski
Copy link
Contributor

@Kijewski Kijewski commented Dec 19, 2024

Using itoa is unlikely to have any actual benefits for any reasonable usecase. You are unlikely to format lots of integers in a tight loop without performing IO operations in between, so any speed-benefits of itoa are likely moot. The buffer itoa writes to is not re-used by DecInt, which makes the generated code use a memcpy call. itoa's buffer is twice the size as we need, because it is implemented to be able to hold an i128/u128. The i128/u128 implementation of DecInt would panic if the input was outside of -10**19 + 1 ..= 10**20 - 1.

This PR implements the integer stringification naively using a div-mod-10 loop. Users will be able to use integers as path arguments without opting-in to any feature.

This is a breaking change because the feature itoa was removed, and because the implementation for i128/u128 was removed.

Resolves #1202.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me. @SUPERCILEX do you have any concerns about replacing the itoa crate?

If rustix will have its own integer formatting code, could you also add tests covering the min, max and zero cases, and some other assorted positive and negative values, of the supported types, and test that the NUL termination is correct?

src/path/dec_int.rs Outdated Show resolved Hide resolved
@SUPERCILEX
Copy link
Contributor

I don't have strong objections, but I disagree with the premise. There's no memcpy in the assembly: https://rust.godbolt.org/z/qj3zoE7b6 (though there are mov instructions and I haven't looked at it deep enough to see if that's an inlined memcpy).

Can we see a comparison of the assembly please? If it's smaller then code size is a good argument to merge this PR. Otherwise maintaining our own lesser version of itoa seems like a bad idea.

@Kijewski
Copy link
Contributor Author

Kijewski commented Dec 21, 2024

(though there are mov instructions and I haven't looked at it deep enough to see if that's an inlined memcpy).

In the code you linked in line 79 of the assembly code memcpy is called.

Will address the other questions this afternoon/evening (CET).

@SUPERCILEX
Copy link
Contributor

My bad, turns out if you Ctrl+f on godbolt (mobile website) it's not actually rendering all lines so it didn't find anything.

Ok I take back my objection then, I think as long as the code size isn't bad this is an improvement. I'm definitely sad that we have to maintain our own itoa though.

@Kijewski
Copy link
Contributor Author

That's the generated code: https://godbolt.org/z/jdPKs4h7h.

Copy link
Contributor

@SUPERCILEX SUPERCILEX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments, but this looks great!

buf,
len: str_buf.len(),
// SAFETY: cannot be less than 2 (initialized with 1 + at least one loop)
len: unsafe { NonZeroU8::new_unchecked(len as u8) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for the unsafe, the compiler optimizes out len: NonZeroU8::new(len as u8).unwrap(),.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's nice.

@@ -96,7 +218,12 @@ impl DecInt {
/// Return the raw byte buffer including the NUL byte.
#[inline]
pub fn as_bytes_with_nul(&self) -> &[u8] {
let init = &self.buf[..=self.len];
let len = self.len.get() as usize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use usize::from(self.len.get()).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Using `itoa` is unlikely to have any actual benefits for any reasonable
usecase. You are unlikely to format lots of integers in a tight loop
without performing IO operations in between, so any speed-benefits of
`itoa` are likely moot. The buffer `itoa` writes to is not re-used by
`DecInt`, which makes the generated code use a `memcpy` call. `itoa`'s
buffer is twice the size as we need, because it is implemented to be
able to hold an `i128`/`u128`. The `i128`/`u128` implementation of
`DecInt` would panic if the input was outside of `-10**19 + 1 ..=
10**20 - 1`.

This PR implements the integer stringification naively using a
div-mod-10 loop. Users will be able to use integers as path arguments
without opting-in to any feature.

This is a breaking change because the feature `itoa` was removed, and
because the implementation for `i128`/`u128` was removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove itoa from the public API
3 participants