AVX backend #364
Reviewed 2 of 12 files at r1, 7 of 9 files at r2, all commit messages.
Reviewable status: 9 of 13 files reviewed, 8 unresolved discussions (waiting on @spapinistarkware)
src/core/utils.rs
line 16 at r2 (raw file):
} /// Performs a naive bit-reversal permutation.
Suggestion:
inplace.
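For context, a minimal sketch of what a naive in-place bit-reversal permutation can look like. This is an illustration written for this thread, not the code under review; the body is an assumption based only on the doc comments quoted here.

```rust
/// Naive in-place bit-reversal permutation (illustrative sketch).
///
/// Swaps `v[i]` with `v[rev(i)]`, where `rev` reverses the low
/// `log2(v.len())` bits of the index.
///
/// Panics if the length of the slice is not a power of two.
pub fn bit_reverse<T>(v: &mut [T]) {
    let n = v.len();
    assert!(n.is_power_of_two(), "length must be a power of two");
    let log_n = n.ilog2();
    if log_n == 0 {
        // A single element is already in place; returning early also
        // avoids shifting by the full bit width below.
        return;
    }
    for i in 0..n {
        let j = i.reverse_bits() >> (usize::BITS - log_n);
        // Swap each pair once; fixed points (i == j) are skipped.
        if i < j {
            v.swap(i, j);
        }
    }
}
```

Because the permutation is an involution, applying it twice restores the original order.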
src/core/utils.rs
line 21 at r2 (raw file):
///
/// Panics if the length of the slice is not a power of two.
// TODO(AlonH): Consider benchmarking this function.
Remove
Code quote:
// TODO(AlonH): Consider benchmarking this function.
src/core/backend/avx512/mod.rs
line 17 at r2 (raw file):
// BaseField.
type PackedBaseField = [BaseField; 16];
Extract to a constant in the base field avx file?
Code quote:
16
src/core/backend/avx512/mod.rs
line 27 at r2 (raw file):
fn bit_reverse_column(column: &mut Self::Column) {
    if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
// Fallback to cpu bit_reverse.
Code quote:
if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
src/core/backend/avx512/mod.rs
line 39 at r2 (raw file):
fn zeros(len: usize) -> Self {
    Self {
        data: vec![PackedBaseField::default(); (len + 15) / 16],
You can use math::usize_div_ceil.
Code quote:
(len + 15) / 16
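The quoted expression is a ceiling division by the lane count. A small sketch of the same computation with the magic numbers named; `packed_len` is a hypothetical helper introduced here for illustration, and it does not use the `math::usize_div_ceil` helper mentioned above, whose signature is not shown in this thread.

```rust
/// Number of base-field elements per packed vector (16 in the quoted code).
const K_ELEMENTS: usize = 16;

/// Number of packed vectors needed to hold `len` elements (hypothetical helper).
fn packed_len(len: usize) -> usize {
    // Same result as `(len + 15) / 16`, but cannot overflow in `len + 15`.
    len / K_ELEMENTS + usize::from(len % K_ELEMENTS != 0)
}
```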
src/core/backend/avx512/mod.rs
line 59 at r2 (raw file):
type Output = BaseField;
fn index(&self, index: usize) -> &Self::Output {
    &self.data[index / 8][index % 8]
Why is it 8 here and not 16?
Code quote:
8
src/core/backend/cpu/mod.rs
line 22 at r2 (raw file):
fn bit_reverse_column(column: &mut Self::Column) {
    bit_reverse(&mut column[..])
Doesn't it work?
Suggestion:
bit_reverse(column)
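The suggestion relies on deref coercion: a `&mut Vec<T>` is accepted where a `&mut [T]` is expected. A minimal sketch, assuming the CPU column type is a `Vec` (not confirmed in this thread); `reverse_in_place` is a stand-in for the real `bit_reverse`.

```rust
/// Stand-in for a helper that takes a mutable slice.
fn reverse_in_place(v: &mut [u32]) {
    v.reverse();
}

/// Calls the helper both ways on the same column.
fn reverse_both_ways(column: &mut Vec<u32>) {
    // Explicit reslice, as in the quoted code.
    reverse_in_place(&mut column[..]);
    // Deref coercion: `&mut Vec<u32>` is implicitly reborrowed as `&mut [u32]`.
    reverse_in_place(column);
}
```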
src/core/fields/m31.rs
line 16 at r2 (raw file):
#[repr(transparent)]
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash, Pod, Zeroable)]
What's this change?
Code quote:
Pod, Zeroable
Reviewable status: 9 of 13 files reviewed, 8 unresolved discussions (waiting on @shaharsamocha7)
src/core/utils.rs
line 16 at r2 (raw file):
} /// Performs a naive bit-reversal permutation.
Done.
src/core/utils.rs
line 21 at r2 (raw file):
Previously, shaharsamocha7 wrote…
Remove
Done.
src/core/backend/avx512/mod.rs
line 17 at r2 (raw file):
Previously, shaharsamocha7 wrote…
Extract to a constant in the base field avx file?
Done.
src/core/backend/avx512/mod.rs
line 27 at r2 (raw file):
Previously, shaharsamocha7 wrote…
// Fallback to cpu bit_reverse.
Done.
src/core/backend/avx512/mod.rs
line 39 at r2 (raw file):
Previously, shaharsamocha7 wrote…
You can use math::usize_div_ceil.
Done.
src/core/backend/avx512/mod.rs
line 59 at r2 (raw file):
Previously, shaharsamocha7 wrote…
why is it 8 here and not 16?
Bug that is fixed later in the stack. Fixed here now.
src/core/backend/cpu/mod.rs
line 22 at r2 (raw file):
Previously, shaharsamocha7 wrote…
Doesn't it work?
Done.
src/core/fields/m31.rs
line 16 at r2 (raw file):
Previously, shaharsamocha7 wrote…
What's this change?
This enables the cast_slice_mut() call above, in the CPU fallback. Pod and Zeroable come from the bytemuck library.
Reviewed 1 of 12 files at r1, 2 of 9 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)
src/core/backend/avx512/mod.rs
line 59 at r2 (raw file):
Previously, spapinistarkware (Shahar Papini) wrote…
Bug that is fixed later in the stack. Fixed here now.
Add a unit test
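A sketch of the kind of check such a unit test could make. The types below are simplified stand-ins (`u32` in place of `BaseField`, a hypothetical `from_values` constructor), so this shows the shape of the test rather than the test added in the PR.

```rust
use std::ops::Index;

/// Lanes per packed vector.
const K_ELEMENTS: usize = 16;
/// `u32` stands in for `BaseField` in this sketch.
type PackedBaseField = [u32; K_ELEMENTS];

struct PackedColumn {
    data: Vec<PackedBaseField>,
    length: usize,
}

impl PackedColumn {
    /// Hypothetical constructor: packs `values` lane by lane, zero-padding the tail.
    fn from_values(values: &[u32]) -> Self {
        let chunks = (values.len() + K_ELEMENTS - 1) / K_ELEMENTS;
        let mut data = vec![[0u32; K_ELEMENTS]; chunks];
        for (i, &v) in values.iter().enumerate() {
            data[i / K_ELEMENTS][i % K_ELEMENTS] = v;
        }
        Self { data, length: values.len() }
    }
}

impl Index<usize> for PackedColumn {
    type Output = u32;
    fn index(&self, index: usize) -> &u32 {
        assert!(index < self.length, "index out of bounds");
        // Both the divisor and the modulus must be the lane count (16, not 8).
        &self.data[index / K_ELEMENTS][index % K_ELEMENTS]
    }
}
```

Indexing every position of a column longer than one packed vector and comparing against the source values would fail at index 8 with the `/ 8` and `% 8` version.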
src/core/fields/m31.rs
line 16 at r2 (raw file):
Previously, spapinistarkware (Shahar Papini) wrote…
This enables the cast_slice_mut() call above, in the CPU fallback. Pod and Zeroable come from the bytemuck library.
Can we add the fallback in a separate PR, together with this library and the base field change?
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)
src/core/fields/m31.rs
line 16 at r2 (raw file):
Previously, shaharsamocha7 wrote…
Can we add the fallback in a separate PR, together with this library and the base field change?
Why do you want it in another PR? It's a few lines. Will it help you check it better?
Reviewable status: 12 of 13 files reviewed, 1 unresolved discussion (waiting on @spapinistarkware)
src/core/fields/m31.rs
line 16 at r2 (raw file):
Previously, spapinistarkware (Shahar Papini) wrote…
Why do you want it in another PR? It's a few lines. Will it help you check it better?
It won't help me check it.
IIUC, this is separate logic that maps how columns are stored between the CPU and AVX implementations.
I think it would be better for it to go in a separate PR. Don't you agree?
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)
src/core/backend/avx512/mod.rs
line 31 at r4 (raw file):
// Fallback to cpu bit_reverse.
if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
    let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
Can this go into a separate function that casts an AVX column to a CPU column?
Also, I think this function should have a test.
Code quote:
let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
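A sketch of what such a helper could look like. To stay dependency-free it swaps bytemuck's `cast_slice_mut` for a hand-written `unsafe` cast, uses `u32` in place of `BaseField`, and introduces the hypothetical name `as_flat_mut`; truncating to `length` is a choice made for this sketch.

```rust
/// Lanes per packed vector.
const K_ELEMENTS: usize = 16;
/// `u32` stands in for `BaseField` in this sketch.
type PackedBaseField = [u32; K_ELEMENTS];

struct PackedColumn {
    data: Vec<PackedBaseField>,
    length: usize,
}

impl PackedColumn {
    /// Views the packed storage as a flat slice of the first `length` elements.
    fn as_flat_mut(&mut self) -> &mut [u32] {
        let total = self.data.len() * K_ELEMENTS;
        let ptr = self.data.as_mut_ptr().cast::<u32>();
        // SAFETY: `[u32; 16]` has the alignment of `u32` and no padding, so the
        // buffer holds `total` contiguous, initialized `u32` values. The
        // exclusive borrow of `self` keeps the returned view unique.
        let flat = unsafe { std::slice::from_raw_parts_mut(ptr, total) };
        &mut flat[..self.length]
    }
}
```

A test can then write through the flat view and read the value back through the packed layout.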
src/core/backend/avx512/mod.rs
line 32 at r4 (raw file):
if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
    let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
    utils::bit_reverse(&mut data[..column.length]);
Suggestion:
data
Reviewed 1 of 3 files at r5.
Reviewable status: 11 of 13 files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)
src/core/backend/avx512/mod.rs
line 31 at r4 (raw file):
Previously, shaharsamocha7 wrote…
Can this go into a separate function that casts an AVX column to a CPU column?
Also, I think this function should have a test.
Done.
src/core/backend/avx512/mod.rs
line 32 at r4 (raw file):
if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
    let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
    utils::bit_reverse(&mut data[..column.length]);
Done.
Reviewed 2 of 3 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)
src/core/backend/avx512/mod.rs
line 6 at r5 (raw file):
use bytemuck::cast_slice;
use bytemuck::checked::cast_slice_mut;
Why is cast_slice not from checked?
Code quote:
use bytemuck::cast_slice;
use bytemuck::checked::cast_slice_mut;
src/core/backend/avx512/mod.rs
line 54 at r5 (raw file):
fn zeros(len: usize) -> Self {
    Self {
        data: vec![PackedBaseField::default(); len.div_ceil(K_ELEMENTS)],
cargo test fails for me with:
error[E0658]: use of unstable library feature 'int_roundings'
How does this pass CI?
div_ceil
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)
src/core/backend/avx512/mod.rs
line 6 at r5 (raw file):
Previously, shaharsamocha7 wrote…
Why is cast_slice not from checked?
Because it adds runtime checks :)
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)
src/core/backend/avx512/mod.rs
line 54 at r5 (raw file):
Previously, shaharsamocha7 wrote…
cargo test fails for me with:
error[E0658]: use of unstable library feature 'int_roundings'
How does this pass CI?
Works for me.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @spapinistarkware)
src/core/backend/avx512/mod.rs
line 6 at r5 (raw file):
Previously, spapinistarkware (Shahar Papini) wrote…
Because it adds runtime checks :)
So why is cast_slice_mut checked?
Reviewed 4 of 4 files at r6, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @spapinistarkware)
Reviewable status: 12 of 13 files reviewed, all discussions resolved (waiting on @shaharsamocha7)
src/core/backend/avx512/mod.rs
line 6 at r5 (raw file):
Previously, shaharsamocha7 wrote…
So why is cast_slice_mut checked?
Oops. Thanks.
Reviewed 2 of 12 files at r1, 3 of 9 files at r2, 2 of 3 files at r3, 2 of 3 files at r5, 3 of 4 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @spapinistarkware)