Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVX backend #364

Merged
merged 1 commit into from
Feb 26, 2024
Merged

AVX backend #364

merged 1 commit into from
Feb 26, 2024

Conversation

spapinistarkware
Copy link
Contributor

@spapinistarkware spapinistarkware commented Feb 20, 2024

This change is Reviewable

@spapinistarkware
Copy link
Contributor Author

spapinistarkware commented Feb 20, 2024

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_bit_reverse branch from 16a6c91 to 07995bf Compare February 22, 2024 06:32
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from bfe3288 to f78e2db Compare February 22, 2024 06:32
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_bit_reverse branch 2 times, most recently from 01261a4 to 4147e06 Compare February 22, 2024 12:31
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from f78e2db to 47a0728 Compare February 22, 2024 12:32
This was referenced Feb 22, 2024
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_bit_reverse branch from 4147e06 to 14d65ee Compare February 22, 2024 12:43
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from 47a0728 to cd7297a Compare February 22, 2024 12:43
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch 2 times, most recently from 895d064 to 2098156 Compare February 26, 2024 11:44
@spapinistarkware spapinistarkware changed the base branch from spapini/02-16-AVX_bit_reverse to dev February 26, 2024 11:54
@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from 2098156 to d026f8d Compare February 26, 2024 11:54
Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 12 files at r1, 7 of 9 files at r2, all commit messages.
Reviewable status: 9 of 13 files reviewed, 8 unresolved discussions (waiting on @spapinistarkware)


src/core/utils.rs line 16 at r2 (raw file):

}

/// Performs a naive bit-reversal permutation.

Suggestion:

 inplace.

src/core/utils.rs line 21 at r2 (raw file):

///
/// Panics if the length of the slice is not a power of two.
// TODO(AlonH): Consider benchmarking this function.

Remove

Code quote:

// TODO(AlonH): Consider benchmarking this function.

src/core/backend/avx512/mod.rs line 17 at r2 (raw file):

// BaseField.
type PackedBaseField = [BaseField; 16];

Extract to a constant in the base field avx file?

Code quote:

16

src/core/backend/avx512/mod.rs line 27 at r2 (raw file):

    fn bit_reverse_column(column: &mut Self::Column) {
        if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {

// Fallback to cpu bit_reverse.

Code quote:

if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {

src/core/backend/avx512/mod.rs line 39 at r2 (raw file):

    fn zeros(len: usize) -> Self {
        Self {
            data: vec![PackedBaseField::default(); (len + 15) / 16],

you can use
math::usize_div_ceil

Code quote:

(len + 15) / 16

src/core/backend/avx512/mod.rs line 59 at r2 (raw file):

    type Output = BaseField;
    fn index(&self, index: usize) -> &Self::Output {
        &self.data[index / 8][index % 8]

why is it 8 here and not 16?

Code quote:

8

src/core/backend/cpu/mod.rs line 22 at r2 (raw file):

    fn bit_reverse_column(column: &mut Self::Column) {
        bit_reverse(&mut column[..])

Doesn't it work?

Suggestion:

bit_reverse(column)

src/core/fields/m31.rs line 16 at r2 (raw file):

#[repr(transparent)]
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, PartialOrd, Ord, Hash, Pod, Zeroable)]

What's this change?

Code quote:

Pod, Zeroable

@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from d026f8d to 03e1b4a Compare February 26, 2024 13:07
Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 9 of 13 files reviewed, 8 unresolved discussions (waiting on @shaharsamocha7)


src/core/utils.rs line 16 at r2 (raw file):

}

/// Performs a naive bit-reversal permutation.

Done.


src/core/utils.rs line 21 at r2 (raw file):

Previously, shaharsamocha7 wrote…

Remove

Done.


src/core/backend/avx512/mod.rs line 17 at r2 (raw file):

Previously, shaharsamocha7 wrote…

Extract to a constant in the base field avx file?

Done.


src/core/backend/avx512/mod.rs line 27 at r2 (raw file):

Previously, shaharsamocha7 wrote…

// Fallback to cpu bit_reverse.

Done.


src/core/backend/avx512/mod.rs line 39 at r2 (raw file):

Previously, shaharsamocha7 wrote…

you can use
math::usize_div_ceil

Done.


src/core/backend/avx512/mod.rs line 59 at r2 (raw file):

Previously, shaharsamocha7 wrote…

why is it 8 here and not 16?

Bug that is fixed later in the stack. Fixed here now.


src/core/backend/cpu/mod.rs line 22 at r2 (raw file):

Previously, shaharsamocha7 wrote…

Doesn't it work?

Done.


src/core/fields/m31.rs line 16 at r2 (raw file):

Previously, shaharsamocha7 wrote…

What's this change?

This enables the cast_slice_mut() call above, in the cpu fallback. It's from the bytemuck library

Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 12 files at r1, 2 of 9 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)


src/core/backend/avx512/mod.rs line 59 at r2 (raw file):

Previously, spapinistarkware (Shahar Papini) wrote…

Bug that is fixed later in the stack. Fixed here now.

Add a unit test


src/core/fields/m31.rs line 16 at r2 (raw file):

Previously, spapinistarkware (Shahar Papini) wrote…

This enables the cast_slice_mut() call above, in the cpu fallback. It's from the bytemuck library

Can we add the fallback thing in another pr?
Together with this lib and the base field change?

Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)


src/core/fields/m31.rs line 16 at r2 (raw file):

Previously, shaharsamocha7 wrote…

Can we add the fallback thing in another pr?
Together with this lib and the base field change?

Why do you want it in another PR? It's a few lines. Will it help you check it better?

@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch from 03e1b4a to 9404723 Compare February 26, 2024 13:42
Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 13 files reviewed, 1 unresolved discussion (waiting on @spapinistarkware)


src/core/fields/m31.rs line 16 at r2 (raw file):

Previously, spapinistarkware (Shahar Papini) wrote…

Why do you want it in another PR? It's a few lines. Will it help you check it better?

It won't help me to check it.

IIUC, this is a different logic that maps the way we store columns between CPU, AVX impls.
I think it is better that it will enter in a separate pr. You don't agree?

Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)


src/core/backend/avx512/mod.rs line 31 at r4 (raw file):

        // Fallback to cpu bit_reverse.
        if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
            let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);

Can it go to a separate function that cast AVX column to CPU column?
Also, I think this function should have a test.

Code quote:

let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);

src/core/backend/avx512/mod.rs line 32 at r4 (raw file):

        if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
            let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
            utils::bit_reverse(&mut data[..column.length]);

Suggestion:

data

@spapinistarkware spapinistarkware force-pushed the spapini/02-16-AVX_backend branch 2 times, most recently from 5f28446 to 1aa5592 Compare February 26, 2024 15:10
Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r5.
Reviewable status: 11 of 13 files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)


src/core/backend/avx512/mod.rs line 31 at r4 (raw file):

Previously, shaharsamocha7 wrote…

Can it go to a separate function that cast AVX column to CPU column?
Also, I think this function should have a test.

Done.


src/core/backend/avx512/mod.rs line 32 at r4 (raw file):

        if column.data.len().ilog2() < bit_reverse::MIN_LOG_SIZE {
            let data: &mut [BaseField] = cast_slice_mut(&mut column.data[..]);
            utils::bit_reverse(&mut data[..column.length]);

Done.

Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @spapinistarkware)


src/core/backend/avx512/mod.rs line 6 at r5 (raw file):

use bytemuck::cast_slice;
use bytemuck::checked::cast_slice_mut;

Why cast_slice is not from checked?

Code quote:

use bytemuck::cast_slice;
use bytemuck::checked::cast_slice_mut;

src/core/backend/avx512/mod.rs line 54 at r5 (raw file):

    fn zeros(len: usize) -> Self {
        Self {
            data: vec![PackedBaseField::default(); len.div_ceil(K_ELEMENTS)],

cargo test fails me on:
error[E0658]: use of unstable library feature 'int_roundings'

How is that pass CI?

Code quote:

div_ceil

Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)


src/core/backend/avx512/mod.rs line 6 at r5 (raw file):

Previously, shaharsamocha7 wrote…

Why cast_slice is not from checked?

Becaause it adds runtime checks:)

Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @shaharsamocha7)


src/core/backend/avx512/mod.rs line 54 at r5 (raw file):

Previously, shaharsamocha7 wrote…

cargo test fails me on:
error[E0658]: use of unstable library feature 'int_roundings'

How is that pass CI?

Works for me.

shaharsamocha7
shaharsamocha7 previously approved these changes Feb 26, 2024
Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @spapinistarkware)


src/core/backend/avx512/mod.rs line 6 at r5 (raw file):

Previously, spapinistarkware (Shahar Papini) wrote…

Becaause it adds runtime checks:)

so why the cast_slice_mut is checked?

shaharsamocha7
shaharsamocha7 previously approved these changes Feb 26, 2024
Copy link
Collaborator

@shaharsamocha7 shaharsamocha7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r6, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @spapinistarkware)

Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 13 files reviewed, all discussions resolved (waiting on @shaharsamocha7)


src/core/backend/avx512/mod.rs line 6 at r5 (raw file):

Previously, shaharsamocha7 wrote…

so why the cast_slice_mut is checked?

Oops. Thanks

@spapinistarkware spapinistarkware enabled auto-merge (squash) February 26, 2024 16:09
Copy link
Contributor Author

@spapinistarkware spapinistarkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 12 files at r1, 3 of 9 files at r2, 2 of 3 files at r3, 2 of 3 files at r5, 3 of 4 files at r6, 1 of 1 files at r7, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @spapinistarkware)

@spapinistarkware spapinistarkware merged commit fe76a48 into dev Feb 26, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants