#13901: Wide Reductions with Non-8-Tile Multiples #16251

Draft
wants to merge 6 commits into
base: main


wransom-TT
Contributor

Ticket

GitHub Issue #13901

Problem description

MaxPool2D should support wide reductions when the C (channel) dimension is not a multiple of the maximum number of tiles per reduction (8 tiles, per the PR title). This PR implements that support for non-large kernels.

What's changed

The number of C tiles is now computed with ceiling division instead of truncating division, so a partial final block is no longer dropped. The reader kernel now reads either the maximum number of bytes per reduction or a smaller remainder. The compute kernel now processes the maximum number of tiles for each of the first N-1 blocks, followed by a smaller number of tiles for the last block.
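To illustrate the block split described above, here is a minimal Python sketch. It is not the kernel code from this PR. It assumes a tile width of 32 elements and 2-byte elements; the maximum of 8 tiles per reduction is taken from the PR title. The names `split_c_tiles` and `bytes_per_block` are hypothetical, and the exact quantity the PR applies ceiling division to may differ from what is shown.

```python
TILE_WIDTH = 32              # assumption: elements per tile along C
MAX_TILES_PER_REDUCTION = 8  # from the PR title ("Non-8-Tile Multiples")


def ceil_div(a: int, b: int) -> int:
    """Ceiling division for non-negative a and positive b."""
    return (a + b - 1) // b


def split_c_tiles(channels: int) -> list[int]:
    """Return the number of C tiles handled by each reduction block.

    The first N-1 blocks are full (MAX_TILES_PER_REDUCTION tiles each);
    the last block takes whatever tiles remain, which may be fewer.
    """
    if channels <= 0:
        raise ValueError("channels must be positive")
    total_tiles = ceil_div(channels, TILE_WIDTH)
    num_blocks = ceil_div(total_tiles, MAX_TILES_PER_REDUCTION)
    full_blocks = num_blocks - 1
    last_block = total_tiles - full_blocks * MAX_TILES_PER_REDUCTION
    return [MAX_TILES_PER_REDUCTION] * full_blocks + [last_block]


def bytes_per_block(channels: int, elem_size: int = 2) -> list[int]:
    """Bytes a reader would fetch per block for one row of input.

    elem_size=2 is an assumption (e.g. a 16-bit data type).
    """
    return [t * TILE_WIDTH * elem_size for t in split_c_tiles(channels)]
```

For example, 320 channels span 10 tiles, so `split_c_tiles(320)` gives one full block of 8 tiles and a final block of 2 tiles. With truncating division the block count would be 1 and the final 2 tiles would be missed.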

Large kernels do not yet work with non-max tile multiples, so test skips have been added to bypass those combinations in the new non-multiple test cases.

Checklist

  • Post commit CI passes
  • [N/A] Blackhole Post commit (if applicable)
  • [N/A] Model regression CI testing passes (if applicable)
  • [N/A] Device performance regression CI testing passes (if applicable)
  • [N/A] (For models and ops writers) Full new models tests pass
  • New/Existing tests provide coverage for changes
