**Describe the bug**
`Shape` and `LegacyShape` use `std::size_t` as the return type of `rank()`. In some places in `tt-metal/ttnn/cpp/ttnn/operations/data_movement/reshape_view/reshape.cpp` there are statements like `for (int i = 0; i < tensor_shape.rank() - 2; i++)`, where mixed signed/unsigned arithmetic can wrap around for certain ranks (in this case when the rank is 1).
**To Reproduce**
Steps to reproduce the behavior:
1. Run the reshape op in TILE layout for this shape: `tensor<128x1xf32> -> tensor<128xf32>`.
2. The overflow causes the for loop to continue past the shape rank, which leads to the error: `Index is out of bounds for the rank, should be between 0 and 0 however is 1`.
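To illustrate the wraparound, here is a minimal standalone C++ sketch; `rank` stands in for `tensor_shape.rank()`, and this is not the actual tt-metal code:

```cpp
#include <cstddef>
#include <iostream>

int main() {
    std::size_t rank = 1;  // rank() returns std::size_t

    // The subtraction is done in unsigned arithmetic: 1 - 2 wraps around to
    // SIZE_MAX instead of -1, and `i` is converted to unsigned in the
    // comparison, so the loop bound is effectively huge.
    for (int i = 0; i < rank - 2; i++) {
        std::cout << "iteration " << i << "\n";
        if (i >= 3) break;  // guard so this demo terminates
    }

    // One possible fix: do the subtraction in signed arithmetic.
    for (int i = 0; i < static_cast<int>(rank) - 2; i++) {
        std::cout << "never reached when rank == 1\n";
    }
}
```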
This PR adds TTNN workarounds for these Metal issues:
- tenstorrent/tt-metal#13361 - By decomposing
`reduce(keepDim=false)` into `reduce(keepDim=true) + reshape`
- tenstorrent/tt-metal#16118 - By dropping the dimensions argument (`dim_arg`) when all dims are being reduced
As part of this work I've also:
- Enabled conversion of `stablehlo.reduce` op with multiple reduce
dimensions
- Added verifiers for reduce ops in TTIR
- Added a separate function in TTNNWorkarounds to run rewrite patterns
for decomposition and layout workarounds
- Added lots of unit tests for reduce ops to cover conversions and
verifiers
- Added lots of silicon tests for reduce ops
Opened issue #1624 (assigned to myself) to revert these workarounds once the Metal issues are fixed.
Closes #805, #848
After implementing these workarounds and running tests, I encountered
[another Metal issue](tenstorrent/tt-metal#16104), this time in the
`reshape` op. I've debugged it and have a local fix; I will send a PR to
fix it in the Metal repo, as confirmed with the reshape op owners. I've
opened issue #1640 to enable the reduce ops silicon tests after this fix
is uplifted.
Another issue I encountered while working on this: after the workaround
pass decompositions, if we change the shapes of the ops' tensors, their
layouts need to change too, but the layout pass runs before the workaround
pass. I've managed to solve it by reusing the layout of the input tensor
(see the sketch below), but I'm not sure that's a good solution; maybe we
need to repeat some of the layout logic after the workaround
decompositions. FYI @sdjordjevicTT
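For illustration only, a minimal sketch of the layout-reuse idea in terms of core MLIR APIs; `buildResultTypeReusingLayout` is a hypothetical helper, not the actual TTNNWorkarounds code:

```cpp
#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Hypothetical helper: when a workaround decomposition changes a tensor's
// shape after the layout pass has already run, build the new result type by
// reusing the input tensor's encoding (its layout) rather than recomputing it.
static RankedTensorType buildResultTypeReusingLayout(RankedTensorType inputType,
                                                     ArrayRef<int64_t> newShape) {
  return RankedTensorType::get(newShape, inputType.getElementType(),
                               inputType.getEncoding());
}
```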
Here is the example TTNN IR before the workarounds:
```
%3 = "ttnn.sum"(%2) <{dim_arg = [0: i32, 1 : i32, 2: i32], keep_dim = false}> : (tensor<128x32x4xf32, #ttnn_layout2>) -> tensor<1xf32, #ttnn_layout2>
```
and after the workarounds:
```
%3 = "ttnn.sum"(%2) <{keep_dim = true}> : (tensor<128x32x4xf32, #ttnn_layout2>) -> tensor<1x1x1xf32, #ttnn_layout2>
%4 = "ttnn.reshape"(%3) <{shape = [1 : i32]}> : (tensor<1x1x1xf32, #ttnn_layout2>) -> tensor<1xf32, #ttnn_layout3>
```