[NDTensorsCUDAExt] Remove and test for scalar indexing #1245

Merged Nov 15, 2023 (62 commits; changes shown from 6 commits)
Commits (all by kmp5VT):

- 7576c3b (Nov 9, 2023): When calling `mul!` with a `Transpose(CuArray)` generic CPU mul is ca…
- efcde0c (Nov 9, 2023): format
- ebb1713 (Nov 10, 2023): Replace undef with rand
- 077527b (Nov 10, 2023): Implement Adjoint and Transpose expose mul! functions for cuda and Metal
- 65ea45a (Nov 10, 2023): Add transpose and adjoint tests to Unwrap
- 0b6b07e (Nov 10, 2023): Merge branch 'main' into kmp5/debug/scalar_indexing
- 3ad5551 (Nov 10, 2023): Changes per matts comments
- 256276f (Nov 10, 2023): Merge branch 'kmp5/debug/scalar_indexing' of github.com:kmp5VT/ITenso…
- dd31b94 (Nov 10, 2023): Fix typeo and mark mul! tests as broken to keep record in case they a…
- d7893e9 (Nov 12, 2023): add svd to using linearalgebra
- b8e732b (Nov 13, 2023): Add GPUArraysCore to NDTensors
- a567d28 (Nov 13, 2023): Start removing scalar and update combiner test
- 5d103cf (Nov 13, 2023): Move using GPUArraysCore to devicelist.jl
- 77411a9 (Nov 13, 2023): Allowscalar in dense and reorder to put allowscalar together in code …
- 8f4505a (Nov 13, 2023): Use GPUArraysCore. because things are failing without it
- 21aa74e (Nov 13, 2023): only run `test_broken` if not using CPU
- 0deb8c0 (Nov 13, 2023): Add a few definitions in NDTensorsCUDAExt
- 686e3d6 (Nov 13, 2023): Add allowscalar to blocksparse tests
- 1c54aa1 (Nov 13, 2023): Add some fixes for scalar indexing block dims
- 7fa60f6 (Nov 13, 2023): No need to expose here since they are Tensors not arrays
- 9667482 (Nov 13, 2023): Add allowscalar to unwrap test
- 1897e2e (Nov 13, 2023): Fix linearalgebra with allowscalar
- 6eec341 (Nov 13, 2023): format
- c40a76f (Nov 13, 2023): Add allowscalar theres an issue in zygote
- 53fac06 (Nov 13, 2023): return unexposed(dest)
- a55745b (Nov 13, 2023): Remove unecessary GPUArraysCore and add @allowscalar
- 6f7c9b0 (Nov 13, 2023): remove extra parenthesis
- f3f5e18 (Nov 13, 2023): Remove unecessary code
- 6afb6db (Nov 13, 2023): Remove using GPUArraysCore from top level test
- 2c01b12 (Nov 13, 2023): need `Base.setindex!` missing the base portion
- 23256a9 (Nov 13, 2023): Remove CPU check and fix expose setindex call
- 6f2b593 (Nov 13, 2023): Remove CUDA.allowscalar()
- 24da742 (Nov 13, 2023): Update the Dense function to be no-copy
- 3ee4f70 (Nov 13, 2023): Allowscalar checks and silence inner with an H
- afd718c (Nov 13, 2023): format
- b61d17c (Nov 13, 2023): Use array over Array
- 3b87a47 (Nov 13, 2023): Move GPUArraysCore: @allowscalar to top of files which use it
- a74dc1c (Nov 13, 2023): Fix CuArray scalar index issue in copyto! in the same way as metal
- 657ef3e (Nov 13, 2023): Revert and remove H
- df4dde9 (Nov 13, 2023): Todo message
- 0299eda (Nov 13, 2023): format
- 1412eb6 (Nov 14, 2023): Move base.copy version to `copyto.jl`
- b1ace97 (Nov 14, 2023): Revert changes to Dense constructor
- 76a5c8f (Nov 14, 2023): Add a mul! function to address CUDA problem
- 44696d3 (Nov 14, 2023): Copy parent then transpose in mul! call
- 3956a58 (Nov 14, 2023): format
- 3aaabec (Nov 14, 2023): Merge branch 'main' into kmp5/debug/scalar_indexing
- 9d020ce (Nov 14, 2023): Add mul!! test to `diag.jl` to test issue in `CUDA.jl`
- b1ec109 (Nov 14, 2023): format
- da6aca0 (Nov 15, 2023): Add tests for complicated wrapper (breaks in CUDA.jl)
- 78d4d06 (Nov 15, 2023): Consistently annotate loops as @allowscalar
- b40f227 (Nov 15, 2023): test mul! instead of mul!!
- ae3fb94 (Nov 15, 2023): format
- 180da60 (Nov 15, 2023): remove outdated todo comment
- 2e05392 (Nov 15, 2023): change .+ to +, comment out code and add todo
- 7e99a57 (Nov 15, 2023): Don't comment out testset
- 4d330e5 (Nov 15, 2023): add mul!! test as a second verification
- 5c019bc (Nov 15, 2023): Update NDTensors/ext/NDTensorsCUDAExt/NDTensorsCUDAExt.jl
- c76a6bb (Nov 15, 2023): Update NDTensors/ext/NDTensorsCUDAExt/mul.jl
- 2650f60 (Nov 15, 2023): Update NDTensors/ext/NDTensorsCUDAExt/copyto.jl
- 85aee99 (Nov 15, 2023): Aphabetize and add compat
- 21aa398 (Nov 15, 2023): Merge branch 'kmp5/debug/scalar_indexing' of github.com:kmp5VT/ITenso…
3 changes: 2 additions & 1 deletion NDTensors/ext/NDTensorsCUDAExt/NDTensorsCUDAExt.jl
@@ -5,7 +5,7 @@ using NDTensors.SetParameters
 using NDTensors.Unwrap
 using Adapt
 using Functors
-using LinearAlgebra
+using LinearAlgebra: LinearAlgebra, Adjoint, Transpose

 if isdefined(Base, :get_extension)
   using CUDA
@@ -24,4 +24,5 @@ include("iscu.jl")
 include("adapt.jl")
 include("indexing.jl")
 include("linearalgebra.jl")
+include("mul.jl")
 end
1 change: 1 addition & 0 deletions NDTensors/ext/NDTensorsCUDAExt/imports.jl
@@ -4,3 +4,4 @@ import NDTensors:
 import NDTensors.SetParameters: nparameters, get_parameter, set_parameter, default_parameter

 import .CUDA: CuArrayAdaptor
+import LinearAlgebra: mul!
27 changes: 27 additions & 0 deletions NDTensors/ext/NDTensorsCUDAExt/mul.jl
@@ -0,0 +1,27 @@
+# This was calling generic matrix multiplication.
+# TODO: Raise an issue with `CUDA.jl`.
+function mul!(
+  CM::Exposed{<:CuArray,<:LinearAlgebra.Transpose},
+  AM::Exposed{<:CuArray},
+  BM::Exposed{<:CuArray},
+  α,
+  β,
+)
+  mul!(parent(CM), transpose(BM), transpose(AM), α, β)
+  return unexpose(CM)
+end
+
+# This was calling generic matrix multiplication.
+# TODO: Raise an issue with `CUDA.jl`.
+function mul!(
+  CM::Exposed{<:CuArray,<:LinearAlgebra.Adjoint},
+  AM::Exposed{<:CuArray},
+  BM::Exposed{<:CuArray},
+  α,
+  β,
+)
+  mul!(parent(CM), BM', AM', α, β)
+  return unexpose(CM)
+end
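The rewrite above relies on the identity (A·B)ᵀ = Bᵀ·Aᵀ: a `mul!` into a `Transpose`-wrapped destination is turned into a plain GEMM into the destination's parent array, which the GPU BLAS can run without falling back to scalar indexing. A minimal sketch of the identity with plain CPU arrays (no CUDA or `Exposed` required), using the same shapes as the tests in this PR:

```julia
using LinearAlgebra

A = randn(3, 2)
B = randn(3, 4)
C = zeros(4, 2)
Cp = zeros(4, 2)

# Direct form: write into the transposed view of C,
# i.e. transpose(C) = transpose(A) * B (a 2x4 result).
mul!(transpose(C), transpose(A), B, 1.0, 0.0)

# Rewritten form used in the extension: transpose(C) = Aᵀ * B
# implies C = Bᵀ * A, so write straight into the parent array.
mul!(Cp, transpose(B), A, 1.0, 0.0)

C ≈ Cp  # the two formulations agree
```

The adjoint method follows the same reasoning with `'` in place of `transpose`, since (A·B)' = B'·A'.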
10 changes: 10 additions & 0 deletions NDTensors/ext/NDTensorsMetalExt/mul.jl
@@ -10,3 +10,13 @@ function LinearAlgebra.mul!(
   mul!(transpose(CM), transpose(BM), transpose(AM), α, β)
   return unexpose(CM)
 end
+
+# This was calling generic matrix multiplication.
+# TODO: Raise an issue with `Metal.jl`.
+function LinearAlgebra.mul!(
+  CM::Exposed{<:MtlArray,<:Adjoint}, AM::Exposed{<:MtlArray}, BM::Exposed{<:MtlArray}, α, β
+)
+  mul!(parent(CM), BM', AM', α, β)
+  return unexpose(CM)
+end
2 changes: 2 additions & 0 deletions NDTensors/src/Unwrap/src/functions/abstractarray.jl
@@ -2,6 +2,8 @@ parent(E::Exposed) = parent(unexpose(E))

 transpose(E::Exposed) = transpose(unexpose(E))

+adjoint(E::Exposed) = adjoint(unexpose(E))
+
 cpu(E::Exposed) = cpu(unexpose(E))

 getindex(E::Exposed) = unexpose(E)[]
1 change: 1 addition & 0 deletions NDTensors/src/Unwrap/src/import.jl
@@ -1,4 +1,5 @@
 import Base:
+  adjoint,
   permutedims,
   permutedims!,
   copy,
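For context on why these forwarding methods exist: `Exposed` in `NDTensors.Unwrap` pairs an array with its fully unwrapped storage type, so methods can dispatch on what is underneath wrappers like `Adjoint` and `Transpose` (e.g. `Exposed{<:CuArray,<:Adjoint}` in the CUDA extension above). The following is a hypothetical, stripped-down sketch of that pattern, not the actual Unwrap implementation; `MyExposed`, `myexpose`, `myunexpose`, and `unwrap_type` are illustrative names:

```julia
using LinearAlgebra: Adjoint, Transpose

# Hypothetical minimal version of the expose/unexpose pattern.
# The first type parameter records the fully unwrapped array type.
struct MyExposed{Unwrapped,Object}
  object::Object
end

# Recursively strip Adjoint/Transpose to find the storage type.
unwrap_type(a::AbstractArray) = typeof(a)
unwrap_type(a::Union{Adjoint,Transpose}) = unwrap_type(parent(a))

myexpose(a) = MyExposed{unwrap_type(a),typeof(a)}(a)
myunexpose(E::MyExposed) = E.object

# Forwarding methods, analogous to the ones added in the diff above:
# operate on the wrapped object, not the Exposed wrapper itself.
Base.parent(E::MyExposed) = parent(myunexpose(E))
Base.transpose(E::MyExposed) = transpose(myunexpose(E))
Base.adjoint(E::MyExposed) = adjoint(myunexpose(E))
```

With this structure, `myexpose(transpose(A))` for a `Matrix` carries `Matrix` as its first type parameter even though the wrapped object is a `Transpose`, which is exactly the kind of dispatch the CUDA and Metal `mul!` methods need.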
29 changes: 29 additions & 0 deletions NDTensors/src/Unwrap/test/runtests.jl
@@ -138,4 +138,33 @@ include("../../../test/device_list.jl")
   x = dev(randn(elt, 4, 4))
   permutedims!(expose(y), expose(x), (2, 1))
   @test NDTensors.cpu(y) == transpose(NDTensors.cpu(x))
+
+  ##########################################
+  ### Testing an issue with CUDA & Metal transpose/adjoint mul
+  A = dev(randn(Float64, (3, 2)))
+  B = dev(randn(Float64, (3, 4)))
+  C = dev(randn(Float64, (4, 2)))
+  Cp = copy(C)
+  if (dev == NDTensors.cu)
+    CUDA.allowscalar(false)
+  end
+  ## This fails with scalar indexing
+  #mul!(transpose(C), transpose(A), B, 1.0, 0.0)
+  mul!(C, transpose(B), A, 1.0, 0.0)
+  mul!(expose(transpose(Cp)), expose(transpose(A)), expose(B), 1.0, 0.0)
+  @test C ≈ Cp
+  Cp = fill!(similar(C), 0.0)
+  ## Try calling mul!! with transposes to verify that code works
+  NDTensors.mul!!(transpose(Cp), transpose(A), B, 1.0, 0.0)
+  @test C ≈ Cp
+
+  Cp = fill!(similar(C), 0.0)
+  ## This fails with scalar indexing
+  #mul!(C', A', B, 1.0, 0)
+  mul!(C, B', A, 1.0, 0.0)
+  mul!(expose(Cp'), expose(A'), expose(B), 1.0, 0)
+  @test C ≈ Cp
+  Cp = fill!(similar(C), 0.0)
+  NDTensors.mul!!(Cp', A', B, 1.0, 0.0)
+  @test Cp ≈ C
 end
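All of the calls in this test use the five-argument form `mul!(C, A, B, α, β)` from the LinearAlgebra standard library, which computes `C = α*A*B + β*C` in place; passing `β = 0` (as the tests do) overwrites the destination entirely. A quick CPU illustration of those semantics:

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [1.0 0.0; 0.0 1.0]  # identity, so A*B == A
C = ones(2, 2)

# In-place update: C = 2*A*B + 3*C
mul!(C, A, B, 2.0, 3.0)
# C == [5.0 7.0; 9.0 11.0]
```

On GPU arrays this form maps onto a single fused GEMM call, which is why the PR rewrites the transposed-destination cases into it rather than letting them fall through to the generic (scalar-indexing) path.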