Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NDTensors] Fix scalar indexing issue for Diag broadcast on GPU #1497

Merged
merged 49 commits into from
Jun 21, 2024
Merged
Changes from 47 commits
Commits
Show all changes
49 commits
Select commit Hold shift + click to select a range
3e77951
Working on GPU diag problem
kmp5VT Jun 13, 2024
721d0f0
format
kmp5VT Jun 13, 2024
9e222ed
Use data over array
kmp5VT Jun 13, 2024
7e95335
remove adapt
kmp5VT Jun 13, 2024
aef7db6
Make expose permutedims for diagtensor to fix cpu error
kmp5VT Jun 13, 2024
f8a80c9
format
kmp5VT Jun 13, 2024
c3df57f
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 13, 2024
b61b4de
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 13, 2024
f96f322
Remove permutedims function
kmp5VT Jun 13, 2024
7807882
Try to make GPUs more supported by Diag
kmp5VT Jun 13, 2024
8f01962
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 13, 2024
6ef078f
Add a comment with a link to the bug in Metal.jl
kmp5VT Jun 13, 2024
c0ad602
Remove unused line (A request from Miles)
kmp5VT Jun 13, 2024
b3892df
Fix diag function for GPU code
kmp5VT Jun 13, 2024
cc24e62
Make diagview functions for Tensor types
kmp5VT Jun 13, 2024
2af30bf
Remove unecessary function
kmp5VT Jun 13, 2024
9f555bc
Update permutedim functions
kmp5VT Jun 13, 2024
a3c13c6
return fill for diagview uniformdiagtensor
kmp5VT Jun 13, 2024
41bf2af
remove uniformdiag definition
kmp5VT Jun 13, 2024
7d1ef42
remove comment
kmp5VT Jun 13, 2024
ad4d96c
Move dense diagview to densetensor
kmp5VT Jun 13, 2024
06d9503
typo
kmp5VT Jun 13, 2024
2140eb4
use diagview over data
kmp5VT Jun 13, 2024
51323fc
remove diaview
kmp5VT Jun 13, 2024
dcf778e
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 14, 2024
d8cb8bd
array necessary here because .= fails for CUDA gpu
kmp5VT Jun 14, 2024
1907a47
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 14, 2024
4f5e96c
Working on diag function
kmp5VT Jun 17, 2024
b0f1535
Attempt to update diag using expose
kmp5VT Jun 17, 2024
80a9097
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 17, 2024
1fac291
Update if statement
kmp5VT Jun 17, 2024
7758242
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 19, 2024
7e2ce77
Move map_diag! to NDTensors
kmp5VT Jun 19, 2024
6ed349a
Don't use NDTensors map_diag yet
kmp5VT Jun 19, 2024
5d04949
make map_diag expose based and create GPU version for blocksparse wit…
kmp5VT Jun 19, 2024
60f2abc
format
kmp5VT Jun 19, 2024
9433798
remove expose
kmp5VT Jun 19, 2024
3df285e
simplify dense definition for DiagTensor
kmp5VT Jun 19, 2024
8e49288
remove unused code
kmp5VT Jun 19, 2024
caba6a4
Rename diag to blocksparsetensor.jl
kmp5VT Jun 20, 2024
ea52a82
Try forcing Pkg to update the registry to fix ci issue
kmp5VT Jun 20, 2024
a48ddb0
Use approx because of numerical noise
kmp5VT Jun 20, 2024
b95c01e
Merge branch 'main' into kmp5/debug/issue_1482
kmp5VT Jun 20, 2024
f5c2d39
Add map_diag tests
kmp5VT Jun 20, 2024
d0dcb4d
format
kmp5VT Jun 20, 2024
ae19547
remove w
kmp5VT Jun 20, 2024
b2b0b19
Format
mtfishman Jun 20, 2024
255f66e
Format
mtfishman Jun 20, 2024
ab0728f
Format
mtfishman Jun 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
module NDTensorsGPUArraysCoreExt
include("contract.jl")
include("blocksparsetensor.jl")
end
26 changes: 26 additions & 0 deletions NDTensors/ext/NDTensorsGPUArraysCoreExt/blocksparsetensor.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
using GPUArraysCore: @allowscalar, AbstractGPUArray
using NDTensors: NDTensors, BlockSparseTensor, dense, diag, map_diag!
using NDTensors.DiagonalArrays: diaglength
using NDTensors.Expose: Exposed, unexpose

## TODO to circumvent issues with blocksparse and scalar indexing
## convert blocksparse GPU tensors to dense tensors and call diag
## copying will probably have some impact on timing but this code
## currently isn't used in the main code, just in tests.
function NDTensors.diag(ETensor::Exposed{<:AbstractGPUArray,<:BlockSparseTensor})
return diag(dense(unexpose(ETensor)))
end

## TODO scalar indexing is slow here
function NDTensors.map_diag!(
f::Function,
exposed_t_destination::Exposed{<:AbstractGPUArray,<:BlockSparseTensor},
exposed_t_source::Exposed{<:AbstractGPUArray,<:BlockSparseTensor},
)
t_destination = unexpose(exposed_t_destination)
t_source = unexpose(exposed_t_source)
@allowscalar for i in 1:diaglength(t_destination)
NDTensors.setdiagindex!(t_destination, f(NDTensors.getdiagindex(t_source, i)), i)
end
return t_destination
end
26 changes: 25 additions & 1 deletion NDTensors/src/blocksparse/blocksparsetensor.jl
Original file line number Diff line number Diff line change
@@ -256,7 +256,7 @@ end
# Returns the offset of the new block added.
# XXX rename to insertblock!, no need to return offset
using .TypeParameterAccessors: unwrap_array_type
using .Expose: expose
using .Expose: Exposed, expose, unexpose
function insertblock_offset!(T::BlockSparseTensor{ElT,N}, newblock::Block{N}) where {ElT,N}
newdim = blockdim(T, newblock)
newoffset = nnz(T)
@@ -356,6 +356,30 @@ function dense(T::TensorT) where {TensorT<:BlockSparseTensor}
return tensor(Dense(r), inds(T))
end

function diag(ETensor::Exposed{<:AbstractArray,<:BlockSparseTensor})
tensor = unexpose(ETensor)
tensordiag = NDTensors.similar(
dense(typeof(tensor)), eltype(tensor), (diaglength(tensor),)
)
for j in 1:diaglength(tensor)
@inbounds tensordiag[j] = getdiagindex(tensor, j)
end
return tensordiag
end

## TODO currently this fails on GPU with scalar indexing
function map_diag!(
f::Function,
exposed_t_destination::Exposed{<:AbstractArray,<:BlockSparseTensor},
exposed_t_source::Exposed{<:AbstractArray,<:BlockSparseTensor},
)
t_destination = unexpose(exposed_t_destination)
t_source = unexpose(exposed_t_source)
for i in 1:diaglength(t_destination)
NDTensors.setdiagindex!(t_destination, f(NDTensors.getdiagindex(t_source, i)), i)
end
return t_destination
end
#
# Operations
#
6 changes: 6 additions & 0 deletions NDTensors/src/dense/densetensor.jl
Original file line number Diff line number Diff line change
@@ -68,6 +68,12 @@ convert(::Type{Array}, T::DenseTensor) = reshape(data(storage(T)), dims(inds(T))
# Useful for using Base Array functions
array(T::DenseTensor) = convert(Array, T)

using .DiagonalArrays: DiagonalArrays, diagview

function DiagonalArrays.diagview(T::DenseTensor)
return diagview(array(T))
end

function Array{ElT,N}(T::DenseTensor{ElT,N}) where {ElT,N}
return copy(array(T))
end
43 changes: 10 additions & 33 deletions NDTensors/src/diag/diagtensor.jl
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
using .DiagonalArrays: diaglength
using .DiagonalArrays: diaglength, diagview

const DiagTensor{ElT,N,StoreT,IndsT} = Tensor{ElT,N,StoreT,IndsT} where {StoreT<:Diag}
const NonuniformDiagTensor{ElT,N,StoreT,IndsT} =
@@ -9,9 +9,7 @@ const UniformDiagTensor{ElT,N,StoreT,IndsT} =
function diag(tensor::DiagTensor)
tensor_diag = NDTensors.similar(dense(typeof(tensor)), (diaglength(tensor),))
# TODO: Define `eachdiagindex`.
for j in 1:diaglength(tensor)
tensor_diag[j] = getdiagindex(tensor, j)
end
diagview(tensor_diag) .= diagview(tensor)
return tensor_diag
end

@@ -33,6 +31,10 @@ function Array(T::DiagTensor{ElT,N}) where {ElT,N}
return Array{ElT,N}(T)
end

function DiagonalArrays.diagview(T::NonuniformDiagTensor)
return data(T)
end
kmp5VT marked this conversation as resolved.
Show resolved Hide resolved

function zeros(tensortype::Type{<:DiagTensor}, inds)
return tensor(generic_zeros(storagetype(tensortype), mindim(inds)), inds)
end
@@ -110,32 +112,11 @@ end
using .TypeParameterAccessors: unwrap_array_type
# convert to Dense
function dense(T::DiagTensor)
return dense(unwrap_array_type(T), T)
end

# CPU version
function dense(::Type{<:Array}, T::DiagTensor)
R = zeros(dense(typeof(T)), inds(T))
for i in 1:diaglength(T)
setdiagindex!(R, getdiagindex(T, i), i)
end
diagview(R) .= diagview(T)
return R
end

# GPU version
function dense(::Type{<:AbstractArray}, T::DiagTensor)
D_cpu = dense(Array, cpu(T))
return adapt(unwrap_array_type(T), D_cpu)
end

# UniformDiag version
# TODO: Delete once new DiagonalArray is designed.
# TODO: This creates a tensor on CPU by default so may cause
# problems for GPU.
function dense(::Type{<:Number}, T::DiagTensor)
return dense(Tensor(Diag(fill(getdiagindex(T, 1), diaglength(T))), inds(T)))
end

denseblocks(T::DiagTensor) = dense(T)

function permutedims!(
@@ -145,16 +126,14 @@ function permutedims!(
f::Function=(r, t) -> t,
) where {N}
# TODO: check that inds(R)==permute(inds(T),perm)?
for i in 1:diaglength(R)
@inbounds setdiagindex!(R, f(getdiagindex(R, i), getdiagindex(T, i)), i)
end
diagview(R) .= f.(diagview(R), diagview(T))
return R
end

function permutedims(
T::DiagTensor{<:Number,N}, perm::NTuple{N,Int}, f::Function=identity
) where {N}
R = NDTensors.similar(T, permute(inds(T), perm))
R = NDTensors.similar(T)
g(r, t) = f(t)
permutedims!(R, T, perm, g)
return R
@@ -193,9 +172,7 @@ end
function permutedims!(
R::DenseTensor{ElR,N}, T::DiagTensor{ElT,N}, perm::NTuple{N,Int}, f::Function=(r, t) -> t
) where {ElR,ElT,N}
for i in 1:diaglength(T)
@inbounds setdiagindex!(R, f(getdiagindex(R, i), getdiagindex(T, i)), i)
end
diagview(R) .= f.(diagview(R), diagview(T))
return R
end

1 change: 0 additions & 1 deletion NDTensors/src/linearalgebra/linearalgebra.jl
Original file line number Diff line number Diff line change
@@ -369,7 +369,6 @@ matrix is unique. Returns a tuple (Q,R).
function qr_positive(M::AbstractMatrix)
sparseQ, R = qr(M)
Q = convert(typeof(R), sparseQ)
nc = size(Q, 2)
kmp5VT marked this conversation as resolved.
Show resolved Hide resolved
signs = nonzero_sign.(diag(R))
Q = Q * Diagonal(signs)
R = Diagonal(conj.(signs)) * R
16 changes: 12 additions & 4 deletions NDTensors/src/tensor/tensor.jl
Original file line number Diff line number Diff line change
@@ -361,16 +361,18 @@ function getdiagindex(T::Tensor{<:Number,N}, ind::Int) where {N}
return getindex(T, CartesianIndex(ntuple(_ -> ind, Val(N))))
end

using .Expose: Exposed, expose, unexpose
# TODO: add support for off-diagonals, return
# block sparse vector instead of dense.
function diag(tensor::Tensor)
diag(tensor::Tensor) = diag(expose(tensor))

function diag(ETensor::Exposed)
tensor = unexpose(ETensor)
## d = NDTensors.similar(T, ElT, (diaglength(T),))
tensordiag = NDTensors.similar(
dense(typeof(tensor)), eltype(tensor), (diaglength(tensor),)
)
for n in 1:diaglength(tensor)
tensordiag[n] = tensor[n, n]
end
array(tensordiag) .= diagview(tensor)
return tensordiag
end

@@ -384,6 +386,12 @@ function setdiagindex!(T::Tensor{<:Number,N}, val, ind::Int) where {N}
return T
end

function map_diag!(f::Function, exposed_t_destination::Exposed, exposed_t_source::Exposed)
diagview(unexpose(exposed_t_destination)) .= f.(diagview(unexpose(exposed_t_source)))
return unexpose(exposed_t_destination)
end
map_diag(f::Function, t::Tensor) = map_diag!(f, expose(copy(t)), expose(t))

#
# Some generic contraction functionality
#
8 changes: 8 additions & 0 deletions NDTensors/test/test_blocksparse.jl
Original file line number Diff line number Diff line change
@@ -10,6 +10,8 @@ using NDTensors:
blockview,
data,
dense,
diag,
diaglength,
dims,
eachnzblock,
inds,
@@ -52,6 +54,8 @@ using Test: @test, @test_throws, @testset
@test isblocknz(A, (1, 2))
@test !isblocknz(A, (1, 1))
@test !isblocknz(A, (2, 2))
dA = diag(A)
@test @allowscalar dA ≈ diag(dense(A))

# Test different ways of getting nnz
@test nnz(blockoffsets(A), inds(A)) == nnz(A)
@@ -104,6 +108,10 @@ using Test: @test, @test_throws, @testset
@allowscalar for I in eachindex(C)
@test C[I] == A[I] + B[I]
end
Cp = NDTensors.map_diag(i -> 2 * i, C)
@allowscalar for i in 1:diaglength(Cp)
@test Cp[i, i] == 2 * C[i, i]
end

Ap = permutedims(A, (2, 1))

4 changes: 4 additions & 0 deletions NDTensors/test/test_dense.jl
Original file line number Diff line number Diff line change
@@ -48,6 +48,10 @@ NDTensors.dim(i::MyInd) = i.dim
randn!(B)
C = copy(A)
C = permutedims!!(C, B, (1, 2), +)
Cp = NDTensors.map_diag((i -> i * 2), C)
@allowscalar for i in 1:diaglength(Cp)
@test Cp[i, i] == 2.0 * C[i, i]
end
mtfishman marked this conversation as resolved.
Show resolved Hide resolved

Ap = permutedims(A, (2, 1))
@allowscalar begin
24 changes: 22 additions & 2 deletions NDTensors/test/test_diag.jl
Original file line number Diff line number Diff line change
@@ -34,19 +34,39 @@ using LinearAlgebra: dot
D = Tensor(Diag(1), (2, 2))
@test norm(D) == √2
d = 3
## TODO this fails because uniform diag tensors are immutable
#S = NDTensors.map_diag((i->i * 2), dev(D))
# @allowscalar for i in 1:diaglength(S)
# @test S[i,i] == 2.0 * D[i,i]
# end

vr = rand(elt, d)
D = dev(tensor(Diag(vr), (d, d)))
Da = Array(D)
Dm = Matrix(D)
Da = permutedims(D, (2, 1))
@allowscalar begin
@test Da == NDTensors.LinearAlgebra.diagm(0 => vr)
@test Da == NDTensors.LinearAlgebra.diagm(0 => vr)

## TODO Currently this permutedims requires scalar indexing on GPU.
Da = permutedims(D, (2, 1))
@test Da == D
end

# This if statement corresponds to the reported bug:
# https://github.com/JuliaGPU/Metal.jl/issues/364
if !(dev == NDTensors.mtl && elt === ComplexF32)
S = permutedims(dev(D), (1, 2), sqrt)
@allowscalar begin
for i in 1:diaglength(S)
@test S[i, i] ≈ sqrt(D[i, i])
end
end
end
S = NDTensors.map_diag((i -> i * 2), dev(D))
@allowscalar for i in 1:diaglength(S)
@test S[i, i] == 2.0 * D[i, i]
end
mtfishman marked this conversation as resolved.
Show resolved Hide resolved

# Regression test for https://github.com/ITensor/ITensors.jl/issues/1199
S = dev(tensor(Diag(randn(elt, 2)), (2, 2)))
## This was creating a `Dense{ReshapedArray{Adjoint{Matrix}}}` which, in mul!, was
6 changes: 3 additions & 3 deletions jenkins/Jenkinsfile
Original file line number Diff line number Diff line change
@@ -27,7 +27,7 @@ pipeline {
}
steps {
sh '''
julia -e 'using Pkg; Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cuda"])'
julia -e 'using Pkg; Pkg.Registry.update(); Pkg.update(); Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cuda"])'
'''
}
}
@@ -51,7 +51,7 @@ pipeline {
}
steps {
sh '''
julia -e 'using Pkg; Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cuda"])'
julia -e 'using Pkg; Pkg.Registry.update(); Pkg.update(); Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cuda"])'
'''
}
}
@@ -75,7 +75,7 @@ pipeline {
}
steps {
sh '''
julia -e 'using Pkg; Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cutensor"])'
julia -e 'using Pkg; Pkg.Registry.update(); Pkg.update(); Pkg.activate(temp=true); Pkg.develop(path="./NDTensors"); Pkg.develop(path="."); Pkg.test("NDTensors"; test_args=["cutensor"])'
'''
}
}