[NDTensorsCUDAExt] [BUG] DMRG does not converge to correct energy #1232
Comments
@jtschneider thanks for the report. Could you try running the GPU code with double precision? You can use
Also, have you tried other GPUs, i.e. do you have reason to believe this is particular to your cluster GPU?
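To make the precision question concrete, here is a minimal sketch in plain Julia (no ITensors or CUDA calls). It compares the SVD cutoff used in the script in this thread with the machine epsilon of single and double precision. It only illustrates why precision is worth ruling out; it is not a diagnosis of this bug.

```julia
# Cutoff value taken from the script quoted in this thread.
SVDcutoff = 1e-15

# Machine epsilon: spacing between 1 and the next representable number.
@show eps(Float32)
@show eps(Float64)

# A cutoff far below machine epsilon cannot be resolved in that precision.
@show SVDcutoff < eps(Float32)
@show SVDcutoff < eps(Float64)

# Absolute spacing of representable numbers near 100, the rough magnitude
# of the expected energy for N = 80.
@show eps(Float32(100))
@show eps(100.0)
```

In single precision the cutoff lies many orders of magnitude below what the arithmetic can distinguish, so a run that silently uses `Float32` could not be expected to reach the accuracy the cutoff asks for.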
Hi @mtfishman! Thanks for the quick reply, I just ran the code with
Code

    using ITensors
    using CUDA

    # from Eq. (13) in T W Burkhardt and I Guim (1985), J. Phys. A: Math. Gen. 18 L33, https://doi.org/10.1088/0305-4470/18/1/006
    exact_crit_ising_energy_OBC(N::Int) = 1.0 - csc(π/(2.0*(2.0*N+1)))

    platform = NDTensors.cu
    # platform = identity

    begin
        N = 80
        χ = 100
        SVDcutoff = 1e-15

        sites = siteinds("S=1/2", N)

        ampo = OpSum()
        for j = 1:N-1
            ampo += -1.0, "X", j, "X", j+1
            ampo += -1.0, "Z", j
        end
        ampo += -1.0, "Z", N
        H = MPO(ampo, sites)

        psi0 = randomMPS(sites, 10);

        sweeps = Sweeps(12)
        mindim!(sweeps, 5, 10, 20)
        maxdim!(sweeps, floor.(Int, LinRange(10, χ, 10))...)
        cutoff!(sweeps, SVDcutoff)

        energy, psi = dmrg(platform(H), platform(psi0), sweeps, eigsolve_krylovdim=3)

        @show energy
        @show exact_crit_ising_energy_OBC(N)
        @show abs(energy - exact_crit_ising_energy_OBC(N))
    end

and the code is still not converging:
Output
Yeah, perhaps the title is a bit misleading. I do have access to another NVIDIA GPU, a Tesla T4, but the queue for it is long and I could not get resources allocated for a quick interactive session. My PC does not have a GPU, only an AMD CPU, which is also why I cannot test against a different GPU-runnable version of the DMRG script.
Hmm, pretty strange. I tested the Metal backend and it seems to work with this example (after fixing a few unrelated bugs in that backend). @kmp5VT could you take a look?
Hi all, @mtfishman I am looking into this now!
@jtschneider I think Matt and I were able to find/address your issue. #1236 should fix this bug. Thanks!!
Description of bug
Hi Matt and Miles!
I was pretty excited about your announcement of the GPU capabilities superseding the ITensorGPU package, so I went ahead and tested them. Unfortunately, I have to report that the DMRG function does not run correctly on the GPU on my local cluster.
First, I made sure that CUDA and ITensors are properly installed and behave as expected by running your example code:
Checking if my ITensors and CUDA installation work
Minimal code demonstrating the bug or unexpected behavior
The following code runs the DMRG for the critical Ising chain and does not return the correct ground state when executed on my GPU, in particular not the correct energy:
Minimal runnable code
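For reference, the model and the closed-form energy the script compares against can be written out. This is read off from the `OpSum` terms and the `exact_crit_ising_energy_OBC` definition in the script quoted in the comments, assuming `"X"` and `"Z"` on an `"S=1/2"` site denote Pauli matrices:

```latex
H = -\sum_{j=1}^{N-1} X_j X_{j+1} \;-\; \sum_{j=1}^{N} Z_j ,
\qquad
E_0(N) = 1 - \csc\!\left(\frac{\pi}{2\,(2N+1)}\right)
```

The second expression is the open-boundary ground-state energy attributed in the script to Eq. (13) of Burkhardt and Guim (1985).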
Actual output or behavior
The DMRG does not converge at all over the 12 sweeps (each going from left to right and back).
The maximal estimated error does not really converge either:
Output of minimal runnable code
Expected output or behavior
I expect the DMRG code to converge, and to converge towards the exact ground-state energy of the (in this case) critical Ising chain. Here are the code and the output of the same script executed on a CPU (changed only as needed to run on a CPU):
Output of expected behaviour with runnable code
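As a sanity check on the reference value itself, the closed-form energy can be compared with brute-force diagonalization for small chains. The sketch below uses only Julia's `LinearAlgebra` standard library, not ITensors. The helpers `embed`, `dense_ising`, and `ground_energy` are hypothetical names introduced here, and the code assumes `"X"` and `"Z"` denote Pauli matrices.

```julia
using LinearAlgebra

# Closed-form ground-state energy used in the report (Burkhardt and Guim, Eq. (13)).
exact_crit_ising_energy_OBC(N::Int) = 1.0 - csc(π / (2.0 * (2.0 * N + 1)))

# Pauli matrices and the 2x2 identity (assumed meaning of "X" and "Z").
X = [0.0 1.0; 1.0 0.0]
Z = [1.0 0.0; 0.0 -1.0]
I2 = [1.0 0.0; 0.0 1.0]

# Hypothetical helper: place the given single-site operators (site => matrix)
# in an N-site chain, with identities on all other sites.
function embed(ops::Dict{Int,Matrix{Float64}}, N::Int)
    return reduce(kron, [get(ops, j, I2) for j in 1:N])
end

# Dense Hamiltonian H = -Σ X_j X_{j+1} - Σ Z_j with open boundaries.
function dense_ising(N::Int)
    H = zeros(2^N, 2^N)
    for j in 1:N-1
        H -= embed(Dict(j => X, j + 1 => X), N)
    end
    for j in 1:N
        H -= embed(Dict(j => Z), N)
    end
    return H
end

# Lowest eigenvalue of the dense Hamiltonian.
ground_energy(N::Int) = eigmin(Symmetric(dense_ising(N)))

# Print brute-force and closed-form energies side by side for small chains.
for N in 2:6
    println(N, "  ", ground_energy(N), "  ", exact_crit_ising_energy_OBC(N))
end
```

If the two columns agree for small N, a mismatch at N = 80 points at the DMRG run rather than at the reference formula. The dense matrix has size 2^N, so this check is only practical for short chains.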
Version information
versioninfo():
using Pkg; Pkg.status("ITensors"):