Adjusting example 3 to a GPU backend #21

YotamKa · 2024-06-02T08:30:11Z

Dear Matt,

I tried using Nvidia's GPU backend for ITensorMPS TDVP calculations, but encountered an issue with the data types being passed to the ODE solver.

The problem is described below using code snippets.

Do you have a quick fix in mind for this issue?

Thanks,
Yotam

Consider, for example, the code in:
https://github.com/ITensor/ITensorMPS.jl/tree/main/examples/03_tdvp_time_dependent.jl'

I changed it a bit so it would run with my current versions:
"ITensors v0.3.68"
"ITensorMPS v0.2.1"

The following code runs:

using ITensors
using ITensorMPS
using LinearAlgebra: norm
using Random: Random

include("03_models.jl")
include("03_updaters.jl")

function main()
  Random.seed!(1234)

  # Time dependent Hamiltonian is:
  # H(t) = H₁(t) + H₂(t) + …
  #      = f₁(t) H₁(0) + f₂(t) H₂(0) + …
  #      = cos(ω₁t) H₁(0) + cos(ω₂t) H₂(0) + …

  # Number of sites
  n = 6

  # How much information to output from TDVP
  # Set to 2 to get information about each bond/site
  # evolution, and 3 to get information about the
  # updater.
  outputlevel = 3

  # Frequency of time dependent terms
  ω₁ = 0.1
  ω₂ = 0.2

  # Nearest and next-nearest neighbor
  # Heisenberg couplings.
  J₁ = 1.0
  J₂ = 1.0

  time_step = 0.1
  time_stop = 1.0

  # nsite-update TDVP
  nsite = 2

  # Starting state bond/link dimension.
  # A product state starting state can
  # cause issues for TDVP without
  # subspace expansion.
  start_linkdim = 4

  # TDVP truncation parameters
  maxdim = 100
  cutoff = 1e-8

  tol = 1e-15

  @show n
  @show ω₁, ω₂
  @show J₁, J₂
  @show maxdim, cutoff, nsite
  @show start_linkdim
  @show time_step, time_stop

  ω⃗ = (ω₁, ω₂)
  f⃗ = map(ω -> (t -> cos(ω * t)), ω⃗)

  # H₀ = H(0) = H₁(0) + H₂(0) + …
  ℋ₁₀ = heisenberg(n; J=J₁, J2=0.0)
  ℋ₂₀ = heisenberg(n; J=0.0, J2=J₂)
  ℋ⃗₀ = (ℋ₁₀, ℋ₂₀)

  s = siteinds("S=1/2", n)

  H⃗₀ = map(ℋ₀ -> MPO(ℋ₀, s), ℋ⃗₀)

  # Initial state, ψ₀ = ψ(0)
  # Initialize as complex since that is what OrdinaryDiffEq.jl/DifferentialEquations.jl
  # expects.
  ψ₀ = complex.(randomMPS(s, j -> isodd(j) ? "↑" : "↓"; linkdims=start_linkdim))

  @show norm(ψ₀)

  println()
  println("#"^100)
  println("Running TDVP with ODE updater")
  println("#"^100)
  println()

  ψₜ_ode = tdvp(
    -im * TimeDependentSum(f⃗, H⃗₀),
    time_stop,
    ψ₀;
    updater=ode_updater,
    updater_kwargs=(; reltol=tol, abstol=tol),
    time_step,
    maxdim,
    cutoff,
    nsite,
    outputlevel,
  )

  println()
  println("Finished running TDVP with ODE updater")
  println()

  println()
  println("#"^100)
  println("Running TDVP with Krylov updater")
  println("#"^100)
  println()

  ψₜ_krylov = tdvp(
    -im * TimeDependentSum(f⃗, H⃗₀),
    time_stop,
    ψ₀;
    updater=krylov_updater,
    updater_kwargs=(; tol, eager=true),
    time_step,
    cutoff,
    nsite,
    outputlevel,
  )

  println()
  println("Finished running TDVP with Krylov updater")
  println()

  println()
  println("#"^100)
  println("Running full state evolution with ODE updater")
  println("#"^100)
  println()

  @disable_warn_order begin
    ψₜ_full, _ = ode_updater(
      -im * TimeDependentSum(f⃗, contract.(H⃗₀)),
      contract(ψ₀);
      internal_kwargs=(; time_step=time_stop, outputlevel),
      reltol=tol,
      abstol=tol,
    )
  end

  println()
  println("Finished full state evolution with ODE updater")
  println()

  @show norm(ψₜ_ode)
  @show norm(ψₜ_krylov)
  @show norm(ψₜ_full)

  @show 1 - abs(inner(contract(ψₜ_ode), ψₜ_full))
  @show 1 - abs(inner(contract(ψₜ_krylov), ψₜ_full))
  return nothing
end

main()

But, when I pass psi and H as cuMPS and cuMPO, i.e.:

using ITensors
using ITensorMPS
using LinearAlgebra: norm
using Random: Random
using CUDA

include("03_models.jl")
include("03_updaters.jl")

function main()
  Random.seed!(1234)

  # Time dependent Hamiltonian is:
  # H(t) = H₁(t) + H₂(t) + …
  #      = f₁(t) H₁(0) + f₂(t) H₂(0) + …
  #      = cos(ω₁t) H₁(0) + cos(ω₂t) H₂(0) + …

  # Number of sites
  n = 6

  # How much information to output from TDVP
  # Set to 2 to get information about each bond/site
  # evolution, and 3 to get information about the
  # updater.
  outputlevel = 3

  # Frequency of time dependent terms
  ω₁ = 0.1
  ω₂ = 0.2

  # Nearest and next-nearest neighbor
  # Heisenberg couplings.
  J₁ = 1.0
  J₂ = 1.0

  time_step = 0.1
  time_stop = 1.0

  # nsite-update TDVP
  nsite = 2

  # Starting state bond/link dimension.
  # A product state starting state can
  # cause issues for TDVP without
  # subspace expansion.
  start_linkdim = 4

  # TDVP truncation parameters
  maxdim = 100
  cutoff = 1e-8

  tol = 1e-15

  @show n
  @show ω₁, ω₂
  @show J₁, J₂
  @show maxdim, cutoff, nsite
  @show start_linkdim
  @show time_step, time_stop

  ω⃗ = (ω₁, ω₂)
  f⃗ = map(ω -> (t -> cos(ω * t)), ω⃗)

  # H₀ = H(0) = H₁(0) + H₂(0) + …
  ℋ₁₀ = heisenberg(n; J=J₁, J2=0.0)
  ℋ₂₀ = heisenberg(n; J=0.0, J2=J₂)
  ℋ⃗₀ = (ℋ₁₀, ℋ₂₀)

  s = siteinds("S=1/2", n)

""" Define as H as cu(MPO) for GPU calculation """
  H⃗₀ = map(ℋ₀ -> cu(MPO(ℋ₀, s)), ℋ⃗₀)

  # Initial state, ψ₀ = ψ(0)
  # Initialize as complex since that is what OrdinaryDiffEq.jl/DifferentialEquations.jl
  # expects.
""" Define as psi as cu(MPS) for GPU calculation """
  ψ₀ = cu(complex.(randomMPS(s, j -> isodd(j) ? "↑" : "↓"; linkdims=start_linkdim)))

  @show norm(ψ₀)

  println()
  println("#"^100)
  println("Running TDVP with ODE updater")
  println("#"^100)
  println()

  ψₜ_ode = tdvp(
    -im * TimeDependentSum(f⃗, H⃗₀),
    time_stop,
    ψ₀;
    updater=ode_updater,
    updater_kwargs=(; reltol=tol, abstol=tol),
    time_step,
    maxdim,
    cutoff,
    nsite,
    outputlevel,
  )

  println()
  println("Finished running TDVP with ODE updater")
  println()

  println()
  println("#"^100)
  println("Running TDVP with Krylov updater")
  println("#"^100)
  println()

  ψₜ_krylov = tdvp(
    -im * TimeDependentSum(f⃗, H⃗₀),
    time_stop,
    ψ₀;
    updater=krylov_updater,
    updater_kwargs=(; tol, eager=true),
    time_step,
    cutoff,
    nsite,
    outputlevel,
  )

  println()
  println("Finished running TDVP with Krylov updater")
  println()

  println()
  println("#"^100)
  println("Running full state evolution with ODE updater")
  println("#"^100)
  println()

  @disable_warn_order begin
    ψₜ_full, _ = ode_updater(
      -im * TimeDependentSum(f⃗, contract.(H⃗₀)),
      contract(ψ₀);
      internal_kwargs=(; time_step=time_stop, outputlevel),
      reltol=tol,
      abstol=tol,
    )
  end

  println()
  println("Finished full state evolution with ODE updater")
  println()

  @show norm(ψₜ_ode)
  @show norm(ψₜ_krylov)
  @show norm(ψₜ_full)

  @show 1 - abs(inner(contract(ψₜ_ode), ψₜ_full))
  @show 1 - abs(inner(contract(ψₜ_krylov), ψₜ_full))
  return nothing
end

main()

I get the following error:

MethodError: no method matching (::var"#f#7"{…})(::CuArray{…}, ::SciMLBase.NullParameters, ::Float64)
An arithmetic operation was performed on a NullParameters object. This means no parameters were passed
into the AbstractSciMLProblem (e.x.: ODEProblem) but the parameters object `p` was used in an arithmetic
expression.

The text was updated successfully, but these errors were encountered:

emstoudenmire · 2024-06-02T08:44:46Z

Do you get the same error when running the calculation on CPU ie not using CUDA? I ask because the error message seems more related to SciML parameters than the presence of the CuArray though I could be wrong.

YotamKa · 2024-06-02T08:55:47Z

Hi, thanks for the quick response.

No, with a CPU backend (the first code snippet), the integrator works and produces the correct answers (compared with ED).

The output is:

n = 4
(ω₁, ω₂) = (0.1, 0.2)
(J₁, J₂) = (1.0, 1.0)
(maxdim, cutoff, nsite) = (100, 1.0e-8, 2)
start_linkdim = 4
(time_step, time_stop) = (0.1, 1.0)
norm(ψ₀) = 1.0000000000000002

####################################################################################################
Running TDVP with ODE updater
####################################################################################################

After sweep 1: maxlinkdim=4 maxerr=0.00E+00 current_time=0.1 time=0.701
After sweep 2: maxlinkdim=4 maxerr=8.05E-17 current_time=0.2 time=0.028
After sweep 3: maxlinkdim=4 maxerr=3.34E-17 current_time=0.3 time=0.025
After sweep 4: maxlinkdim=4 maxerr=0.00E+00 current_time=0.4 time=0.025
After sweep 5: maxlinkdim=4 maxerr=6.16E-17 current_time=0.5 time=0.023
After sweep 6: maxlinkdim=4 maxerr=2.07E-17 current_time=0.6 time=0.025
After sweep 7: maxlinkdim=4 maxerr=0.00E+00 current_time=0.7 time=0.023
After sweep 8: maxlinkdim=4 maxerr=0.00E+00 current_time=0.8 time=0.025
After sweep 9: maxlinkdim=4 maxerr=0.00E+00 current_time=0.9 time=0.023
After sweep 10: maxlinkdim=4 maxerr=0.00E+00 current_time=1.0 time=0.025

Finished running TDVP with ODE updater


####################################################################################################
Running TDVP with Krylov updater
####################################################################################################

After sweep 1-: maxlinkdim=4 maxerr=0.00E+00 current_time=0.1 time=5.487
After sweep 2: maxlinkdim=4 maxerr=2.14E-16 current_time=0.2 time=0.004
After sweep 3: maxlinkdim=4 maxerr=0.00E+00 current_time=0.3 time=0.004
After sweep 4: maxlinkdim=4 maxerr=3.33E-17 current_time=0.4 time=0.004
After sweep 5: maxlinkdim=4 maxerr=6.02E-18 current_time=0.5 time=0.004
After sweep 6: maxlinkdim=4 maxerr=7.39E-17 current_time=0.6 time=0.004
After sweep 7: maxlinkdim=4 maxerr=9.73E-17 current_time=0.7 time=0.004
After sweep 8: maxlinkdim=4 maxerr=0.00E+00 current_time=0.8 time=0.004
After sweep 9: maxlinkdim=4 maxerr=6.92E-17 current_time=0.9 time=0.006
After sweep 10: maxlinkdim=4 maxerr=0.00E+00 current_time=1.0 time=0.006

Finished running TDVP with Krylov updater


####################################################################################################
Running full state evolution with ODE updater
####################################################################################################


Finished full state evolution with ODE updater

norm(ψₜ_ode) = 1.0000000000000007
norm(ψₜ_krylov) = 1.0000000000000022
norm(ψₜ_full) = 0.9999999999999999
1 - abs(inner(contract(ψₜ_ode), ψₜ_full)) = -2.220446049250313e-16
1 - abs(inner(contract(ψₜ_krylov), ψₜ_full)) = 9.623815895309917e-7

mtfishman · 2024-06-02T17:43:23Z

Interesting, thanks for the report. Maybe worth looking at https://github.com/SciML/DiffEqGPU.jl?tab=readme-ov-file#example-of-within-method-gpu-parallelism. The only difference I see in that example from our example is that we aren't defining an in-place version of the solver time stepping function: https://github.com/ITensor/ITensorMPS.jl/blob/v0.2.3/examples/03_updaters.jl#L11-L14, or there is something more subtle about how we are calling the ODE solver that I am missing.

mtfishman · 2024-06-02T17:58:48Z

Also note that if you are using GPU, probably you want to use single precision (i.e. Float32 or Complex{Float32}). cu(...) converts to single precision automatically but you should probably change parameters in the code to single precision, for example ω₁ = 0.1f0, tol = 1f-7, etc. Something we do to easily switch between number types is define the number type at the top of the code and then use it as a variable throughout the code:

elt = Float32

ω₁ = one(elt) / 10
tol = 10 * eps(elt)

but of course feel free to handle that however you prefer.

mtfishman · 2024-06-02T18:41:25Z

I see the issue, this line here is too restrictive: https://github.com/ITensor/ITensorMPS.jl/blob/v0.2.3/examples/03_updaters.jl#L12 and should allow any AbstractVector rather than just a Vector. I'll make a PR, and additionally make the example easier to run on GPU and select the element type.

mtfishman · 2024-06-02T18:47:25Z

This is being fixed in #22.

YotamKa · 2024-06-03T05:59:20Z

Thanks a lot! It works also on my PC over a GPU backend.

mtfishman mentioned this issue Jun 2, 2024

Make it easier to run example 3 on GPU #22

Merged

mtfishman closed this as completed in #22 Jun 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adjusting example 3 to a GPU backend #21

Adjusting example 3 to a GPU backend #21

YotamKa commented Jun 2, 2024

emstoudenmire commented Jun 2, 2024

YotamKa commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

YotamKa commented Jun 3, 2024 •

edited

Loading

Adjusting example 3 to a GPU backend #21

Adjusting example 3 to a GPU backend #21

Comments

YotamKa commented Jun 2, 2024

emstoudenmire commented Jun 2, 2024

YotamKa commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

mtfishman commented Jun 2, 2024

YotamKa commented Jun 3, 2024 • edited Loading

YotamKa commented Jun 3, 2024 •

edited

Loading