-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NDTensors] JLArrays Extension #1508
[NDTensors] JLArrays Extension #1508
Conversation
There is an issue in JLArrays, there is a missing |
Co-authored-by: Matt Fishman <[email protected]>
if "jlarrays" in ARGS || "all" in ARGS | ||
Pkg.add("JLArrays") | ||
using JLArrays | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a reason to not run the tests for JLArrays since they run on CPU anyway. So I think it should just be added as a normal test dependency, and inserted into the device list by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I am seeing an issue with the GPU tests when I have JLArrays
is in the Project.toml
. To fix the problem, I moved JLArrays
to [extra]
and I add it to the project if isempty(ARGS) || "base" in ARGS
. However, this would run into a problem if the test args is ["base","metal"]
. Right now we aren't testing in that way, though if you would prefer, I can just include JLArrays
if no GPUs are being tested, i.e. only allow JLArrays
if isempty(ARGS)
. Let me know, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the issue you see? It would be best to try to address that issue and have JLArrays
as a dependency in Project.toml
and run by default in the tests as we planned, so I would prefer to try to aim for that rather than work around an issue with a more convoluted solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that GPUArrays does not compile during Pkg.test
ERROR: LoadError: UndefVarError: `SimplifyCFGPassOptions` not defined
Stacktrace:
[1] top-level scope
@ ~/.julia/packages/GPUCompiler/nWT2N/src/optim.jl:57
[2] include(mod::Module, _path::String)
@ Base ./Base.jl:495
[3] include(x::String)
@ GPUCompiler ~/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:1
[4] top-level scope
@ ~/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:37
[5] include
@ Base ./Base.jl:495 [inlined]
[6] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::String)
@ Base ./loading.jl:2216
[7] top-level scope
@ stdin:3
in expression starting at /home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/packages/GPUCompiler/nWT2N/src/optim.jl:56
in expression starting at /home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/packages/GPUCompiler/nWT2N/src/GPUCompiler.jl:1
in expression starting at stdin:3
ERROR: LoadError: Failed to precompile GPUCompiler [61eb1bfa-7361-4325-ad38-22787b887f55] to "/home/jenkins/workspace/ITensor_ITensors.jl_PR-1508@tmp/.julia/compiled/v1.10/GPUCompiler/jl_xlhZZG".
However, I think I was able to fix the problem by moving using JLArrays
to after the GPU using
statements.
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #1508 +/- ##
==========================================
- Coverage 78.05% 77.38% -0.68%
==========================================
Files 148 140 -8
Lines 9679 9103 -576
==========================================
- Hits 7555 7044 -511
+ Misses 2124 2059 -65 ☔ View full report in Codecov by Sentry. |
@mtfishman It looks like modifying the (@v1.8) pkg> add JLArrays@0.1.5
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Adapt [79e6a3ab]:
Adapt [79e6a3ab] log:
├─possible versions are: 0.3.0-4.0.4 or uninstalled
├─restricted to versions 3.7.0-4 by NDTensors [23ae76d9], leaving only versions 3.7.0-4.0.4
│ └─NDTensors [23ae76d9] log:
│ ├─possible versions are: 0.3.42 or uninstalled
│ └─NDTensors [23ae76d9] is fixed to version 0.3.42
├─restricted by compatibility requirements with GPUArrays [0c68f7d7] to versions: 4.0.0-4.0.4
│ └─GPUArrays [0c68f7d7] log:
│ ├─possible versions are: 0.3.0-10.3.1 or uninstalled
│ ├─restricted by compatibility requirements with JLArrays [27aeb0d3] to versions: 10.0.0-10.3.1
│ │ └─JLArrays [27aeb0d3] log:
│ │ ├─possible versions are: 0.1.0-0.1.5 or uninstalled
│ │ └─restricted to versions 0.1.5 by an explicit requirement, leaving only versions 0.1.5
│ ├─restricted by julia compatibility requirements to versions: 0.3.0-10.2.3 or uninstalled, leaving only versions: 10.0.0-10.2.3
│ └─restricted by compatibility requirements with CUDA [052768ef] to versions: [8.6.0-9.1.0, 10.0.1-10.3.1], leaving only versions: 10.0.1-10.2.3
│ └─CUDA [052768ef] log:
│ ├─possible versions are: 0.1.0-5.4.3 or uninstalled
│ ├─restricted by compatibility requirements with StridedViews [4db3bf67] to versions: 4.0.0-5.4.3
│ │ └─StridedViews [4db3bf67] log:
│ │ ├─possible versions are: 0.1.0-0.3.1 or uninstalled
│ │ └─restricted to versions 0.2.2-0.3 by NDTensors [23ae76d9], leaving only versions 0.2.2-0.3.1
│ │ └─NDTensors [23ae76d9] log: see above
│ └─restricted by compatibility requirements with GPUArrays [0c68f7d7] to versions: 5.2.0-5.4.3 or uninstalled, leaving only versions: 5.2.0-5.4.3
│ └─GPUArrays [0c68f7d7] log: see above
└─restricted by compatibility requirements with ArrayInterface [4fba245c] to versions: 3.0.0-3.7.2 — no versions left
└─ArrayInterface [4fba245c] log:
├─possible versions are: 0.0.1-7.16.0 or uninstalled
├─restricted by compatibility requirements with LinearAlgebra [37e2e46d] to versions: 0.0.1-7.5.1 or uninstalled
│ └─LinearAlgebra [37e2e46d] log:
│ ├─possible versions are: 1.8.5 or uninstalled (package in sysimage!)
│ └─restricted to versions 1.6.0-1 by NDTensors [23ae76d9], leaving only versions 1.8.5
│ └─NDTensors [23ae76d9] log: see above
├─restricted by compatibility requirements with Compat [34da2185] to versions: [0.0.1-3.1.32, 6.0.2-7.16.0] or uninstalled, leaving only versions: [0.0.1-3.1.32, 6.0.2-7.5.1] or uninstalled
│ └─Compat [34da2185] log:
│ ├─possible versions are: 1.0.0-4.16.0 or uninstalled
│ └─restricted to versions 4.9.0-4 by NDTensors [23ae76d9], leaving only versions 4.9.0-4.16.0
│ └─NDTensors [23ae76d9] log: see above
└─restricted by compatibility requirements with StaticArrayInterface [0d7ed370] to versions: 7.0.0-7.16.0, leaving only versions: 7.0.0-7.5.1
└─StaticArrayInterface [0d7ed370] log:
├─possible versions are: 1.0.0-1.8.0 or uninstalled
└─restricted by julia compatibility requirements to versions: 1.0.0-1.6.0 or uninstalled, leaving only versions: 1.0.0-1.6.0 and here are the current packages I have (@v1.8) pkg> st
Status `~/.julia/environments/v1.8/Project.toml`
⌅ [46192b85] GPUArraysCore v0.1.5
⌃ [27aeb0d3] JLArrays v0.1.2
[23ae76d9] NDTensors v0.3.42 `../../dev/ITensors/NDTensors`
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated` |
Gotcha, thanks for checking. I made similar changes to the compat entry of LinearAlgebra and Random since those are also standard libraries and the CI log showed a similar issue for LinearAlgebra. If after those changes it still shows that error you are seeing with compatibility of JLArrays, let's just stop testing the GPU backends (including JLArrays.jl) against Julia 1.8. That issue may be caused by the fact that GPUArrays.jl v10 requires Julia 1.10: https://github.com/JuliaGPU/GPUArrays.jl/blob/v10.3.1/Project.toml#L26, and JLArrays.jl v0.1.5 requires GPUArrays v10: https://github.com/JuliaGPU/GPUArrays.jl/blob/v10.3.1/lib/JLArrays/Project.toml#L13. |
@kmp5VT I'm not sure what's going on with the CI in Julia 1.8, it's showing the same compat issue with LinearAlgebra even though I changed the compat bound to allow LinearAlgebra v0. Let's just stop testing against Julia 1.8 in the Jenkins GPU CI workflow. Beyond that, are there issues that need to be addressed before this can be merged? How is the JLArrays extension being tested, as part of the Github Actions CPU test or as a Jenkins test? I don't see it run in Jenkins so I assume it is the former but if you could give a summary of the current status of this PR including how the tests are working that would be helpful since I've forgotten at this point. |
@mtfishman All of the JLArrays code seems to work now! A questions for this PR:
|
I already see: https://github.com/ITensor/ITensors.jl/tree/main/NDTensors/ext/NDTensorsGPUArraysCoreExt. I guess you mean adding more functionality to that extension? Let's hold off on that and focus on wrapping up this PR and keeping it more self-contained.
Isn't it already running as part of the tests? I.e. the |
@mtfishman oh I had forgotten that I created a GPUArraysCore extension. But yes this is what I mean, adding functionality to Regarding testing: Yes JLArrays is being used in the Thanks! |
Sounds good, sounds like we are in agreement there. One comment about the philosophy there is that even if there are overlapping definitions across extensions, it doesn't necessarily mean we should always turn them into generic definitions in
I see, thanks for clarifying. Let's hold off on incorporating it into the ITensors test suite, I think those tests already take too long to run, and would prefer to make more tests at a lower level (i.e. at the NDTensors level, and more specifically at the submodule level) going forward anyway, since that is a better strategy for having more systematic and efficient test coverage. |
Thanks! This will be helpful for testing the GPU backends. |
Great to see this |
Description
Introduce the JLArrays extension to test GPU for ITensors
Checklist: