-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test for autodifferentiating hydrostatic turbulence simulation #3867
base: main
Are you sure you want to change the base?
Conversation
Here's the error that I get right now when running this locally:
Farther down the error message says:
And the stacktrace says:
|
It's coming from this annoying function: Oceananigans.jl/src/Fields/field_tuples.jl Lines 56 to 83 in 30e3e05
which we can certainly simplify. We can also play around this avoiding hitting this line altogether, at least for a certain class of models. |
Ok problem may be fixed. |
oh cool, well in any case this is why we should make PRs/land everything we want to experiment with [and things that fail I can fix] |
100% --- it seems like the problems here are all on the Oceananigans side |
…com/CliMA/Oceananigans.jl into glw/autodiff-hydrostatic-turbulence
Is this correct in lines 274=275 of
When I run it locally it returns this error:
I found a few possible errors locally and can push. |
Oceananigans.jl/test/test_enzyme.jl Lines 274 to 278 in 45dbdf8
Yes that's a bug -- fixed now PS to generate a link to code, click the "..." on the left side. You'll get an option to create a permalink. You can edit the link to refer to a range of lines after creating it as I did above (to see what I wrote, attempt to edit my post and you will see the hyperlink) |
Here's another code snipped that I have been using to test this: using Oceananigans
using Oceananigans.Utils: with_tracers
using Random
using Enzyme
Random.seed!(123)
arch = CPU()
Nx = Ny = 32
x = y = (0, 2π)
z = (0, 1)
g = 4^2
c = sqrt(g)
grid = RectilinearGrid(arch, size=(Nx, Ny, 1); x, y, z, topology=(Periodic, Periodic, Bounded))
closure = ScalarDiffusivity(ν=1e-2)
momentum_advection = Centered(order=2)
free_surface = ExplicitFreeSurface(gravitational_acceleration=g)
model = HydrostaticFreeSurfaceModel(; grid, momentum_advection, free_surface, closure)
ϵ(x, y, z) = 2randn() - 1
set!(model, u=ϵ, v=ϵ)
u_init = deepcopy(model.velocities.u)
v_init = deepcopy(model.velocities.v)
Δx = minimum_xspacing(grid)
Δt = 0.01 * Δx / c
for n = 1:10
time_step!(model, Δt; euler=true)
end
u_truth = deepcopy(model.velocities.u)
v_truth = deepcopy(model.velocities.v)
function set_viscosity!(model, viscosity)
new_closure = ScalarDiffusivity(ν=viscosity)
names = ()
new_closure = with_tracers(names, new_closure)
model.closure = new_closure
return nothing
end
function viscous_hydrostatic_turbulence(ν, model, u_init, v_init, Δt, u_truth, v_truth)
# Initialize the model
model.clock.iteration = 0
model.clock.time = 0
#model.clock.last_Δt = Inf
set_viscosity!(model, ν)
#set!(model, u=u_init, v=v_init, η=0)
set!(model, u=u_init, v=v_init)
fill!(parent(model.free_surface.η), 0)
# Step it forward
for n = 1:10
time_step!(model, Δt; euler=true)
end
# Compute the sum square error
u, v, w = model.velocities
Nx, Ny, Nz = size(model.grid)
err = 0.0
for j = 1:Ny, i = 1:Nx
err += @inbounds (u[i, j, 1] - u_truth[i, j, 1])^2 +
(v[i, j, 1] - v_truth[i, j, 1])^2
end
return err::Float64
end
# Use a manual finite difference to compute a gradient
Δν = 1e-6
ν1 = 1.1e-2
ν2 = ν1 + Δν
e1 = viscous_hydrostatic_turbulence(ν1, model, u_init, v_init, Δt, u_truth, v_truth)
e2 = viscous_hydrostatic_turbulence(ν2, model, u_init, v_init, Δt, u_truth, v_truth)
Δe = e2 - e1
ΔeΔν = (e2 - e1) / Δν
@info "Finite difference computed: $ΔeΔν"
@info "Now with autodiff..."
start_time = time_ns()
# Use autodiff to compute a gradient
dmodel = Enzyme.make_zero(model)
dedν = autodiff(set_runtime_activity(Enzyme.Reverse),
viscous_hydrostatic_turbulence,
Active(ν1),
Duplicated(model, dmodel),
Const(u_init),
Const(v_init),
Const(Δt),
Const(u_truth),
Const(v_truth))
@info "Automatically computed: $dedν."
@info "Elapsed time: " * prettytime(1e-9 * (time_ns() - start_time)) |
Cool! Two other possible bugs I found: Oceananigans.jl/test/test_enzyme.jl Line 331 in 1522ae4
I think should be Oceananigans.jl/test/test_enzyme.jl Lines 261 to 264 in 1522ae4
produces an out of bounds error with the z axis. I replaced Happy to push those changes. |
I noticed that. Why was it [1][3] before? We can't use |
Not exactly sure why it was [1][3], but if |
Hm ok got it, it depends on the status of what's returned |
Ok, so autodiff on the current commit at (Δν, ΔeΔν) = (1.0e-5, -1.7467288660086175e-7) However, since |0 - FD result| / |FD result| = 1.0, the test technically fails. Before you were applying AD to |
I believe this is because |
The CPU tests should be back up, so when you make another commit all the tests will run hopefully. It looks like the regression tests are passing so this should be ready to merge soon. |
@jlk9 you will have to give it an approval since I opened the PR |
Perfect, 0 derivative is expected for |
Some of the distributed tests are failing, and I wonder if the changes to
The stack trace goes through Looking at the function handles:
The main difference in the handles is the
we see that the Clock object is getting treated as a buffer instead of part of args. The PR doesn't change |
i'll look into this |
I see you have changed Oceananigans.jl/src/DistributedComputations/halo_communication.jl Lines 81 to 85 in 4e22432
In this PR it seems like this extension has been bypassed, and the tests are trying to send halos of field tuples. Maybe it's a useful information to debug |
…egions! is the issue
Oceananigans.jl/src/Fields/field_tuples.jl Lines 56 to 72 in 11809b0
Probably not the way to permanently fix it, but I've just added |
Co-authored-by: Gregory L. Wagner <[email protected]>
…glw/autodiff-hydrostatic-turbulence
This PR adds a test that automatically differentiates a hydrostatic turbulence simulation that uses
Centered
momentum advection and anExplicitFreeSurface
.@wsmoses @jlk9
Previously we had deduced that this model was differentiable, but it looks like there are still some issues. Not sure what the other test was doing. I think we should consider this separately from #3822 since that PR is currently working on the
SplitExplicitFreeSurface
.