Rewrite fit_simulation.md for clarity. #197

Merged
merged 10 commits into SciML:main on Dec 25, 2023

Conversation

nathanrboyer
Contributor

I am learning SciML and its related packages for the first time by going through the SciML tutorials. The fit_simulation tutorial was especially confusing, so I attempted to rewrite it for clarity.

nathanrboyer and others added 3 commits November 14, 2023 16:55
Needs more work around explaining `p`, the function passed to `optf`, and why we are fitting parameters we already know.
@nathanrboyer
Contributor Author

nathanrboyer commented Nov 16, 2023

I have some remaining questions.

  1. I tried to change `x` and `y` to be countable numbers by multiplying them by 100 (and also changing `u0` and `pguess`), but nothing I tried would optimize close to the correct solution. I could only get my initial `pguess` to move around by O(1), not O(10) or O(100). How can I allow for larger state-variable movement in Optimization.jl, so that the number of wolves is not 0.3 but 30?
  2. The callback function was initially defined as a named anonymous function, and I found that syntax very confusing. Is there any reason to keep it that way? The documentation shows the callback defined both the original way and as a normal function (see the sketch at the end of this comment). If it's all the same, I'd just use a regular function definition.
  3. I could not find any documentation at all for PolyOpt. This tutorial should say something about why it was chosen, especially since the previous tutorial used NLopt.LD_LBFGS instead.

I think I understand the tutorial now, but due to the questions I still have above, I'm not confident about applying these approaches more generally. Hopefully you can review and provide some insight. Thanks.
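
For reference, here is a minimal sketch of the two callback styles I am comparing in question 2 (illustrative names only, not the tutorial's exact code):

```julia
# Style 1: a "named anonymous function", as in the original tutorial text.
callback_anon = function (state, l, sol)
    println("Current loss: ", l)
    return false   # returning false tells the optimizer to keep going
end

# Style 2: an ordinary function definition, which I find easier to read.
function callback_plain(state, l, sol)
    println("Current loss: ", l)
    return false
end

# Either one can be passed to the optimizer, e.g. solve(optprob, ...; callback = callback_plain).
```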

@Vaibhavdixit02
Member

Vaibhavdixit02 commented Nov 16, 2023

I tried to change `x` and `y` to be countable numbers by multiplying them by 100 (and also changing `u0` and `pguess`), but nothing I tried would optimize close to the correct solution. I could only get my initial `pguess` to move around by O(1), not O(10) or O(100). How can I allow for larger state-variable movement in Optimization.jl, so that the number of wolves is not 0.3 but 30?

Can you show the code you ran?

The documentation shows callback defined both the original way and as a normal function. If it's all the same, I'd just use a regular function definition.

Yes, feel free to change that. There's no reason to prefer the anonymous function; it's just easier to write, but it makes sense that it's harder for beginners to read 😅

I could not find any documentation at all for PolyOpt. This tutorial should say something about why it was chosen, especially since the previous tutorial used NLopt.LD_LBFGS instead.

I'll add that in, I just forgot to add it :) SciML/Optimization.jl#627

@nathanrboyer
Contributor Author

nathanrboyer commented Nov 17, 2023

Can you show the code you ran?

None of the commented-out `pguess` vectors worked, even after 10x or 100x the original number of iterations.

using DifferentialEquations, Optimization, OptimizationPolyalgorithms, SciMLSensitivity
using ForwardDiff, Plots

# Define experimental data
t_data = 0:10
x_data = [100 277 677  97 189 610 140 134 435 325 103]
y_data = [100  26 202 191  32  63 346  51  31 455  91]
xy_data = vcat(x_data, y_data)

# Plot the provided data
scatter(t_data, xy_data', label=["x Data" "y Data"])

# Setup the ODE function
function lotka_volterra!(du, u, p, t)
    x, y = u
    α, β, δ, γ = p
    du[1] = α * x - β * x * y
    du[2] = -δ * y + γ * x * y
end

# Initial condition. u0 = [100, 100]
u0 = xy_data[:, begin]

# Simulation interval. tspan = (0, 10)
tspan = extrema(t_data)

# LV equation parameter. p = [α, β, δ, γ]
pguess = [100.0, 1.2, 250.0, 1.2]

# Set up the ODE problem with our guessed parameter values
prob = ODEProblem(lotka_volterra!, u0, tspan, pguess)

# Solve the ODE problem with our guessed parameter values
initial_sol = solve(prob, saveat = 1)

# View the guessed model solution
plt = plot(initial_sol, label = ["x Prediction" "y Prediction"])
scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])

# Define a loss metric function to be minimized
function loss(newp)
    newprob = remake(prob, p = newp)
    sol = solve(newprob, saveat = 1)
    loss = sum(abs2, sol .- xy_data)
    return loss, sol
end

# Define a callback function to monitor optimization progress
function callback(_, l, sol)
    display(l)
    plt = plot(sol, ylim = (0, 600), label = ["Current x Prediction" "Current y Prediction"])
    scatter!(plt, t_data, xy_data', label = ["x Data" "y Data"])
    display(plt)
    return false
end

# Set up the optimization problem with our loss function and initial guess
adtype = AutoForwardDiff()
optf = OptimizationFunction((x, _) -> loss(x), adtype)
# pguess = [1.0, 1.2, 2.5, 1.2]
# pguess = [100.0, 1.2, 250.0, 1.2]
# pguess = [1.5, 1.0, 3.0, 1.0]
# pguess = [150.0, 100.0, 300.0, 100.0]
# pguess = [150.0, 1/100, 300.0, 1/100]
pguess = [150.0, 1.0, 300.0, 1.0]
optprob = OptimizationProblem(optf, pguess)

# Optimize the ODE parameters for best fit to our data
pfinal = solve(optprob,
               PolyOpt(),
               callback = callback,
               maxiters = 2000)
α, β, δ, γ = round.(pfinal, digits = 1)  # unpack in the same order as p = [α, β, δ, γ]

Yes feel free to change that

Okay, I have done so.

I'll add that in

Thank you for documenting, but "Adam" and "BFGS" still mean nothing to me. Can you add a commit here about why PolyOpt would be chosen in this case as opposed to NLopt.LD_LBFGS? NLopt.LD_LBFGS was described as both "robust" and "performant" in the previous tutorial, so I don't understand why we switched algorithms or how to choose an appropriate algorithm.
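
For my own reference, here is how the two choices would look side by side, as far as I can tell (a sketch reusing `optprob` and `callback` from the script above; my understanding from the linked PR is that PolyOpt runs Adam and then finishes with BFGS, and I have not verified the NLopt variant on this problem):

```julia
# Option used in this tutorial: the polyalgorithm.
using OptimizationPolyalgorithms
pfinal = solve(optprob, PolyOpt(), callback = callback, maxiters = 200)

# Option used in the previous tutorial: L-BFGS via NLopt.
using OptimizationNLopt
pfinal = solve(optprob, NLopt.LD_LBFGS(), maxiters = 200)
```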

@nathanrboyer
Contributor Author

The PR as written is accurate and works, but I will change it to use the above code instead if you know an easy way to get the optimization to converge.

@Vaibhavdixit02
Member

Yeah, that doesn't converge for me either. I didn't look too deeply into it, though; I am guessing the scaling difference shows up somewhere and makes things ill-conditioned. Are you finished with all the other changes then?
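
(As an aside, if it really is the parameter scales causing the ill-conditioning, one standard trick, which I have not tried on this problem, is to optimize the logarithms of the parameters so the optimizer only ever works with O(1) quantities. A rough sketch, reusing `prob`, `xy_data`, and `pguess` from your script:)

```julia
# Hypothetical reparameterization sketch (untested on this problem):
# optimize q = log.(p) instead of p itself, and exponentiate inside the loss.
function loss_logscale(q, _)
    newprob = remake(prob, p = exp.(q))
    sol = solve(newprob, saveat = 1)
    return sum(abs2, sol .- xy_data)
end

optf_log = OptimizationFunction(loss_logscale, AutoForwardDiff())
optprob_log = OptimizationProblem(optf_log, log.(pguess))
result = solve(optprob_log, PolyOpt(), maxiters = 2000)
p_fit = exp.(result.u)   # map back to the original parameter scale
```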

docs/src/getting_started/fit_simulation.md (two outdated review threads, resolved)
In this case, we will provide the callback the arguments `(_, l, sol)`,
since there are no additional optimization function parameters.
Member

This is incorrect, since they are the optimization variables, as changed above.

Contributor Author

nathanrboyer commented Jan 8, 2024

The "variable" wording seems too vague. There are "state variables", "parameters", and "hyperparameters", which are all variables in some sense. There are two different equation contexts in which those definitions change, as I tried to explain at the bottom, and that creates a lot of the confusion in this passage.

I don't really understand what is incorrect about this. `p = [α, β, γ, δ]` are not the optimization parameters in this context. There are no optimization parameters, which is why passing an underscore works.

Edit: I misunderstood. `p` is correct to pass here, but then why does it work with an underscore?
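
To pin down exactly where I am confused, this is how I currently read the two-argument form (a sketch with made-up names, reusing `loss`, `adtype`, and `pguess` from the script above):

```julia
# My current reading (sketch; `objective` is a made-up name):
# the function given to OptimizationFunction has the form f(u, p), where
#   u — the optimization variables, i.e. the ODE parameters [α, β, δ, γ] being fitted,
#   p — fixed hyperparameters of the objective, which this tutorial never supplies.
objective(u, p) = loss(u)          # `p` is unused, which is why writing `_` also works
optf_sketch = OptimizationFunction(objective, adtype)
optprob_sketch = OptimizationProblem(optf_sketch, pguess)   # no `p` given here either
```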

Member

We should probably make a table.

@ChrisRackauckas
Member

@Vaibhavdixit02 what are the remaining steps here?

@Vaibhavdixit02
Member

Just the one review comment left above

ChrisRackauckas merged commit 2e2e239 into SciML:main on Dec 25, 2023 (1 check failed)

and the answer from the optimization is our desired parameters.
Here: our `loss` function does not require any hyperparameters, so we pass `_` for this `p`.
Contributor Author

This line needs to be changed now to be consistent; `_` was changed to `p` above.

I used `_` and wrote this section in an attempt to avoid the confusion I had between this `p` and `p = [α, β, γ, δ]`, which is actually passed as `u`.

Member

And this just underwent a change anyway. @Vaibhavdixit02, can you update this as part of that whole update?

Contributor Author

It should instead say something like: "Here, our `loss` function does not require any hyperparameters, so we do not provide the `p` argument to `OptimizationProblem`."

Member

It's some OptimizationState thing, so it should probably be named `state`? But this is part of @Vaibhavdixit02's latest Optimization.jl versions anyway, so these aren't the only docs to update.

@nathanrboyer
Contributor Author

I'll also reiterate that if someone can provide rabbit/wolf population data with countable numbers that still converges under optimization, that would make this section easier to understand. `y = 0.2` wolves doesn't make any sense.
