Add Manopt.jl wrapper #712
Conversation
In addition to the existing implementation picked from the PR, it is desired to support an interface to specify the Manifold as a constraint or metadata of the optimization variable, especially from the MTK interface.
But maybe the solver approach here makes more sense, need to think more about this.
Hi, I prefer to think of the manifold as a domain, not a constraint. The reason is that the algorithms in Manopt work intrinsically, without having any embedding (where the manifold would be the constraint) in mind: they require an inner product on the tangent spaces, a vector transport (sometimes) and a retraction (maybe its inverse as well), but not a projection (for example, for SPD matrices such a projection does not even exist, since that set is open in its embedding). The JuMP extension Benoît was so nice to program does this in a nice way I think, see https://manoptjl.org/stable/extensions/#JuMP.jl They use
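(For reference, a rough sketch of the JuMP-style "manifold as domain" usage, based on the linked extension docs; the exact names and call details may differ between versions and are not confirmed here:)

```julia
# Hedged sketch of the JuMP extension usage: the manifold is attached to the
# variable itself, i.e. it acts as a domain rather than a constraint.
using JuMP, Manopt, Manifolds

A = [2.0 0.0; 0.0 1.0]
model = Model(Manopt.JuMP_Optimizer)            # optimizer name taken from the linked docs
@variable(model, U[1:2, 1:2] in Stiefel(2, 2), start = 1.0)
@objective(model, Min, sum((A - U) .^ 2))
optimize!(model)
```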
In theory, the manifold is clearly a part of the problem definition, not the solver. As Ronny wrote, it may even be considered a domain and not a constraint, similarly to how boolean and integer parameters are usually not handled as constraints on real values but as a thing of their own. In those cases there is overwhelmingly clear evidence that this is the way to go, but IMO in the case of manifolds it's still an open problem. One interesting benefit of treating them like domains is that manifold optimizers like Manopt.jl minimize the need to flatten the structure of variables and can naturally support tangents of a different type than the optimized variable. I'm not sure how well known that is, but "you don't have to flatten" would be my best argument at this time for considering manifolds a domain. In practice, if you only ever use manifolds with Manopt.jl, it would likely be easier to make them a part of the solver.
With the symbolic interface (through MTK) we should be able to support something similar to the domain definition; to me that seems most intuitive as well. But I can see that being a bit cumbersome when you have to define multiple variables, and since all of them usually have to belong to the same manifold, this solver approach does turn out to be easier from both the user and development point of view, so I'd like to support both.
For multiple variables we have the power manifold (even with the shortcut
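(A small sketch of the power-manifold idea; the `^` shorthand shown here is my guess at the elided shortcut:)

```julia
# Sketch: several variables on the same manifold, collected in one power manifold.
using Manifolds

M = Sphere(2)
N = M^3        # shorthand for PowerManifold(M, 3); assumed to be the shortcut meant above
p = rand(N)    # one point on N bundles three points on the sphere
```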
The newest test already looks quite nice, just that philosophically I would prefer to have the manifold in the objective, not the solver.
I completely agree @kellertuer, but I can't think of a solution to this right now. In a functional interface it is pretty hard to find a membership syntax. If you think about it, this is pretty similar to how Manopt.jl's interface looks as well, so it shouldn't be very unintuitive.
vs
in Manopt.jl too the manifold gets passed only at the optimizer invocation
My fear is that this is too prone to error, for the following reason. You can think of
as
so internally we actually have an objective (that would store ), but that is not so nice to call, so I would prefer a syntax more like
(yes, I have not looked at the internals, sorry). Otherwise I would fear that a user might too easily do
and I fear this might really be something that I would avoid leading the user to do – not only because I do not want to explain this (again and again).
That makes sense, I'll figure out the best way to make it work
One thing one could do is to define a new constructor for the
I checked both the objective and the problem. While both have quite a few fields, they all seem to be very much tailored towards constraint problems and objectives (up to second order) solvers.
…ot passes or doesn't match solver
This should be ready now. The syntax looks like this
and if the optimizer's manifold doesn't match the problem's it throws an error
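(A hedged reconstruction of what that syntax plausibly looked like at this stage of the PR — the manifold stored both in the optimizer and in the problem, hence the mismatch check. Struct names and the `manifold` keyword are illustrative assumptions, not the confirmed API:)

```julia
# Illustrative sketch only; OptimizationManopt names and the `manifold` keyword are assumptions.
using Optimization, OptimizationManopt, Manifolds   # OptimizationManopt is the wrapper added in this PR

M = Sphere(2)
f(x, p) = sum(x .^ 2)

optf = OptimizationFunction(f, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, [0.0, 0.0, 1.0]; manifold = M)

opt = OptimizationManopt.GradientDescentOptimizer(M)   # manifold stored here too (at this stage)
sol = solve(prob, opt)                                  # errors if the two manifolds differ
```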
I have two quick comments for now.
BTW, I'm currently working on adding box-constrained optimization to Manopt. Activity analysis is currently not planned (though entirely possible) but simple projection should be much better than nothing.
```julia
callback = (args...) -> (false),
maxiters::Union{Number, Nothing} = nothing,
maxtime::Union{Number, Nothing} = nothing,
abstol::Union{Number, Nothing} = nothing,
```
You can translate `abstol` into `StopWhenChangeLess`, I think. `reltol` is really hard to define generically so we don't use that.
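(A minimal sketch of that translation; `build_stopping_criterion` is a hypothetical helper, not existing wrapper API:)

```julia
# Sketch: map Optimization.jl-style maxiters/abstol onto a combined Manopt stopping criterion.
using Manopt

function build_stopping_criterion(maxiters, abstol)
    sc = StopAfterIteration(something(maxiters, 1000))  # default iteration cap is an assumption
    if abstol !== nothing
        sc = sc | StopWhenChangeLess(abstol)             # stop when the iterate change drops below abstol
    end
    return sc
end

build_stopping_criterion(500, 1e-8)
```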
```julia
# we unwrap DebugOptions here
minimizer = Manopt.get_solver_result(opts)
return (; minimizer = minimizer, minimum = loss(opt.M, minimizer), options = opts),
:who_knows
```
You can get the retcode by calling

```julia
asc = get_active_stopping_criteria(o.stop)
```

Then `asc` is a vector of stopping criteria that triggered. For example, if it's a one-element vector with a `StopAfterIteration` object, we know that it reached the limit of iterations. If you want a quick "does it look like it converged?" indication, you can use

```julia
any(Manopt.indicates_convergence, asc)
```
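(A sketch of how that could be turned into a SciML-style return code in the wrapper; the mapping to `ReturnCode` values below is my assumption, not part of the suggestion above:)

```julia
# Sketch: derive a return code from the stopping criteria that triggered.
# Assumes `opts` is the solver state obtained with `return_state = true`.
using Manopt, SciMLBase

asc = Manopt.get_active_stopping_criteria(opts.stop)
converged = any(Manopt.indicates_convergence, asc)
retcode = converged ? SciMLBase.ReturnCode.Success : SciMLBase.ReturnCode.MaxIters  # assumed mapping
```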
That is exactly what I had in mind
It is already quite late here so I will only have time tomorrow afternoon to check the whole code – but what is the reason to store the manifold in the optimiser? Sure, the constructors sometimes need the manifold, so we could keep that (for them to fill reasonable defaults), but if the optimiser did not store the manifold (since it is now stored in the problem anyway), then why store-twice-and-error instead of store-once-be-happy?
```julia
gradF = build_gradF(f, prob, cur)

opt_res, opt_ret = call_manopt_optimizer(opt, _loss, gradF, prob.u0, stopping_criterion)
```
I would change this interface for example to
```diff
-opt_res, opt_ret = call_manopt_optimizer(opt, _loss, gradF, prob.u0, stopping_criterion)
+opt_res, opt_ret = call_manopt_optimizer(opt, prob.M, _loss, gradF, prob.u0, stopping_criterion)
```
remove the manifold from all optimisers and be happy :)
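(For illustration, a rough sketch of how that could look on the wrapper side; the struct and function below are hypothetical and only show the manifold coming from the problem rather than being stored in the optimiser:)

```julia
# Hypothetical sketch: the optimiser struct carries no manifold, only solver settings;
# the manifold is passed in from the problem at call time.
using Manopt

abstract type AbstractManoptOptimizer end
struct GradientDescentOptimizer <: AbstractManoptOptimizer end

function call_manopt_optimizer(opt::GradientDescentOptimizer, M, loss, gradF, x0, stopping_criterion)
    # loss and gradF follow the Manopt convention f(M, p) / grad_f(M, p)
    state = gradient_descent(M, loss, gradF, x0;
        return_state = true, stopping_criterion = stopping_criterion)
    return Manopt.get_solver_result(state), state
end
```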
Oh yeah, good point. Will just remove it.
Oh, I was wrong; since you don't have any method structs defined in Manopt.jl, we'd need to create them here for the API to match other solver wrappers.
Can you elaborate a bit on what you mean by method structs? The reason I came to that question is that I think all the wrapper structs here are very similar (just with fewer variables stored) to the structs called states in Manopt.jl (https://github.com/JuliaManifolds/Manopt.jl/blob/1ea2ac69d1ec48f3029841ddcee4af8958d361af/src/solvers/gradient_descent.jl#L2 ).
@kellertuer the Optimization.jl API looks like this in general
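(A sketch of that general calling pattern; this is the standard SciML convention, with an existing Optim-backed solver standing in for the struct a Manopt wrapper would have to provide:)

```julia
# Rough sketch of the generic Optimization.jl pattern a wrapper has to fit into:
# the solver is a struct instance passed to `solve`.
using Optimization, OptimizationOptimJL   # OptimizationOptimJL only to have a concrete solver struct

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = OptimizationProblem(optf, [0.0, 0.0], [1.0, 100.0])

# a Manopt wrapper would provide its own solver structs to put in this position
sol = solve(prob, OptimizationOptimJL.LBFGS(); maxiters = 1000)
```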
hence it needs a struct to instantiate the I did see the
but since the
That is why I am not sure it is worth “inventing” half-states here that store parts of those instead just to dispatch on the
I'd be very happy not to have to reimplement all that for every solver, but right now it seems impossible based on my understanding; maybe @mateuszbaran can suggest whether we can avoid this and rely on some more generic interface to solvers in Manopt?
I hope I made clear that I am not 100% sure that works, since I am not yet 100% sure how Optimization.jl works in all details; it just seems the new structs are a bit of “double structures” introduced. But maybe Mateusz sees that a bit better.
Manopt currently lacks a solver structure that fits here. State is supposed to be constructed when you have at least a bit of information about the problem, and the only other option is
I had a very similar problem when working on hyperparameter optimization for Manopt: https://juliamanifolds.github.io/ManoptExamples.jl/stable/examples/HyperparameterOptimization/ . Lack of a problem-independent solver structure made that code more ugly than I'd like.

Anyway, I think something like this would be a bit nicer than specializing in great detail for each solver:

```julia
struct ConcreteManoptOptimizer{TF,TKW<:NamedTuple} <: AbstractManoptOptimizer
    optimizer::TF
    kwargs::TKW
end
ConcreteManoptOptimizer(f::TF) where {TF} = ConcreteManoptOptimizer{TF}(f, (;))

# for example
# solver = ConcreteManoptOptimizer(quasi_Newton)

function call_manopt_optimizer(
    M::ManifoldsBase.AbstractManifold, opt::ConcreteManoptOptimizer,
    obj::AbstractManifoldObjective,
    x0;
    stopping_criterion::Union{Manopt.StoppingCriterion, Manopt.StoppingCriterionSet},
    kwargs...)
    # the objective bundles the cost (and gradient etc.), so only obj is passed on
    opts = opt.optimizer(M,
        obj,
        x0;
        return_state = true,
        stopping_criterion,
        kwargs...)
    # we unwrap DebugOptions here
    minimizer = Manopt.get_solver_result(opts)
    return (; minimizer = minimizer, minimum = Manopt.get_cost(M, obj, minimizer), options = opts)
end
```

And here:

```julia
opt_res = call_manopt_optimizer(manifold, cache.opt, _loss, gradF, cache.u0;
    solver_kwarg..., stopping_criterion = stopping_criterion)
```

you'd have to construct a manifold objective:

```julia
obj = ManifoldGradientObjective(_loss, gradF; evaluation = evaluation)
opt_res = call_manopt_optimizer(manifold, cache.opt, obj, cache.u0;
    solver_kwarg..., stopping_criterion = stopping_criterion)
```

or `ManifoldCostObjective(f)` for gradient-free optimization, or `ManifoldHessianObjective(f, grad_f, Hess_f, preconditioner; evaluation=evaluation)` for optimization with Hessian-vector products. If this doesn't work for some solver, we could either fix it in Manopt or specialize
Note that some solvers don't strictly require
Maybe
Another note, Armijo and WolfePowell are not great line search algorithms. https://github.com/mateuszbaran/ImprovedHagerZhangLinesearch.jl/ is currently the best one available for Manopt. It might get integrated into LineSearches.jl at some point but currently it doesn't have a particularly high priority (see JuliaNLSolvers/LineSearches.jl#174 ).
I do agree that the `solver_name` (high-level) functions were always meant to be the ones used by end users, but could you elaborate on
? For example https://manoptjl.org/v0.4/solvers/gradient_descent/#Manopt.GradientDescentState – sure, it needs a manifold like nearly anything in Manopt, but that is also just needed to initialise the internal fields. Concerning the need for an x0 (or whether a random one is chosen) – we should unify that to always fall back to a rand. I like your approach and if some states do not yet fit that we could check how we can adapt them :)
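(A minimal sketch of that fallback; the variable names are just for illustration:)

```julia
# Sketch: always fall back to a random point on the manifold if no start point is given.
using Manifolds

M = Sphere(2)
u0 = nothing                        # e.g. the user did not provide an initial point
x0 = u0 === nothing ? rand(M) : u0  # rand(M) returns a valid point on M
```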
States sometimes also need a point for initialization, and requiring a manifold simply rules them out in hyperparameter optimization, and in your own words:
This is IMO not a great solution; these states don't look like they were intended to be used like that. And ugly things start happening when the default point doesn't match what the user wants to provide through the Optimization.jl interface. And we have two places where the manifold is needed, which is not great.
Well, for the point, we should then move to using
Sure, the manifold is needed in two places – but I do not see how we can avoid that (as in: at all, it is structurally necessary to initialise some fields and types).
Ideally we should have the manifold in only one place, either the objective or the solver. You suggested you prefer the manifold to be in the objective. So we can't use Manopt states to represent solvers here. And IMO states shouldn't really be constructed until we have the initial search point, which gives another reason why states are not suitable here.
The *State should map to OptimizationState https://docs.sciml.ai/Optimization/stable/API/optimization_state/. The solver structs are for telling the
Hm. I do not see how states (as they are designed in Manopt.jl) should ever be able to be generated without a manifold; something like storing the iterate requires at some point to initialise that field (and know its type, since all states are parametrised with that). So we can not do magic.
Ah, now I see, I thought until now “solver struct” is the Optimization.jl alias for state, though I saw the state thingy. Then the state should definitely not be used in the “solver struct” setting – even more since it is not suitable. The question is now: should we do one “solver struct” for every Manopt solver? Or is a generic one enough (and more flexible)?
It'd need to be for each solver.
Then my conclusion is: the current way of doing this is probably the best we can do.
Cool, I'll merge this PR and continue fleshing it out more with a fresh PR to keep it contained and easier to review.
Ui. I am still always surprised how fast SciML merges, I am not yet convinced this works safe and sound – but it seems our philosophies on merging are maybe different. Again, thanks for working on this :) Once this works, I will adapt the Manopt readme for sure.
and by “works”, I also mean officially mentioned in the Readme and docs :)
Notice that it's not going to be a registered package until we reach convergence so the merging is more for dev and review convenience rather than to put it out for end users yet! |
Yes, I saw that. As I said, it is just a slightly different philosophy. My idea is usually that what is on master is basically “ready to put out” and maybe just waiting for one or two other PRs to make a nice new registered version.
@kellertuer @mateuszbaran Sorry I got busy with end-of-semester things since we last talked about this. I am hoping to get this out in the next few days.
No worries, I am happy that Manopt will be available here at some point and will then link to it, but it is not urgent; and sure, sometimes there are teaching and semester things.
We are currently reworking constraint handling in Manopt. For box constraints I have added the Hyperrectangle manifold with corners, and our quasi-Newton solvers should work correctly with it since JuliaManifolds/Manopt.jl#388 . How is this different from L-BFGS-B? Well, the classic L-BFGS-B is more careful about computing directions of descent so that it never encounters a non-positive direction, while Manopt.jl currently just resets the qN operator and goes towards the negative gradient. It's less efficient, but this way we can handle mixed box-manifold constraints and it appears to still be convergent. Another thing is line search -- L-BFGS-B has a specialized line search that looks for potential points of nondifferentiability while we do not. Again, it's not an ideal approach, but I wouldn't be surprised if this were the SoTA for mixed box-manifold constraints. I'm also going to try a similar approach for the positive semi-definite manifold with corners.

The second part of our rework is JuliaManifolds/Manopt.jl#386 where we are going to provide a more performance-oriented interface for arbitrary equality and inequality constraints (and more). I'm looking for some standard benchmark for such problems; if you have anything, I'd very much like to add Manopt there. I can write the Manopt benchmarking myself if you have it working for other optimization libraries. I was thinking about CUTEst.jl but I couldn't find any complete benchmark based on it for any Julia library.
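(For reference, a rough sketch of what box-constraints-as-a-manifold-with-corners usage could look like; the `Hyperrectangle` constructor and its availability are my assumptions based on the description above, so treat the details as unverified:)

```julia
# Rough sketch only: box constraints treated as a manifold with corners.
# Assumes Manifolds.jl exposes Hyperrectangle(lower, upper); details may differ.
using Manopt, Manifolds

M = Hyperrectangle([-1.0, -1.0], [2.0, 2.0])          # the box [-1, 2]^2
f(M, p) = (p[1] - 3.0)^2 + (p[2] - 3.0)^2              # unconstrained minimizer (3, 3) lies outside the box
grad_f(M, p) = [2 * (p[1] - 3.0), 2 * (p[2] - 3.0)]    # flat metric, so the Euclidean gradient suffices

p_opt = quasi_Newton(M, f, grad_f, [0.0, 0.0])         # expected to end up at the corner (2, 2)
```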
For box constraints and mixed box-manifold constraints, you should be good to go. For arbitrary constraints I would suggest waiting until JuliaManifolds/Manopt.jl#386 is finished. Though if you have a benchmark, I could use it to identify potential performance issues in that PR.
Aimed at moving the already substantial work done in #437 forward
Checklist
contributor guidelines, in particular the SciML Style Guide and COLPRAC.