Optimal stepsize #52

antoine-levitt · 2017-05-23T19:18:58Z

It would be good to have an optimal linesearch, mostly for testing purposes (it somewhat reduces the noise in comparing two methods). I took a stab at implementing this using Optim.jl, but ran into the following circular dependency problem :
https://discourse.julialang.org/t/circular-modules-and-precompilation-errors/3855/2
(optim uses linesearches which uses optim)
Ideas for bypassing this welcome!

pkofod · 2017-05-23T19:35:08Z

Can you post the code? If you don't want to make it public, then can you at least send it to me in a private message on gitter? https://gitter.im/pkofod

Edit: Are you just using an optimizer from Optim to minimize along the search direction?

antoine-levitt · 2017-05-23T19:38:00Z

Oh of course, it's completely trivial:

import Optim

function exact!{T}(df,
                          x::Vector{T},
                          s::Vector,
                          x_scratch::Vector,
                          gr_scratch::Vector,
                          lsr::LineSearches.LineSearchResults,
                          alpha::Real = 1.0,
                          mayterminate::Bool = false)
    @assert alpha > 0
    f_calls = 1
    g_calls = 0

    function f_inner(alpha)
        push!(lsr.alpha, alpha)
        f_calls = f_calls + 1

        x_scratch .= x .+ alpha .* s
        f_x_scratch = df.f(x_scratch)
        push!(lsr.value, f_x_scratch)
        return f_x_scratch
    end

    res = Optim.optimize(f_inner, 0.0, 2alpha)
    alpha = Optim.minimizer(res)

    return alpha, f_calls, g_calls
end

I put that in an "exact.jl" file and included it from LineSearches.jl.

anriseth · 2017-05-23T20:04:32Z

An exact linesearches is a great idea:) Why are you searching on [0,2alpha]? Sounds somewhat arbitrary, when alpha is a parameter.

…

On Tue, 23 May 2017, 12:38 Antoine Levitt, ***@***.***> wrote: Oh of course, it's completely trivial: import Optim function exact!{T}(df, x::Vector{T}, s::Vector, x_scratch::Vector, gr_scratch::Vector, lsr::LineSearches.LineSearchResults, alpha::Real = 1.0, mayterminate::Bool = false) @Assert alpha > 0 f_calls = 1 g_calls = 0 function f_inner(alpha) push!(lsr.alpha, alpha) f_calls = f_calls + 1 x_scratch .= x .+ alpha .* s f_x_scratch = df.f(x_scratch) push!(lsr.value, f_x_scratch) return f_x_scratch end res = Optim.optimize(f_inner, 0.0, 2alpha) alpha = Optim.minimizer(res) return alpha, f_calls, g_calls end I put that in an "exact.jl" file and included it from LineSearches.jl. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#52 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFV6DaB7c_0act_2x_WZfp5R8MCh49VGks5r8zWZgaJpZM4NkJsn> .

pkofod · 2017-05-23T20:09:53Z

An exact linesearches is a great idea:)

It is at the very least great for illustrational purposes. I had thought of showing eg BFGS with exact vs BFGS with HagerZhang at the JuliaCon workshop.

Why are you searching on [0,2alpha]? Sounds somewhat arbitrary, when alpha is a parameter.

Agree it seems arbitrary. You could also use one of the other methods, just remember to use a 1-element array:

res = Optim.optimize(f_inner, [0.0], GradientDescent())

anriseth · 2017-05-24T01:11:04Z

It is at the very least great for illustrational purposes.

Hehe, yeah that's what I meant. It would be a bit extreme for normal usage (although a recursive call to optimize to do linesearch on the linesearch would be fun)

antoine-levitt · 2017-05-24T06:10:50Z

It is arbitrary yes, I put together an example code and then got stumped by the circular dependency issue. I was planning on increasing alpha until f(alpha) > f(0) and then calling Brent's method on [0,alpha]. It's probably non-optimal, but then that's not really the point here.

pkofod · 2017-06-08T19:46:14Z

So, it would be "fun" to have this, but I don't think it is feasible to have it in LineSearches.jl, as we certainly do not want LineSearches.jl to depend on Optim.jl. There are a few different things we could do, but feel free to keep working on it, and we can try to find a solution :)

antoine-levitt · 2017-06-08T20:26:26Z

Well, we could just reimplement (or copy/paste from somewhere) a simplified optimal linesearch, but that feels wrong. Another solution is to put the 1D optimizer in LineSearches, and have Optim call that?

mohamed82008 · 2018-04-08T04:54:54Z

Why not implement the optimal line search in Optim? I think this is certainly a point against the splitting of Optim and LineSearches even though it makes sense from an organizational point of view. And as a user, if I want to optimize the step size, it makes sense that I would need to use Optim. So LineSearches can be just for the line searches not based on optimization. Another added feature here might be a user-defined step size callback. This would be nice for the cases when it is possible to analytically derive an optimal step size given a search direction, which I imagine would be useful when obtaining an analytical derivative is possible. Analytical (or trivially computed) optimal step sizes are used in the LOBPCG algorithm for eigenvalues when minimizing or maximizing the Rayleigh quotient.

antoine-levitt · 2018-04-08T06:23:27Z

That's a very good point, user defined stepsizes would be useful, and a very elegant way to implement optimal stepsizes (although I have doubts against trying to fit LOBPCG in an optimization framework). The linesearch simplification should help with that.

mohamed82008 · 2018-04-08T07:19:23Z

LOBPCG would require multidimensional space-search to implement properly, so it's probably not worth the fuss, especially that we already have a fast implementation of the algorithm in pure Julia https://github.com/mohamed82008/LOBPCG.jl ;) But it was my inspiration to make this comment in the first place.

antoine-levitt · 2018-04-08T07:31:58Z

Nice! I have some single-vector LOBPCG code lying around that might be of interest: https://gist.github.com/antoine-levitt/67ef8771726b2b95b7a909c188a89ad5. I never did manage to get below 1e-8 precision on the residuals; there are a few papers on the subject (you need a couple more orthogonalizations) but never took the time to do it properly. If you get this done properly please consider putting it in IterativeSolvers.jl!

mohamed82008 · 2018-04-08T08:40:53Z

I tried the low precision, it's a bit of a hit and miss. For big matrices, I often get the B-norm to be negative due to computational error, but using norm(sqrt(Complex(dot(Bx,x)))) fixes it. I needed to do QR-orthogonalization with pivoting and increase the tolerance on the determinant for switching to orthogonalization. But doing all of these slowed down the program by a factor of 2 for lower tolerance so I think I am going to pass on the high precision :) A more elaborate relationship between these factors and the tolerance can be researched to come up with reasonable settings that work for high precision without compromising speed for lower precision.

pkofod · 2018-04-08T09:38:34Z

This is possible in the next version and very easy to do. Examples are always great! So don’t hesitate posting example code you wished worked.

I agree that it makes sense to handle the exact linesearxh in optim and that will also be very simple after the update.

mohamed82008 · 2018-10-31T11:49:20Z

Two more suggestions:

Split UnivariateOptim from Optim for use in LineSearches.
Use IntervalOptimisation for the optimal line search.

pkofod · 2018-10-31T18:14:42Z

That could be an option yes

anriseth added enhancement help wanted new feature labels Oct 17, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimal stepsize #52

Optimal stepsize #52

antoine-levitt commented May 23, 2017

pkofod commented May 23, 2017 •

edited

Loading

antoine-levitt commented May 23, 2017

anriseth commented May 23, 2017 via email

pkofod commented May 23, 2017

anriseth commented May 24, 2017

antoine-levitt commented May 24, 2017

pkofod commented Jun 8, 2017

antoine-levitt commented Jun 8, 2017

mohamed82008 commented Apr 8, 2018

antoine-levitt commented Apr 8, 2018

mohamed82008 commented Apr 8, 2018

antoine-levitt commented Apr 8, 2018

mohamed82008 commented Apr 8, 2018

pkofod commented Apr 8, 2018

mohamed82008 commented Oct 31, 2018

pkofod commented Oct 31, 2018

Optimal stepsize #52

Optimal stepsize #52

Comments

antoine-levitt commented May 23, 2017

pkofod commented May 23, 2017 • edited Loading

antoine-levitt commented May 23, 2017

anriseth commented May 23, 2017 via email

pkofod commented May 23, 2017

anriseth commented May 24, 2017

antoine-levitt commented May 24, 2017

pkofod commented Jun 8, 2017

antoine-levitt commented Jun 8, 2017

mohamed82008 commented Apr 8, 2018

antoine-levitt commented Apr 8, 2018

mohamed82008 commented Apr 8, 2018

antoine-levitt commented Apr 8, 2018

mohamed82008 commented Apr 8, 2018

pkofod commented Apr 8, 2018

mohamed82008 commented Oct 31, 2018

pkofod commented Oct 31, 2018

pkofod commented May 23, 2017 •

edited

Loading