Including derivative in parallel L-BFGS-B method #309
Comments
@AdrianPerezSalinas thanks for this issue. I think this is feasible, by computing the analytic gradients manually or automatically (via TensorFlow or other backends).
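As a minimal illustration of the "automatic" route mentioned above: with a TensorFlow backend, autodiff can produce analytic gradients of any loss built from `tf` ops. The trigonometric loss below is just a placeholder standing in for a circuit expectation value, not actual repository code:

```python
import tensorflow as tf

params = tf.Variable([0.1, 0.2, 0.3])

with tf.GradientTape() as tape:
    # Placeholder loss; in practice this would be the circuit's <H>.
    loss = tf.reduce_sum(tf.sin(params) ** 2)

grad = tape.gradient(loss, params)  # analytic gradient via autodiff
```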
Do you think automatic gradients are compatible with the finite-difference techniques?
Usually, automatic analytic gradients are more efficient than finite differences, which scipy's minimize already computes.
I think I did not explain myself properly. When I say finite differences for quantum circuits, I am talking about a method that computes the exact analytic gradient by shifting the parameter values by a large amount (for most operators this amount is pi/2). This is robust against the inherent statistical noise, so it is useful in experiments. See Eqs. 13 and 14 from here: https://arxiv.org/pdf/1811.11184.pdf
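For concreteness, here is a minimal sketch of the shift rule just described (Eqs. 13 and 14 of arXiv:1811.11184): each partial derivative is obtained exactly from two extra circuit evaluations with one parameter shifted by ±pi/2. The `expectation` callable is a hypothetical stand-in for whatever evaluates ⟨H⟩ on the circuit; it is not part of the repository's API.

```python
import numpy as np

def parameter_shift_gradient(expectation, params, shift=np.pi / 2):
    """Exact analytic gradient of expectation(params) via the shift rule.

    Valid for gates generated by operators with eigenvalues +-1/2
    (e.g. standard rotation gates), per Eqs. 13-14 of arXiv:1811.11184.
    """
    grad = np.empty_like(params)
    for k in range(len(params)):
        shifted = params.copy()
        shifted[k] += shift
        forward = expectation(shifted)    # E(..., theta_k + pi/2, ...)
        shifted[k] -= 2 * shift
        backward = expectation(shifted)   # E(..., theta_k - pi/2, ...)
        grad[k] = 0.5 * (forward - backward)
    return grad
```

Because the two evaluations are separated by a large shift rather than an infinitesimal one, the estimate does not get swamped by shot noise the way a small-step finite difference does.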
Thanks for the clarification, sorry for the misunderstanding.
Nice! We can discuss it tomorrow.
Hi @scarrazza , I leave here a document explaining my point in more detail. Hope you find it useful!
Hi @scarrazza , However, if I do the same with measurements (the approximation is kind of rough), the optimization is really different. It gets stuck quickly, since the derivatives do not give any information. In addition, it looks like it does not matter how many shots you perform to estimate the Hamiltonian; the results are not good. Today I will check the next steps.
OK, thanks for these tests, let's see.
I have tested a fitting problem like qPDF; the results are comparable.
Same test for an easy classifier. I think it is clear that we need to implement this functionality when applying optimization to measurements. In addition, at least for measurement simulations, it is way faster to do it this new way. Do you want me to show you the code?
Hi @AdrianPerezSalinas , just wondering if you are aware of this paper by Simon Benjamin on "Quantum Analytic Descent":
I was not aware of it, but thank you very much for pointing it out! I have taken a look at it and it sounds really interesting. However, after today's discussion I think that the exact-derivatives method, or this one, can reduce the error due to sampling but cannot deal with imperfect circuits. We will have to investigate it further.
I leave here the results of some tests made with the exact derivative and a VQE model. As you may see, with errors of order 0.1% we can still get some minimization. Noisier circuits are chaotic. I think that implementing this kind of exact derivative could be interesting only if we know that the circuit noise is below a certain threshold.
OK, and how does this compare to the numerical derivative for similar configurations?
Nothing returns extremely good results, but exact derivatives are more resilient to noise and errors than numerical ones. Not by much, though.
VQE_0shots_0.01.pdf
Hi everyone, have a Happy New Year.
Most of the time we use variational circuits that depend on some tunable parameters. We have been using scipy methods to find the optimal parameters, and recently the parallel L-BFGS-B method was added to the repository. My proposal is to extend this method to the case where the gradient of the function is provided to the optimizer.
I think this is useful for two main reasons:
I have been looking at the code and saw that the core of the computation is delegated to the standard scipy recipe:
```python
with self.mp.Pool(processes=self.processes) as self.pool:
    from scipy.optimize import minimize
    out = minimize(fun=self.fun, x0=x0, jac=self.jac,
                   method='L-BFGS-B', bounds=self.bounds,
                   callback=self.callback, options=self.options)
```
Thus, it should be easy to implement what I ask for. The first step should be allowing the use of the keyword `fprime`, which is already defined for this function. This is trivial, I think. The second step should be including the gradient function; however, this could be left to the user to pass as an argument. What do you think? A toy sketch of the intended usage follows.
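To make the proposal concrete, here is a hypothetical sketch shown with plain scipy on a toy quadratic (the names `loss` and `gradient` are illustrative, not part of the repository's API): the user-supplied gradient is forwarded to the `jac` keyword, so L-BFGS-B consumes analytic derivatives instead of building finite differences internally.

```python
import numpy as np
from scipy.optimize import minimize

def loss(x):
    return np.sum((x - 1.0) ** 2)

def gradient(x):
    # In the variational-circuit setting this would be the analytic
    # (e.g. parameter-shift) gradient rather than this closed form.
    return 2.0 * (x - 1.0)

x0 = np.zeros(4)
out = minimize(fun=loss, x0=x0, jac=gradient, method='L-BFGS-B')
print(out.x)  # converges to [1. 1. 1. 1.]
```

Passing `jac=None` instead would fall back to scipy's internal finite differences, which is exactly the behavior the thread above argues is fragile under measurement shot noise.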